struct.sequence
Class InstanceListDataManager

java.lang.Object
  extended by struct.sequence.SequenceDataManager
      extended by struct.sequence.InstanceListDataManager
All Implemented Interfaces:
DataManager

public class InstanceListDataManager
extends SequenceDataManager

A DataManager for Mallet Instances: it reads a Mallet InstanceList and converts its instances into SequenceInstances.

Version:
08/15/2006

Field Summary
private  edu.umass.cs.mallet.base.types.InstanceList iList
           
 
Fields inherited from class struct.sequence.SequenceDataManager
dataAlphabet, tagAlphabet
 
Constructor Summary
InstanceListDataManager(edu.umass.cs.mallet.base.types.InstanceList iList)
           
 
Method Summary
 void createAlphabets(boolean createUnsupported)
          Creates alphabets.
 void createForest(SequenceInstance inst, java.util.LinkedList[] predicates, java.io.ObjectOutputStream out)
           
private  java.util.LinkedList[] getPredicates(edu.umass.cs.mallet.base.types.FeatureVectorSequence fvs)
          Creates the predicates from a Mallet FeatureVectorSequence.
 java.util.LinkedList[] getPredicates(java.lang.String[] toks, java.lang.String[] pos)
           
 void grow(edu.umass.cs.mallet.base.types.InstanceList iList, boolean createUnsupported)
          Grows the data alphabet with new features.
 SequenceInstance[] readData(edu.umass.cs.mallet.base.types.InstanceList iList)
          Reads instances from a Mallet InstanceList and converts them into SequenceInstances.
 void setInstanceAlphabets()
          Sets the SequenceInstance's alphabets.
 
Methods inherited from class struct.sequence.SequenceDataManager
closeAlphabets, createAlphabets, createAlphabets, createCRF, createFeatureVector, createFeatureVector, createFeatureVector, createU, getDataAlphabet, getTagAlphabet, normalize, readData, readData
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

iList

private edu.umass.cs.mallet.base.types.InstanceList iList

Constructor Detail

InstanceListDataManager

public InstanceListDataManager(edu.umass.cs.mallet.base.types.InstanceList iList)
Parameters:
iList - The Mallet InstanceList

Method Detail

getPredicates

public java.util.LinkedList[] getPredicates(java.lang.String[] toks,
                                            java.lang.String[] pos)
Specified by:
getPredicates in class SequenceDataManager

getPredicates

private java.util.LinkedList[] getPredicates(edu.umass.cs.mallet.base.types.FeatureVectorSequence fvs)
Creates the predicates from a Mallet FeatureVectorSequence.


createAlphabets

public void createAlphabets(boolean createUnsupported)
                     throws java.io.IOException
Creates alphabets.

Parameters:
createUnsupported - whether to add unsupported features, that is, whether to create a feature for a label even when that feature has only ever been seen with another label. Adding these features to the alphabet allows a weight to be learned for them. Adding unsupported features has been shown to help with CRF training.
Throws:
java.io.IOException
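
The effect of createUnsupported can be illustrated with a small, self-contained sketch. It is not the code of this class: the class UnsupportedFeatureSketch, the Token type, the buildFeatures method, and the "predicate&label" naming scheme are all hypothetical and assume that a feature is simply a predicate paired with a label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of struct.sequence or Mallet.
class UnsupportedFeatureSketch {

    /** One training token: the predicates that fire on it and its gold label. */
    static final class Token {
        final List<String> predicates;
        final String label;

        Token(List<String> predicates, String label) {
            this.predicates = predicates;
            this.label = label;
        }
    }

    /**
     * Builds the set of feature names for the given data.
     * Assumption: a feature is named "predicate&label".
     */
    static Set<String> buildFeatures(List<Token> data, boolean createUnsupported) {
        // Collect every label that occurs in the data.
        Set<String> labels = new LinkedHashSet<>();
        for (Token t : data) {
            labels.add(t.label);
        }

        Set<String> features = new LinkedHashSet<>();
        for (Token t : data) {
            for (String p : t.predicates) {
                if (createUnsupported) {
                    // Pair the predicate with every label, including labels
                    // it was never observed with.
                    for (String l : labels) {
                        features.add(p + "&" + l);
                    }
                } else {
                    // Supported features only: the observed pairing.
                    features.add(p + "&" + t.label);
                }
            }
        }
        return features;
    }

    public static void main(String[] args) {
        List<Token> data = new ArrayList<>();
        data.add(new Token(Arrays.asList("word=Paris", "cap"), "LOC"));
        data.add(new Token(Arrays.asList("word=runs"), "O"));

        System.out.println(buildFeatures(data, false)); // 3 supported features
        System.out.println(buildFeatures(data, true));  // 6 features in total
    }
}
```

In this sketch the pairing "word=Paris&O" exists only when createUnsupported is true, so a model could learn a (typically negative) weight for it.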

grow

public void grow(edu.umass.cs.mallet.base.types.InstanceList iList,
                 boolean createUnsupported)
Grows the data alphabet with new features.
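
As a rough picture of what growing an alphabet involves, the sketch below assumes an alphabet is a mapping from feature names to consecutive integer indices, where new features receive fresh indices and existing indices never change. GrowableAlphabetSketch and its methods are hypothetical and do not reproduce the struct.sequence or Mallet implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of struct.sequence or Mallet.
class GrowableAlphabetSketch {
    private final Map<String, Integer> index = new HashMap<>();
    private final List<String> entries = new ArrayList<>();

    /** Returns the index of a feature, optionally adding it; -1 if absent and not added. */
    int lookupIndex(String feature, boolean addIfAbsent) {
        Integer i = index.get(feature);
        if (i != null) {
            return i;
        }
        if (!addIfAbsent) {
            return -1;
        }
        int next = entries.size(); // new features get the next free index
        index.put(feature, next);
        entries.add(feature);
        return next;
    }

    /** Adds all unseen features and returns how many were new. */
    int grow(List<String> features) {
        int before = entries.size();
        for (String f : features) {
            lookupIndex(f, true);
        }
        return entries.size() - before;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        GrowableAlphabetSketch alphabet = new GrowableAlphabetSketch();
        alphabet.grow(Arrays.asList("w=a", "w=b"));
        int added = alphabet.grow(Arrays.asList("w=b", "w=c")); // only "w=c" is new
        System.out.println(added + " new feature(s), size " + alphabet.size());
    }
}
```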


setInstanceAlphabets

public void setInstanceAlphabets()
Sets the SequenceInstance's alphabets.


readData

public SequenceInstance[] readData(edu.umass.cs.mallet.base.types.InstanceList iList)
                            throws java.io.IOException
Reads instances from a Mallet InstanceList and converts them into SequenceInstances.

Throws:
java.io.IOException

createForest

public void createForest(SequenceInstance inst,
                         java.util.LinkedList[] predicates,
                         java.io.ObjectOutputStream out)
Overrides:
createForest in class SequenceDataManager


Copyright (C) 2006 University of Pennsylvania.