struct.sequence
Class SequenceDataManager

java.lang.Object
  extended by struct.sequence.SequenceDataManager
All Implemented Interfaces:
DataManager
Direct Known Subclasses:
ChunkerDataManager, InstanceListDataManager, NeDataManager, POSDataManager

public abstract class SequenceDataManager
extends java.lang.Object
implements DataManager

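Abstract base class for data managers that handle sequence data. The chunker, named-entity, and POS data managers listed above extend it.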

Version:
07/15/2006

Field Summary
 edu.umass.cs.mallet.base.types.Alphabet dataAlphabet
           
private static java.util.logging.Logger logger
           
 edu.umass.cs.mallet.base.types.Alphabet tagAlphabet
           
 
Constructor Summary
SequenceDataManager()
           
 
Method Summary
 void closeAlphabets()
          Stops the alphabets from growing.
 void createAlphabets(java.lang.String file)
          Creates alphabets by reading from the file according to this data manager.
 void createAlphabets(java.lang.String file, boolean createUnsupported)
          Creates alphabets.
 void createCRF(java.util.LinkedList[] predicates, java.lang.String[] tags, java.io.BufferedWriter out)
          Writes the predicates and tags in the CoNLL format for easy comparison with a Mallet CRF.
 SLFeatureVector createFeatureVector(java.util.LinkedList[] predicates, java.lang.String[] tags, SLFeatureVector fv)
           
 SLFeatureVector createFeatureVector(java.util.LinkedList predicates, java.lang.String next, SLFeatureVector fv)
           
 SLFeatureVector createFeatureVector(java.util.LinkedList predicates, java.lang.String prev, java.lang.String next, SLFeatureVector fv)
           
 void createForest(SequenceInstance inst, java.util.LinkedList[] predicates, java.io.ObjectOutputStream out)
           
private  void createTagAlphabet(java.lang.String file)
          Creates tag alphabet.
protected  void createU(java.util.LinkedList[] predicates)
           
 edu.umass.cs.mallet.base.types.Alphabet getDataAlphabet()
          Returns data alphabet.
abstract  java.util.LinkedList[] getPredicates(java.lang.String[] toks, java.lang.String[] pos)
           
 edu.umass.cs.mallet.base.types.Alphabet getTagAlphabet()
           
 java.lang.String normalize(java.lang.String s)
           
 SequenceInstance[] readData(java.lang.String file)
          Creates instances by reading from the file according to this data manager.
 SequenceInstance[] readData(java.lang.String file, boolean createFeatureFile)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logger

private static java.util.logging.Logger logger

dataAlphabet

public edu.umass.cs.mallet.base.types.Alphabet dataAlphabet

tagAlphabet

public edu.umass.cs.mallet.base.types.Alphabet tagAlphabet
Constructor Detail

SequenceDataManager

public SequenceDataManager()
Method Detail

getDataAlphabet

public edu.umass.cs.mallet.base.types.Alphabet getDataAlphabet()
Description copied from interface: DataManager
Returns data alphabet.

Specified by:
getDataAlphabet in interface DataManager

getTagAlphabet

public edu.umass.cs.mallet.base.types.Alphabet getTagAlphabet()

readData

public SequenceInstance[] readData(java.lang.String file)
                            throws java.io.IOException
Description copied from interface: DataManager
Creates instances by reading from the file according to this data manager.

Specified by:
readData in interface DataManager
Throws:
java.io.IOException

readData

public SequenceInstance[] readData(java.lang.String file,
                                   boolean createFeatureFile)
                            throws java.io.IOException
Throws:
java.io.IOException

createAlphabets

public void createAlphabets(java.lang.String file)
                     throws java.io.IOException
Description copied from interface: DataManager
Creates alphabets by reading from the file according to this data manager.

Specified by:
createAlphabets in interface DataManager
Throws:
java.io.IOException

createAlphabets

public void createAlphabets(java.lang.String file,
                            boolean createUnsupported)
                     throws java.io.IOException
Creates alphabets.

Throws:
java.io.IOException

createU

protected void createU(java.util.LinkedList[] predicates)

createCRF

public void createCRF(java.util.LinkedList[] predicates,
                      java.lang.String[] tags,
                      java.io.BufferedWriter out)
               throws java.io.IOException
Writes the predicates and tags in the CoNLL format for easy comparison with a Mallet CRF.

Throws:
java.io.IOException
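
The exact column layout that createCRF writes is not documented here. As a rough illustration of what a CoNLL-style dump of predicates and tags can look like, the sketch below writes one token per line (predicates separated by spaces, tag last) and closes the sequence with a blank line. The class ConllSketch, the method writeConll, and the column order are assumptions made for this example, not the library's implementation.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.util.LinkedList;

public class ConllSketch {
    // Hypothetical stand-in for createCRF: one token per line, predicates
    // first, tag in the last column, blank line after the sequence.
    @SuppressWarnings("rawtypes")
    public static void writeConll(LinkedList[] predicates, String[] tags, BufferedWriter out)
            throws IOException {
        if (predicates.length != tags.length) {
            throw new IllegalArgumentException("one tag is required per token");
        }
        for (int i = 0; i < predicates.length; i++) {
            StringBuilder line = new StringBuilder();
            for (Object p : predicates[i]) {
                line.append(p).append(' ');
            }
            line.append(tags[i]);
            out.write(line.toString());
            out.write('\n'); // fixed separator keeps the output platform-independent
        }
        out.write('\n'); // blank line marks the end of the sequence
        out.flush();
    }
}
```

Calling writeConll once per sentence on the same writer produces a file in which sentences are separated by blank lines, which is the usual convention for CoNLL-style sequence data.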

createTagAlphabet

private void createTagAlphabet(java.lang.String file)
                        throws java.io.IOException
Creates tag alphabet.

Throws:
java.io.IOException

createFeatureVector

public SLFeatureVector createFeatureVector(java.util.LinkedList predicates,
                                           java.lang.String prev,
                                           java.lang.String next,
                                           SLFeatureVector fv)

createFeatureVector

public SLFeatureVector createFeatureVector(java.util.LinkedList predicates,
                                           java.lang.String next,
                                           SLFeatureVector fv)

createFeatureVector

public SLFeatureVector createFeatureVector(java.util.LinkedList[] predicates,
                                           java.lang.String[] tags,
                                           SLFeatureVector fv)

getPredicates

public abstract java.util.LinkedList[] getPredicates(java.lang.String[] toks,
                                                     java.lang.String[] pos)
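
This page does not describe what subclasses put into the returned lists. A minimal sketch of one plausible implementation follows, assuming each token gets a list of string-valued predicates built from the word, its POS tag, and the previous word. The class PredicateSketch and the predicate names (W=, POS=, PREV_W=, CAP) are hypothetical and chosen only for illustration.

```java
import java.util.LinkedList;

public class PredicateSketch {
    // Returns one predicate list per token, index-aligned with toks and pos.
    @SuppressWarnings("unchecked")
    public static LinkedList<String>[] getPredicates(String[] toks, String[] pos) {
        if (toks.length != pos.length) {
            throw new IllegalArgumentException("toks and pos must have the same length");
        }
        LinkedList<String>[] predicates = new LinkedList[toks.length];
        for (int i = 0; i < toks.length; i++) {
            LinkedList<String> p = new LinkedList<>();
            p.add("W=" + toks[i]);
            p.add("POS=" + pos[i]);
            // "<s>" is an assumed sentence-start marker for the first token
            p.add("PREV_W=" + (i > 0 ? toks[i - 1] : "<s>"));
            if (!toks[i].isEmpty() && Character.isUpperCase(toks[i].charAt(0))) {
                p.add("CAP");
            }
            predicates[i] = p;
        }
        return predicates;
    }
}
```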

closeAlphabets

public void closeAlphabets()
Description copied from interface: DataManager
Stops the alphabets from growing.

Specified by:
closeAlphabets in interface DataManager
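
To illustrate what stopping alphabet growth means in practice, here is a minimal stand-in for a string-to-index alphabet. It is a sketch and does not use the Mallet Alphabet class itself. The class FrozenAlphabetSketch is hypothetical, and returning -1 for unseen entries after growth is stopped is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FrozenAlphabetSketch {
    private final Map<String, Integer> index = new HashMap<>();
    private final List<String> entries = new ArrayList<>();
    private boolean growthStopped = false;

    // Returns the index of entry, adding it first if growth is still allowed.
    // Once growth is stopped, unseen entries yield -1 and are not added.
    public int lookupIndex(String entry) {
        Integer id = index.get(entry);
        if (id != null) {
            return id;
        }
        if (growthStopped) {
            return -1;
        }
        int newId = entries.size();
        index.put(entry, newId);
        entries.add(entry);
        return newId;
    }

    public void stopGrowth() {
        growthStopped = true;
    }

    public int size() {
        return entries.size();
    }
}
```

Freezing the alphabets after they have been built from the training file keeps feature and tag indices stable, so entries seen only in later data cannot enlarge the index space.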

createForest

public void createForest(SequenceInstance inst,
                         java.util.LinkedList[] predicates,
                         java.io.ObjectOutputStream out)

normalize

public java.lang.String normalize(java.lang.String s)


Copyright (C) 2006 University of Pennsylvania.