Penn BioTagger

These pages briefly describe Penn's BioTagger software suite. The tagger currently supports three types of entities: genes, genomic variations, and malignancy types.

Please read the README file to learn about usage and input/output format.

Tagger

The core of the tagger is derived from the machine learning package MALLET.

References

These taggers are based on the work published in:

  1. Automated recognition of malignancy mentions in biomedical literature.
    Y. Jin, R. T. McDonald, K. Lerman, M. A. Mandel, S. Carroll, M. Y. Liberman, F. C. Pereira, R. S. Winters, and P. S. White
    BMC Bioinformatics 7, 492 (2006)
    http://www.biomedcentral.com/1471-2105/7/492
  2. Identifying Gene and Protein Mentions in Text Using Conditional Random Fields
    R. McDonald and F. Pereira
    BMC Bioinformatics 6, S6 (2005)
    http://www.biomedcentral.com/1471-2105/6/S1/S6
  3. An entity tagger for recognizing acquired genomic variations in cancer literature
    R. T. McDonald, R. S. Winters, M. Mandel, Y. Jin, P. S. White, and F. Pereira
    Bioinformatics 20, 3249-3251 (2004)
    http://bioinformatics.oupjournals.org/cgi/reprint/20/17/3249

Credits

Programming: Kevin Lerman, Yang Jin, Eric Pancoast and Ryan McDonald.

Research supported in part by the National Science Foundation under grant EIA-0205448 (Mining the Bibliome).