DBLP-2006 "Fernando Pereira" Dataset

Maintained by Ted Sandler.

This page lists 98 articles from the DBLP database written by authors named "Fernando Pereira". Author names such as "Fernando Pereira" are in general ambiguous, since many people publish under the same name. Current bibliographic databases such as DBLP, Citeseer, and Google Scholar provide no facility to search by *person*; one can only search by name. This is inconvenient, since one usually wants to find articles by a particular individual, not articles by individuals with a particular name.

We are currently investigating machine learning methods to facilitate such queries. Specifically, we are trying to learn similarity functions on database records from training data that specifies which records are similar and which are dissimilar. The "Fernando Pereira" dataset is one such training set. Each colored group of articles (the rows sharing an Author ID in the table below) represents a set of articles we believe to have been written by the same author. Thus articles in the same group are considered more similar to each other than to articles in other groups.

However, in creating this training set, we may have made some mistakes. That is, we may have mistakenly grouped papers written by different authors or failed to group papers by the same author. If you find any mistakes in this data, please let us know by email.
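As a rough illustration of how grouped records can specify which records are similar and which are dissimilar, the sketch below turns a handful of rows from the table into labeled pairs. It is not the procedure actually used to train on this dataset: the names `RECORDS` and `labeled_pairs` are hypothetical, and treating every same-group pair as "similar" and every cross-group pair as "dissimilar" is an assumption made for the example.

```python
from itertools import combinations

# A few (doc_id, author_id, title) rows copied from the table on this page.
RECORDS = [
    (1, 1, "Relating Probabilistic Grammars and Automata."),
    (19, 1, "Conditional Random Fields: Probabilistic Models for "
            "Segmenting and Labeling Sequence Data."),
    (62, 2, "Adaptive shape-texture intra coding refreshment for "
            "error resilient object-based video."),
    (84, 2, "MPEG-7: A Standard for Multimedia Content Description."),
    (91, 3, "Register Allocation Via Coloring of Chordal Graphs."),
]


def labeled_pairs(records):
    """Yield (doc_id_a, doc_id_b, same_author) for every unordered pair.

    Assumption: two records count as "similar" exactly when they carry
    the same author ID, and as "dissimilar" otherwise.
    """
    for (doc_a, author_a, _), (doc_b, author_b, _) in combinations(records, 2):
        yield doc_a, doc_b, author_a == author_b


pairs = list(labeled_pairs(RECORDS))
similar = [(a, b) for a, b, same in pairs if same]
dissimilar = [(a, b) for a, b, same in pairs if not same]

print("similar pairs:   ", similar)
print("dissimilar pairs:", len(dissimilar))
```

With n records this produces n(n-1)/2 pairs, so the full 98-record table would yield 4753 labeled pairs. A grouping mistake of the kind described above would flip the label of every pair involving the misplaced record.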


Doc. ID  Author ID  Document Title
1        1          Relating Probabilistic Grammars and Automata.
2        1          Similarity-Based Methods for Word Sense Disambiguation.
3        1          Similarity-Based Estimation of Word Cooccurrence Probabilities.
4        1          Dynamic Compilation of Weighted Context-Free Grammars.
5        1          A Structure-Sharing Representation for Unification-Based Grammar Formalisms.
6        1          A Calculus for Semantic Composition and Scoping.
7        1          Inside-Outside Reestimation from Partially Bracketed Corpora.
8        1          Distributional Clustering of English Words.
9        1          Finite-State Approximation of Phrase Structure Grammars.
10       1          An Integrated Framework for Semantic and Pragmatic Interpretation.
11       1          A Semantic-Head-Driven Generation Algorithm for Unification-Based Formalisms.
12       1          The Semantics of Grammar Formalisms Seen as Computer Languages.
13       1          Frequencies vs Biases: Machine Learning Problems in Natural Language Processing (Extended Abstract).
14       1          An Efficient Extension to Mixture Techniques for Prediction and Decision Trees.
15       1          Machine Learning for Efficient Natural-Language Processing.
16       1          Design of a linguistic postprocessor using variable memory length Markov models.
17       1          Grammars and Logics of Partial Information.
18       1          Declarative Programming for a Messy World.
19       1          Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data.
20       1          Maximum Entropy Markov Models for Information Extraction and Segmentation.
21       1          Frequencies vs. Biases: Machine Learning Problems in Natural Language Processing - Abstract.
22       1          Transportability and Generality in a Natural-Language Interface System.
23       1          Semantic Interpretation as Higher-Order Deduction.
24       1          A Sheaf-Theoretic Model of Concurrency
25       1          Shallow Parsing with Conditional Random Fields.
26       1          Document Expansion for Speech Retrieval.
27       1          SCAN: Designing and Evaluating User Interfaces to Support Retrieval From Speech Archives.
28       1          Can Drawing Be Liberated From the Von Neumann Style?
29       1          Prolog and Natural-Language Analysis: Into the Third Decade.
30       1          AT&T at TREC-8.
31       1          AT&T at TREC-7.
32       1          AT&T at TREC-6: SDR Track.
33       1          ATDD: An Algorithmic Tool for Domain Discovery in Protein Sequences.
34       1          A Rational Design for a Weighted Finite-State Transducer Library.
35       1          TEAM: An Experiment in the Design of Transportable Natural-Language Interfaces.
36       1          Introduction.
37       1          Incremental Interpretation.
38       1          Definite Clause Grammars for Language Analysis - A Survey of the Formalism and a Comparison with Augmented Transition Networks.
39       1          An entity tagger for recognizing acquired genomic variations in cancer literature.
40       1          Extraposition Grammars.
41       1          Categorial Semantics and Scoping.
42       1          Semantic-Head-Driven Generation.
43       1          An Efficient Easily Adaptable System for Interpreting Natural Language Queries.
44       1          Ellipsis and Higher-Order Unification
45       1          Linear Logic for Meaning Assembly
46       1          Quantifiers, Anaphora, and Intensionality
47       1          Speech Recognition by Composition of Weighted Finite Automata
48       1          Finite-State Approximation of Phrase-Structure Grammars
49       1          Beyond Word N-Grams
50       1          Aggregate and mixed-order Markov models for statistical language processing
51       1          Similarity-Based Methods For Word Sense Disambiguation
52       1          Similarity-Based Models of Word Cooccurrence Probabilities
53       1          The information bottleneck method
54       1          Language, Computation and Artificial Intelligence.
55       1          TEAM: An Experimental Transportable Natural-Language Interface.
56       1          Logic Programming.
57       1          Principles and Implementation of Deductive Parsing.
58       1          Quantifiers, Anaphora, and Intensionality.
59       1          Similarity-Based Models of Word Cooccurrence Probabilities.
60       1          An Efficient Extension to Mixture Techniques for Prediction and Decision Trees.
61       1          The Design Principles of a Weighted Finite-State Transducer Library.

62       2          Adaptive shape-texture intra coding refreshment for error resilient object-based video.
63       2          Cell loss resilience for B-ISDN video communications.
64       2          Image Description and Retrieval Using MPEG-7 Shape Descriptors.
65       2          Automatic Text Extraction in Digital Video Based on Motion Analysis.
66       2          Drift reduction for a H.264/AVC fine grain scalability with motion compensation architecture.
67       2          Objective Evaluation of Relative Segmentation Quality.
68       2          Proposal for an Integrated Video Analysis Framework.
69       2          Shape refreshment need metric for object-based resilient video coding.
70       2          Multi-grid chain coding of binary shapes.
71       2          Hierarchical Visual Description Schemes for Still Images and Video Sequences.
72       2          Influence of Encoder Parameters on the Decoded Video Quality for MPEG-4 over W-CDMA Mobile Networks.
73       2          Motion-based shape error concealment for object-based video.
74       2          An Alternative to the MPEG-4 Object-based Error Resilient Video Syntax.
75       2          An alternative complexity model for the MPEG-4 video verifier mechanism.
76       2          Multimedia Standards: Present and Future.
77       2          The MPEG-21 Standard: Why an Open Multimedia Framework?
78       2          Media Representation Standards for the New Millennium (Tutorial).
79       2          Methodologies for objective evaluation of video segmentation quality.
80       2          Scene level rate control algorithm for MPEG-4 video coding.
81       2          Content Adaptation: The Panacea for Usage Diversity?
82       2          MPEG-21: Goals and Achievements.
83       2          MPEG-7: The Generic Multimedia Content Description Standard, Part 1.
84       2          MPEG-7: A Standard for Multimedia Content Description.
85       2          Classification of video segmentation application scenarios.
86       2          Evaluating MPEG-4 video decoding complexity for an alternative video complexity verifier model.
87       2          Objective evaluation of video segmentation quality.
88       2          Refreshment need metrics for improved shape and texture object-based resilient video coding.
89       2          Spatial shape error concealment for object-based image and video coding.
90       2          Adaptive shape and texture intra refreshment schemes for improved error resilience in object-based video coding.

91       3          Register Allocation Via Coloring of Chordal Graphs.
92       3          Topic Introduction.
93       3          A Coordination Model for ad hoc Mobile Systems.
94       3          The Language LinF for Fractal Specification.
95       3          Home Page
96       3          Tactics for Remote Method Invocation.

97       4          A Framework for e-Cooperating Business Agents: An Application to the (Re)engineering of Production Facilities.

98       5          Parallel DSP Architecture for Reconstruction of Tomographic Images Using Wavelets Techniques.