Paraphrasing and Entailment

PPDB 2.0: Most recent release of the Paraphrase Database

Download Paper Citation

SimplePPDB: Subset of PPDB customized for performing text simplification

Download Paper Citation

Add-One RTE data: 5,560 RTE sentence pairs involving the insertion of a single adjective

Download Paper Citation

Human Paraphrase Judgements: Phrase pairs scored on a 5-point scale

Download Paper Citation

Human Lexical Entailment Judgements: Phrase pairs classified based on natural logic relations

Download Paper Citation

FrameNet+: Expanded FrameNet LU index, built via automatic paraphrasing and crowdsourcing

Download Paper Citation

Stylistics

Style Lexicons: Human and automatic scores of formality and complexity for words, phrases, and sentences

Download Paper Citation

Formality Annotations: Sentence-level formality annotations for four different genres

Download Paper Citation

Crowdsourcing

Bilingual dictionaries in 100 languages: High-confidence translations collected via crowdsourcing

Download Paper Citation

Code for extracting dictionaries: Code and data for building dictionaries. Download this file if you want to change the default quality thresholds, or if you are interested in the demographic information about our MTurk translators.

Download Paper Citation

Lexical entailment classification HIT Template: 5-way classification of phrase pairs into natural logic relations

Download Paper Citation

Likert scale paraphrase HIT Template: Likert-style paraphrase judgement (on a 5-point scale)

Download Paper Citation

Binary paraphrase judgement: Binary judgements of paraphrase goodness (in a given context)

Download Paper Citation

Creative Commons License

All of the above code and data are licensed under a Creative Commons Attribution 3.0 United States License.