Human Judgements of Concept Similarity

ConceptSim: Sense-Annotated Judgements of Similarity


Data

Download: ConceptSim.tar.gz, ConceptSim.zip

ConceptSim contains sense-annotated versions of three standard similarity datasets:

MC (Miller and Charles, 1991)
RG (Rubenstein and Goodenough, 1965)
WordSim-Sim (Finkelstein et al., 2001; Agirre et al., 2009)

Each pair of words was annotated by two humans with WordNet 3.0 senses. Inter-annotator agreement ranged from 86% to 93%. The similarity scores themselves are kept from the original datasets, a choice motivated by past research showing that the greatest correlations with human judgments come from the maximum similarity over all pairs of senses. The final version of each sense-annotated dataset is the result of the annotators coming to an agreement on the senses they initially disagreed on.

Related Publications:

Hansen A. Schwartz, Fernando Gomez. 2011. Evaluating Semantic Metrics on Tasks of Concept Similarity. In FLAIRS-24. Palm Beach, Florida.

Related Software:

Normalized Depth Similarity Demo: concept similarity based on normalized WordNet (described in the CoNLL-08 paper).

Original Data References

Rubenstein, H., and Goodenough, J. 1965. Contextual correlates of synonymy. Communications of the ACM 8:627-633.

Miller, G., and Charles, W. 1991. Contextual correlates of semantic similarity. Language and Cognitive Processes 6(1):1-28.

Finkelstein, L.; Gabrilovich, E.; Matias, Y.; Rivlin, E.; Solan, Z.; Wolfman, G.; and Ruppin, E. 2001. Placing search in context: The concept revisited. In ACM Trans. on Information Systems.

Agirre, E.; Alfonseca, E.; Hall, K.; Kravalova, J.; Pasca, M.; and Soroa, A. 2009. A study on similarity and relatedness using distributional and wordnet-based approaches. In The Annual Conference of the NAACL, 19-27.
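As a rough illustration of the "maximum similarity over all pairs of senses" convention mentioned under Data, the Python sketch below compares a word-level score (the maximum over every sense pair) with a concept-level score (the single annotated sense pair). The function names, the sense labels, and the toy similarity values are all hypothetical. They are not taken from the ConceptSim files, and the sketch does not reflect the file format or the metrics evaluated in the related publication.

```python
from itertools import product


def max_sense_similarity(senses_a, senses_b, sim):
    """Return the highest sim(s1, s2) over all sense pairs.

    Returns None when either word has no senses.
    """
    scores = [sim(s1, s2) for s1, s2 in product(senses_a, senses_b)]
    return max(scores) if scores else None


# Hypothetical toy similarity values between illustrative sense labels.
TOY_SIM = {
    frozenset(["bank.n.01", "shore.n.01"]): 0.8,
    frozenset(["bank.n.02", "shore.n.01"]): 0.1,
}


def toy_sim(s1, s2):
    """Symmetric lookup in TOY_SIM; identical senses score 1.0, unknown pairs 0.0."""
    if s1 == s2:
        return 1.0
    return TOY_SIM.get(frozenset([s1, s2]), 0.0)


# Word-level score: consider every sense of each word and keep the maximum.
word_level = max_sense_similarity(
    ["bank.n.01", "bank.n.02"], ["shore.n.01"], toy_sim
)

# Concept-level score: use only the sense pair chosen by the annotators
# (assumed here to be bank.n.01 / shore.n.01).
concept_level = toy_sim("bank.n.01", "shore.n.01")
```

In this toy case the annotated sense pair is also the highest-scoring pair, so the two scores coincide. A real similarity metric over WordNet 3.0 senses would take the place of toy_sim.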
