Towards distributional semantics-based classification of collocations for collocation dictionaries

Leo Wanner, Gabriela Ferraro, Pol Moreno

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)

Abstract

Automatic acquisition of raw source material is of great aid for the compilation of dictionaries, and, in particular, of specialized dictionaries such as collocation dictionaries. The extraction of collocations from corpora has been actively worked on since the late eighties. The quality of the state-of-the-art extraction algorithms allows the lexicographers to obtain lists of collocations they can work with. However, mere lists of collocations are not sufficient. In collocation dictionaries, collocations are grouped semantically, which also presupposes a semantic classification of collocations. In this article, a distributional semantics-based model is proposed that classifies collocations with respect to broad semantic categories as encountered in dictionaries. In experiments with Spanish verb-noun and noun-adjective collocations from the lexicographic field of emotion nouns, it is shown that the use of features extracted from the context of collocations is decisive for retrieval of draft entries for collocation dictionaries.

Original language: English
Pages (from-to): 167-186
Number of pages: 20
Journal: International Journal of Lexicography
Volume: 30
Issue number: 2
Publication status: Published - 2017
Externally published: Yes
