EvoSem: A database of polysemous cognate sets

Mathieu Dehouck, Alexandre François, Siva Kalyan, Martial Pastor, David Kletz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Polysemies, or “colexifications”, are of great interest in cognitive and historical linguistics, since meanings that are frequently expressed by the same lexeme are likely to be conceptually similar, and lie along a common pathway of semantic change. We argue that these types of inferences can be more reliably drawn from polysemies of cognate sets (which we call “dialexifications”) than from polysemies of lexemes. After giving a precise definition of dialexification, we introduce EvoSem, a cross-linguistic database of etymologies scraped from several online sources. Based on this database (publicly available at http://tiny.cc/EvoSem), we measure for each pair of senses how many cognate sets include them both, i.e. how often this pair of senses is “dialexified”. This allows us to construct a weighted dialexification graph for any set of senses, indicating the conceptual and historical closeness of each pair. We also present an online interface for browsing our database, including graphs and interactive tables. We then discuss potential applications to NLP tasks and to linguistic research.
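
The pair-counting measure described in the abstract can be illustrated with a minimal sketch. It assumes each cognate set has already been reduced to the set of senses attested among its members; the toy data, the set labels, and the function names `dialexification_weights` and `subgraph` are hypothetical and are not taken from the EvoSem database or its interface.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each cognate set is reduced to the senses attested
# among its members. These are illustrative values, not EvoSem entries.
cognate_sets = {
    "set_A": {"moon", "month"},
    "set_B": {"moon", "month", "menstruation"},
    "set_C": {"tree", "wood"},
    "set_D": {"tree", "wood", "fire"},
    "set_E": {"wood", "fire"},
}


def dialexification_weights(cognate_sets):
    """For each unordered pair of senses, count the cognate sets containing both."""
    weights = Counter()
    for senses in cognate_sets.values():
        # Sorting gives every pair a canonical (alphabetical) orientation.
        for a, b in combinations(sorted(senses), 2):
            weights[(a, b)] += 1
    return weights


def subgraph(weights, senses):
    """Keep only the weighted edges whose two endpoints are in the chosen senses."""
    senses = set(senses)
    return {
        pair: w
        for pair, w in weights.items()
        if pair[0] in senses and pair[1] in senses
    }


weights = dialexification_weights(cognate_sets)
plant_graph = subgraph(weights, {"tree", "wood", "fire"})
```

In this sketch an edge weight is a raw count of cognate sets. Whether the published graphs use raw counts or some normalisation is not stated in the abstract, so the weighting here should be read as an assumption.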

Original language: English
Title of host publication: LChange 2023 - 4th International Workshop on Computational Approaches to Historical Language Change 2023, Proceedings
Editors: Nina Tahmasebi, Syrielle Montariol, Haim Dubossarsky, Andrey Kutuzov, Simon Hengchen, David Alfter, Francesco Periti, Pierluigi Cassotti
Publisher: Association for Computational Linguistics (ACL)
Pages: 66-75
Number of pages: 10
ISBN (Electronic): 9798891760431
ISBN (Print): 9798891760431
Publication status: Published - 2023
Externally published: Yes
Event: 4th International Workshop on Computational Approaches to Historical Language Change, LChange 2023 - Singapore, Singapore
Duration: 6 Dec 2023 → …

Publication series

Name: LChange 2023 - 4th International Workshop on Computational Approaches to Historical Language Change 2023, Proceedings

Conference

Conference: 4th International Workshop on Computational Approaches to Historical Language Change, LChange 2023
Country/Territory: Singapore
City: Singapore
Period: 6/12/23 → …
