TY - JOUR
T1 - Automated dating of the world's language families based on lexical similarity
AU - Holman, Eric W.
AU - Brown, Cecil H.
AU - Wichmann, Søren
AU - Müller, André
AU - Velupillai, Viveka
AU - Hammarström, Harald
AU - Sauppe, Sebastian
AU - Jung, Hagen
AU - Bakker, Dik
AU - Brown, Pamela
AU - Belyaev, Oleg
AU - Urban, Matthias
AU - Mailhammer, Robert
AU - List, Johann Mattis
AU - Egorov, Dmitry
PY - 2011
Y1 - 2011/12
N2 - This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from glottochronology in four major respects: (1) it is automated and thus is more objective, (2) it applies a uniform analytical approach to a single database of worldwide languages, (3) it is based on lexical similarity as determined from Levenshtein (edit) distances rather than on cognate percentages, and (4) it provides a formula for date calculation that mathematically recognizes the lexical heterogeneity of individual languages, including parent languages just before their breakup into daughter languages. Automated judgments of lexical similarity for groups of related languages are calibrated with historical, epigraphic, and archaeological divergence dates for 52 language groups. The discrepancies between estimated and calibration dates are found to be on average 29% as large as the estimated dates themselves, a figure that does not differ significantly among language families. As a resource for further research that may require dates of known level of accuracy, we offer a list of ASJP time depths for nearly all the world's recognized language families and for many subfamilies.
UR - http://www.scopus.com/inward/record.url?scp=82955168011&partnerID=8YFLogxK
U2 - 10.1086/662127
DO - 10.1086/662127
M3 - Article
SN - 0011-3204
VL - 52
SP - 841
EP - 875
JO - Current Anthropology
JF - Current Anthropology
IS - 6
ER -