A Phylogeny of Yam Languages

Mae J. Carroll*, Sam Passmore, Christian Döhler, Nicholas Evans

*Corresponding author for this work

Research output: Contribution to conference › Abstract › peer-review

Abstract

The Yam languages of the southern New Guinea region are one of the world’s primary language families, i.e. not related to any other family in the world (Evans et al., 2017). They are the fourth-largest family in New Guinea by number of languages, after the Trans-New Guinea, Torricelli, and Sepik-Ramu families, but, owing to recent and intense documentation efforts, are the second-most documented. Building on this fieldwork effort, we propose a fine-grained, quantitatively grounded phylogeny of these languages that represents the history of the family.

The major documentation outputs are three descriptive grammars (Carroll, 2016; Döhler, 2018; Siegel, 2023), a sketch grammar and language documentation project on the previously little-known Yei branch (Carroll, in press), and PhD theses on aspects of individual Yam languages, including one on the grammar of Ranmo (Lee, 2016) and another on variation in Nmbo (Kashima, 2020). These are in addition to papers on languages of the family, which are too numerous to mention here.

This extended and rigorous documentation effort has created the space for developing new knowledge about the history of the family, about which we currently know very little. Preliminary historical reconstructions were published by Evans et al. (2017), with updated results currently in press (Evans et al., in press). This paper builds on these works and on previous conference presentations (Evans et al., 2017b) to produce a high-definition phylogeny of the language family.

The phylogeny is built using lexical data drawn from an expanded version of the Yamfinder lexical database (Carroll et al., online; yamfinder.com). The data comprise a list of 388 core vocabulary items, chosen for their relevance to the region, for each of the 26 Yam languages for which we have data. This list has been annotated for lexical cognates based on what is known from the historical reconstructions (Evans et al., 2017, 2017b, in press). These cognate data then serve as evidence of shared history. Figure 1 gives a preliminary phylogeny based on these cognates.

In this tree, the cognates were used to calculate symmetric generalized Robinson-Foulds distances between languages using LingPy (List & Forkel, 2021). The tree was then generated from these distances using a basic UPGMA (unweighted pair group method with arithmetic mean) clustering algorithm.
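
The distance-then-cluster step can be illustrated with a minimal Python sketch. It does not reproduce the LingPy workflow or the distance measure named above: as a stand-in, the distance here is simply the share of concepts whose cognate classes differ between two languages. The language names, concepts, and cognate class numbers are hypothetical toy values, not Yamfinder data.

```python
from itertools import combinations

# Hypothetical toy data: language -> {concept: cognate class id}.
cognates = {
    "LangA": {"water": 1, "fire": 1, "stone": 1, "bird": 1},
    "LangB": {"water": 1, "fire": 1, "stone": 1, "bird": 2},
    "LangC": {"water": 2, "fire": 1, "stone": 2, "bird": 3},
    "LangD": {"water": 2, "fire": 2, "stone": 2, "bird": 3},
}


def cognate_distance(a, b):
    """Share of jointly attested concepts whose cognate classes differ.

    Assumption: a simple stand-in distance, not the measure used in the abstract.
    """
    shared = [concept for concept in a if concept in b]
    differing = sum(1 for concept in shared if a[concept] != b[concept])
    return differing / len(shared)


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples."""
    # Each cluster id maps to (subtree, number of leaves in that subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    d = {
        frozenset((i, j)): dist[names[i]][names[j]]
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(d, key=d.get)  # closest pair of clusters
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        for k in clusters:
            d_ik = d.pop(frozenset((i, k)))
            d_jk = d.pop(frozenset((j, k)))
            # Size-weighted mean equals the mean over all leaf pairs.
            d[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del d[pair]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


names = list(cognates)
dist = {
    a: {b: cognate_distance(cognates[a], cognates[b]) for b in names}
    for a in names
}
tree = upgma(names, dist)
print(tree)
```

UPGMA assumes a roughly constant rate of change across all branches, which is one reason a distance tree of this kind is treated as preliminary.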

We extend the automated approach presented in Figure 1 by building a Bayesian phylogeny from the cognate data, which provides a high-resolution branching-tree model of the history of the language family (Bouckaert et al., 2019). We then use this model to calculate concordance factors for each cognate set (Minh et al., 2020). Concordance factors measure the extent to which a particular cognate set agrees with the branching model, and they identify cognate sets that present alternative branching histories within this complex linguistic region.
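
The idea of scoring how well a cognate set agrees with a tree can be sketched in Python. The score below is a much simpler clade-compatibility measure, not the concordance factor defined in Minh et al. (2020): it treats a cognate set as agreeing with a clade when the languages attesting it are nested inside the clade, contain the clade, or are disjoint from it. The tree, the language names, and the function names are hypothetical.

```python
def clades(tree):
    """Return (all leaves under tree, list of leaf sets for each internal node)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found += child_clades
    found.append(leaves)
    return leaves, found


def agrees(clade, attested):
    """True if the attesting languages are nested in, contain, or avoid the clade."""
    return attested <= clade or clade <= attested or not (attested & clade)


def concordance(tree, attested):
    """Fraction of non-trivial clades that the cognate set agrees with.

    Assumption: a simplified illustrative score, not the published measure.
    """
    all_langs, found = clades(tree)
    # Drop the root clade, which every cognate set trivially agrees with.
    informative = [c for c in found if len(c) < len(all_langs)]
    attested = frozenset(attested)
    return sum(agrees(c, attested) for c in informative) / len(informative)


# Hypothetical rooted tree as nested tuples of language names.
tree = ((("LangA", "LangB"), "LangC"), ("LangD", "LangE"))

print(concordance(tree, {"LangA", "LangB"}))  # follows the tree
print(concordance(tree, {"LangC", "LangD"}))  # cuts across two clades
```

A cognate set with a low score under a measure of this kind would be a candidate for an alternative history, such as borrowing between neighbouring languages.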

The results are a significant step forward in modelling the history of Papuan languages. This is arguably the most detailed and accurate phylogeny of any Papuan family to date, with the potential exception of Trans-New Guinea (Greenhill, in press). The phylogeny also represents the next step towards the long-term goal of unlocking potential deeper time-depth relationships in the region, beyond what is possible with current comparative methods.

Original language: English
Pages: 165
Number of pages: 166
Publication status: Published - 26 Nov 2024
Event: Australian Linguistic Society Annual Conference - University of Melbourne, Australia
Duration: 1 Jan 2013 → …

Conference

Conference: Australian Linguistic Society Annual Conference
Country/Territory: Australia
Period: 1/01/13 → …
Other: 1-4 October 2013
