TY - GEN
T1 - Community-Led Documentation of Nafsan (Erakor, Vanuatu)
AU - Krajinović, Ana
AU - Billington, Rosey
AU - Emil, Lionel
AU - Kaltap̃au, Gray
AU - Thieberger, Nick
N1 - Publisher Copyright: © 2022, Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - We focus on a collaboration between community members and visiting linguists in Erakor, Vanuatu, aiming to build the capacity of community-based researchers to undertake and sustain documentation of Nafsan, the local indigenous language. We concentrate on the technical and procedural skills required to collect, manage, and work with audio and video data, and give an overview of the outcomes of community-led documentation after initial training. We discuss the benefits and challenges of this type of project from the perspective of the community researchers and the external linguists. We show that community-led documentation such as this project in Erakor, in which data management and archiving are incorporated into the documentation process, has crucial benefits for both the community and the linguists. The two most salient benefits are: a) long-term documentation of linguistic and cultural practices calibrated towards the community’s needs, and b) collection by community members of larger quantities of data, often of better quality and scope than those collected by visiting linguists, which, besides being readily available for research, have great potential for training and testing emerging language technologies for less-resourced languages, such as Automatic Speech Recognition (ASR).
KW - Automatic speech recognition
KW - Community-led language documentation
KW - Less-resourced languages
KW - Nafsan
KW - Technical training
KW - Technology for indigenous languages
KW - Vanuatu
UR - http://www.scopus.com/inward/record.url?scp=85132881743&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-05328-3_8
DO - 10.1007/978-3-031-05328-3_8
M3 - Conference contribution
SN - 9783031053276
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 112
EP - 128
BT - Human Language Technology. Challenges for Computer Science and Linguistics - 9th Language and Technology Conference, LTC 2019, Revised Selected Papers
A2 - Vetulani, Zygmunt
A2 - Paroubek, Patrick
A2 - Kubis, Marek
PB - Springer Science and Business Media Deutschland GmbH
T2 - 9th Language and Technology Conference, LTC 2019
Y2 - 17 May 2019 through 19 May 2019
ER -