To compress or not to compress? A finite-state approach to Nen verbal morphology

Saliha Muradoglu, Nicholas Evans, Hanna Suominen

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    4 Citations (Scopus)

    Abstract

    This paper describes the development of a verbal morphological parser for an under-resourced Papuan language, Nen. Nen verbal morphology is particularly complex, with a transitive verb taking up to 1,740 unique features. The structural properties exhibited by Nen verbs raise interesting choices for analysis. Here we compare two possible methods of analysis: ‘Chunking’ and decomposition. ‘Chunking’ refers to the concept of collating morphological segments into one, whereas the decomposition model follows a more classical linguistic approach. Both models are built using the Finite-State Transducer toolkit foma. The resultant architectures differ in size and structural clarity. While the ‘Chunking’ model is under half the size of its fully decomposed counterpart, the decomposition model displays higher structural order. In this paper, we describe the challenges encountered when modelling a language exhibiting distributed exponence and present the first morphological analyser for Nen, with an overall accuracy of 80.3%.

    Original language: English
    Title of host publication: ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Student Research Workshop
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 207-213
    Number of pages: 7
    ISBN (Electronic): 9781952148033
    Publication status: Published - 2020
    Event: 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020 - Student Research Workshop, SRW 2020 - Virtual, Online, United States
    Duration: 5 Jul 2020 - 10 Jul 2020

    Publication series

    Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
    ISSN (Print): 0736-587X

    Conference

    Conference: 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020 - Student Research Workshop, SRW 2020
    Country/Territory: United States
    City: Virtual, Online
    Period: 5/07/20 - 10/07/20
