Australian English Bilingual Corpus: Automatic forced-alignment accuracy in Russian and English

Ksenia Gnevsheva*, Simon Gonzalez, Robert Fromont

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    3 Citations (Scopus)

    Abstract

    This paper introduces the Australian English Bilingual Corpus, a Russian–English spoken corpus, and uses it for a comparison of automatic time alignment between two different languages. Automatic forced alignment is gaining popularity in corpus research as it allows for time-efficient processing of phonetic information. The Language, Brain and Behaviour: Corpus Analysis Tool is one aligner which compares well with others in terms of alignment accuracy. Most of the forced-alignment work has been done with different varieties of English. This paper compares alignment accuracy between Russian and English and discusses aligner settings and data characteristics that affect it. The results suggest higher alignment accuracy for English than Russian. For Russian, alignment accuracy improves with stress specification; that is, when stressed and unstressed vowels are treated as separate categories.

    Original language: English
    Pages (from-to): 182-193
    Number of pages: 12
    Journal: Australian Journal of Linguistics
    Volume: 40
    Issue number: 2
    Publication status: Published - 2 Apr 2020
