Application of ASR to a Sociolinguistic Corpus of Australian English

Maya Weiss, Ksenia Gnevsheva, Catherine Travis, Gerard Docherty

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Session 2: Machine Learning

This study applies Automatic Speech Recognition (ASR) to a sociolinguistic corpus of Australian English. We compare a human transcription of excerpts from 20 urban and regional speakers with a transcription generated by Microsoft’s Azure AI Speech. The Word Error Rate is comparable to previous studies, and is not impacted by the sociolinguistic variables of speaker region and gender, nor the phonetic variable of vowel formants. Despite the overall low rate of transcription errors, our findings suggest that the quality of certain vowel categories that are particularly characteristic of Australian English can impact on the accuracy of the ASR-generated transcription.
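For readers unfamiliar with the metric, Word Error Rate is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the ASR output into the human reference, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The scoring procedure and text normalisation actually used in the study are not described in this record, so the lower-casing step, the function name `word_error_rate`, and the example sentences are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Assumption: transcripts are compared after lower-casing and whitespace
    tokenisation; the study's own normalisation is not given here.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one substituted word in a seven-word reference.
human = "she went down to the beach today"
asr = "she went down to the beat today"
print(f"WER = {word_error_rate(human, asr):.3f}")
```

Because the denominator is the reference length, insertions can push the value above 1, which is why WER is reported as a rate rather than a bounded accuracy score.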
Original language: English
Title of host publication: Proceedings of the Nineteenth Australasian International Conference on Speech Science and Technology
Editors: Olga Maxwell, Rikke Bundgaard-Nielsen
Publisher: Australian Speech Science and Technology Association Inc
Pages: 27-31
Publication status: Published - 2024
Event: 19th Australasian International Conference on Speech Science and Technology - University of Melbourne, Melbourne, Australia
Duration: 3 Dec 2024 to 5 Dec 2024
https://assta.org/sst-2024/

Publication series

Name: Proceedings of the Australasian International Conference on Speech Science and Technology
ISSN (Electronic): 2207-1296

Conference

Conference: 19th Australasian International Conference on Speech Science and Technology
Abbreviated title: SST2024
Country/Territory: Australia
City: Melbourne
Period: 3/12/24 to 5/12/24