TY - JOUR
T1 - Assessment of L2 intelligibility
T2 - Comparing L1 listeners and automatic speech recognition
AU - Inceoglu, Solène
AU - Chen, Wen Hsin
AU - Lim, Hyojung
N1 - Publisher Copyright:
© 2023 Cambridge University Press. All rights reserved.
PY - 2023/1/18
Y1 - 2023/1/18
AB - An increasing number of studies are exploring the benefits of automatic speech recognition (ASR)-based dictation programs for second language (L2) pronunciation learning (e.g. Chen, Inceoglu & Lim, 2020; Liakin, Cardoso & Liakina, 2015; McCrocklin, 2019), but how ASR recognizes accented speech and the nature of the feedback it provides to language learners remain largely under-researched. The current study explores whether the intelligibility of L2 speakers differs when assessed by native (L1) listeners versus ASR technology, and reports on the types of intelligibility issues encountered by the two groups. Twelve L1 listeners of English transcribed 48 isolated words targeting the /ɪ-i/ and /æ-ɛ/ contrasts and 24 short sentences that four Taiwanese intermediate learners of English had produced using Google's ASR dictation system. Overall, the results revealed lower intelligibility scores for the word task (ASR: 40.81%, L1 listeners: 38.62%) than for the sentence task (ASR: 75.52%, L1 listeners: 83.88%), and highlighted strong similarities in the types and proportions of errors identified by ASR and the L1 listeners. However, despite similar recognition scores, correlations indicated that ASR recognition of the L2 speakers' oral productions mirrored the L1 listeners' intelligibility judgments in both the word and sentence tasks for only one speaker, with significant positive correlations for one additional speaker in each task. This suggests that the extent to which ASR approaches L1 listeners in recognizing accented speech may depend on individual speakers and the type of oral speech.
KW - CALL
KW - automatic speech recognition
KW - intelligibility
KW - non-native speech
KW - pronunciation learning
UR - http://www.scopus.com/inward/record.url?scp=85144457349&partnerID=8YFLogxK
U2 - 10.1017/S0958344022000192
DO - 10.1017/S0958344022000192
M3 - Article
SN - 0958-3440
VL - 35
SP - 89
EP - 104
JO - ReCALL
JF - ReCALL
IS - 1
ER -