Abstract
This pilot study explores the effectiveness of a likelihood ratio (LR)-based forensic voice comparison (FVC) system built on non-native speech production. More specifically, it examines English vowels produced by male native speakers of Hong Kong Cantonese, and the extent to which FVC can work for these speakers. Fifteen speakers participated, each in two non-contemporaneous recording sessions, producing six predetermined target words: “hello”, “bye”, “left”, “right”, “yes”, and “no”. Formant frequency values were measured from the trajectories of the vowels and surrounding segments. These trajectories were modelled using discrete cosine transforms for each formant (F1, F2 and F3), and the coefficient values were used as feature vectors in the LR calculations. LRs were calculated using the multivariate-kernel-density method. The results are reported using two performance metrics, namely the log-likelihood-ratio cost and 95% credible intervals. The six best-performing word-specific outputs are presented and compared. We find that an FVC system can be built using L2 speech production, and that the results are comparable to those of similar systems built on native speech.
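For readers unfamiliar with the log-likelihood-ratio cost (Cllr) named above, the following is a minimal sketch of its standard definition, not the authors' implementation. The function name `cllr` and the LR values in the usage lines are hypothetical and are not taken from the study. Lower values indicate better performance, and a system that always outputs LR = 1 (no information) scores exactly 1.

```python
import math


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost for two lists of likelihood ratios.

    same_speaker_lrs: LRs from comparisons known to be same-speaker.
    different_speaker_lrs: LRs from comparisons known to be different-speaker.
    All LRs are assumed to be strictly positive (not log-LRs).
    """
    if not same_speaker_lrs or not different_speaker_lrs:
        raise ValueError("both lists of likelihood ratios must be non-empty")

    # Same-speaker comparisons are penalised when the LR is small.
    ss_cost = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs)
    ss_cost /= len(same_speaker_lrs)

    # Different-speaker comparisons are penalised when the LR is large.
    ds_cost = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs)
    ds_cost /= len(different_speaker_lrs)

    return 0.5 * (ss_cost + ds_cost)


# Illustrative values only (hypothetical, not results from the study).
uninformative = cllr([1.0], [1.0])
well_separated = cllr([1000.0, 500.0], [0.001, 0.002])
```

Because the cost grows with the magnitude of a misleading LR, a system that is confidently wrong can score well above 1, which is why Cllr is usually reported alongside a measure of reliability such as credible intervals.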
Original language | English |
---|---|
Pages | 39-47 |
Number of pages | 9 |
Publication status | Published - 2015 |
Event | 2015 Australasian Language Technology Association Workshop, ALTA 2015 - Parramatta, Australia; Duration: 8 Dec 2015 → 9 Dec 2015 |
Conference
Conference | 2015 Australasian Language Technology Association Workshop, ALTA 2015 |
---|---|
Country/Territory | Australia |
City | Parramatta |
Period | 8/12/15 → 9/12/15 |