TY - JOUR
T1 - More is better
T2 - Likelihood ratio-based forensic voice comparison with vocalic segmental cepstra frontends
AU - Rose, Phil
PY - 2013
Y1 - 2013
N2 - The suitability of vowel cepstral spectra for forensic voice comparison is explored within a likelihood ratio-based framework, and non-technical explanations provided for some basic concepts of cepstral analysis and forensic voice comparison. Non-contemporaneous landline telephone recordings of 297 male Japanese speakers are compared using only two replicates per recording of each of their five read-out vowels. 14 cepstrally-mean-subtracted LPC cepstral coefficients modelling the spectral shape to 5 kHz are used as features. When evaluated intrinsically with kernel density multivariate likelihood ratios, all 297 same-speaker comparisons are correctly discriminated as coming from the same speaker, and only 173 of the 43,956 different-speaker comparisons (0.4%) are incorrectly evaluated as coming from the same speaker. The log-likelihood ratio cost for this comparison is very low at 0.013. Fusion with a speaker's long-term spectral data marginally improves the different-speaker error rate to 0.27% and the log-likelihood ratio cost to 0.009. It is concluded that the approach warrants further examination.
AB - The suitability of vowel cepstral spectra for forensic voice comparison is explored within a likelihood ratio-based framework, and non-technical explanations provided for some basic concepts of cepstral analysis and forensic voice comparison. Non-contemporaneous landline telephone recordings of 297 male Japanese speakers are compared using only two replicates per recording of each of their five read-out vowels. 14 cepstrally-mean-subtracted LPC cepstral coefficients modelling the spectral shape to 5 kHz are used as features. When evaluated intrinsically with kernel density multivariate likelihood ratios, all 297 same-speaker comparisons are correctly discriminated as coming from the same speaker, and only 173 of the 43,956 different-speaker comparisons (0.4%) are incorrectly evaluated as coming from the same speaker. The log-likelihood ratio cost for this comparison is very low at 0.013. Fusion with a speaker's long-term spectral data marginally improves the different-speaker error rate to 0.27% and the log-likelihood ratio cost to 0.009. It is concluded that the approach warrants further examination.
KW - Cepstrum
KW - Forensic voice comparison
KW - Likelihood ratio
KW - Vowel spectra
UR - http://www.scopus.com/inward/record.url?scp=84880076430&partnerID=8YFLogxK
U2 - 10.1558/ijsll.v20i1.77
DO - 10.1558/ijsll.v20i1.77
M3 - Article
SN - 1748-8885
VL - 20
SP - 77
EP - 116
JO - International Journal of Speech, Language and the Law
JF - International Journal of Speech, Language and the Law
IS - 1
ER -