TY - JOUR
T1 - Sub-band cepstral distance as an alternative to formants
T2 - Quantitative evidence from a forensic comparison experiment
AU - Kinoshita, Yuko
AU - Osanai, Takashi
AU - Clermont, Frantz
N1 - Publisher Copyright: © 2022 Elsevier Ltd
PY - 2022/9
Y1 - 2022/9
N2 - This paper demonstrates the potential of the sub-band parametric cepstral distance (PCD) formulated by Clermont and Mokhtari (1994), as an alternative to formants in acoustic phonetic research. As a cepstrum-based measure, the PCD is automatically and reliably extracted from the speech signal. By contrast, formants are time-consuming and often difficult to estimate, a well-known bottleneck for studies based on large-scale datasets. The PCD measure gives flexibility in selecting the frequency limits of any sub-band of interest within the available full band. We suggest that, if sub-band selection were guided by the acoustic–phonetic theory of speech production, PCD analysis could facilitate phonetically meaningful cepstral comparisons without relying directly on formants. We evaluate this idea by exploiting the PCD properties in the context of forensic voice comparison as an application example. The cepstral data were obtained from the vowels uttered by 306 male Japanese speakers. Similar patterns of results were observed using formants and sub-band PCDs, the latter yielding better performance. This suggests that sub-band PCDs are able to capture the spectral characteristics that we normally quantify through formants, but with better reliability and efficiency. The PCD results reported here are encouraging for other types of acoustic phonetic studies in which comparisons of spectral characteristics are required.
KW - Forensic voice comparison
KW - Formants
KW - LPCC
KW - Likelihood ratio
KW - Parametric cepstral distance
KW - Sub-band
KW - Vowel
UR - http://www.scopus.com/inward/record.url?scp=85135724959&partnerID=8YFLogxK
U2 - 10.1016/j.wocn.2022.101177
DO - 10.1016/j.wocn.2022.101177
M3 - Article
SN - 0095-4470
VL - 94
JO - Journal of Phonetics
JF - Journal of Phonetics
M1 - 101177
ER -