Forensic voice comparison using sub-band cepstral distances as features: A first attempt with vowels from 306 Japanese speakers under channel mismatch conditions

Yuko Kinoshita, Takashi Osanai, Frantz Clermont

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    Abstract

    This paper presents the latter part of an exploratory study of the potential of sub-band parametric cepstral distance (PCD) as an alternative forensic voice comparison (FVC) feature to formants and cepstral coefficients. Using 5 Japanese vowels produced by 306 male Japanese speakers, we conducted likelihood ratio (LR)-based FVC experiments under a channel mismatch condition, with sub-bands selected with reference to the expected formant locations. A combination of 3 sub-band PCDs from the F1, F2, and F3 ranges outperformed the full-band PCDs in speaker classification, demonstrating the promise of sub-band PCDs as an automatically extractable, robust, and linguistically interpretable acoustic feature for FVC.

    Index Terms: sub-band cepstral distance, likelihood ratio, forensic voice comparison, channel mismatch, Japanese vowels
    Original language: English
    Title of host publication: Proceedings of the 17th Australasian International Conference on Speech Science and Technology
    Editors: J Epps, J Wolfe, J Smith & C Jones
    Place of Publication: Australia
    Publisher: The Australasian Speech Science and Technology Association, Inc.
    Pages: 45-48
    Edition: Peer reviewed
    ISBN (Print): 2207-1296
    Publication status: Published - 2018
    Event: 17th Australasian International Conference on Speech Science and Technology - Sydney, Australia
    Duration: 1 Jan 2018 → …

    Conference

    Conference: 17th Australasian International Conference on Speech Science and Technology
    Country/Territory: Australia
    Period: 1/01/18 → …
    Other: December 4-7 2018
