Improving child speech disorder assessment by incorporating out-of-domain adult speech

Daniel Smith, Alex Sneddon, Lauren Ward, Andreas Duenser, Jill Freyne, David Silvera-Tawil, Angela Morgan

Research output: Contribution to journalConference articlepeer-review

17 Citations (Scopus)

Abstract

This paper describes the continued development of a system to provide early assessment of speech development issues in children and better triaging to professional services. Whilst corpora of children's speech are increasingly available, recognition of disordered children's speech is still a data-scarce task. Transfer learning methods have been shown to be effective at leveraging out-of-domain data to improve ASR performance in similar data-scarce applications. This paper combines transfer learning, with previously developed methods for constrained decoding based on expert speech pathology knowledge and knowledge of the target text. Results of this study show that transfer learning with out-of-domain adult speech can improve phoneme recognition for disordered children's speech. Specifically, a Deep Neural Network (DNN) trained on adult speech and fine-tuned on a corpus of disordered children's speech reduced the phoneme error rate (PER) of a DNN trained on a children's corpus from 16.3% to 14.2%. Furthermore, this fine-tuned DNN also improved the performance of a Hierarchal Neural Network based acoustic model previously used by the system with a PER of 19.3%. We close with a discussion of our planned future developments of the system.

Original languageEnglish
Pages (from-to)2690-2694
Number of pages5
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume2017-August
DOIs
Publication statusPublished - 2017
Externally publishedYes
Event18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017 - Stockholm, Sweden
Duration: 20 Aug 201724 Aug 2017

Fingerprint

Dive into the research topics of 'Improving child speech disorder assessment by incorporating out-of-domain adult speech'. Together they form a unique fingerprint.

Cite this