TY - JOUR
T1 - Improving child speech disorder assessment by incorporating out-of-domain adult speech
AU - Smith, Daniel
AU - Sneddon, Alex
AU - Ward, Lauren
AU - Duenser, Andreas
AU - Freyne, Jill
AU - Silvera-Tawil, David
AU - Morgan, Angela
N1 - Publisher Copyright:
Copyright © 2017 ISCA.
PY - 2017
Y1 - 2017
N2 - This paper describes the continued development of a system to provide early assessment of speech development issues in children and better triaging to professional services. Whilst corpora of children's speech are increasingly available, recognition of disordered children's speech is still a data-scarce task. Transfer learning methods have been shown to be effective at leveraging out-of-domain data to improve ASR performance in similar data-scarce applications. This paper combines transfer learning, with previously developed methods for constrained decoding based on expert speech pathology knowledge and knowledge of the target text. Results of this study show that transfer learning with out-of-domain adult speech can improve phoneme recognition for disordered children's speech. Specifically, a Deep Neural Network (DNN) trained on adult speech and fine-tuned on a corpus of disordered children's speech reduced the phoneme error rate (PER) of a DNN trained on a children's corpus from 16.3% to 14.2%. Furthermore, this fine-tuned DNN also improved the performance of a Hierarchal Neural Network based acoustic model previously used by the system with a PER of 19.3%. We close with a discussion of our planned future developments of the system.
AB - This paper describes the continued development of a system to provide early assessment of speech development issues in children and better triaging to professional services. Whilst corpora of children's speech are increasingly available, recognition of disordered children's speech is still a data-scarce task. Transfer learning methods have been shown to be effective at leveraging out-of-domain data to improve ASR performance in similar data-scarce applications. This paper combines transfer learning, with previously developed methods for constrained decoding based on expert speech pathology knowledge and knowledge of the target text. Results of this study show that transfer learning with out-of-domain adult speech can improve phoneme recognition for disordered children's speech. Specifically, a Deep Neural Network (DNN) trained on adult speech and fine-tuned on a corpus of disordered children's speech reduced the phoneme error rate (PER) of a DNN trained on a children's corpus from 16.3% to 14.2%. Furthermore, this fine-tuned DNN also improved the performance of a Hierarchal Neural Network based acoustic model previously used by the system with a PER of 19.3%. We close with a discussion of our planned future developments of the system.
KW - Automated Speech Recognition
KW - Speech Assessment Tools
KW - Speech Therapy
UR - http://www.scopus.com/inward/record.url?scp=85039165881&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2017-455
DO - 10.21437/Interspeech.2017-455
M3 - Conference article
AN - SCOPUS:85039165881
SN - 2308-457X
VL - 2017-August
SP - 2690
EP - 2694
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017
Y2 - 20 August 2017 through 24 August 2017
ER -