TY - GEN
T1 - Evaluation data and benchmarks for cascaded speech recognition and entity extraction
AU - Zhou, Liyuan
AU - Suominen, Hanna
AU - Hanlen, Leif
N1 - Publisher Copyright: © 2015 ACM.
PY - 2015/10/30
Y1 - 2015/10/30
N2 - During clinical handover, clinicians exchange information about the patients and the state of clinical management. To improve care safety and quality, both handover and its documentation have been standardized. Speech recognition and entity extraction provide a way to help health service providers to follow these standards by implementing the handover process as a structured form, whose headings guide the handover narrative, and the documentation process as proofing and sign-off of the automatically filled-out form. In this paper, we evaluate such systems. The form considers the sections of Handover nurse, Patient introduction, My shift, Medication, Appointments, and Future care, divided into 49 mutually exclusive headings to fill out with speech-recognized and extracted entities. Our system correctly recognizes 10,244 out of 14,095 spoken words and, despite 6,692 erroneous words, its error percentage is significantly smaller than for systems submitted to the CLEF eHealth Evaluation Lab 2015. In the extraction of 35 entities with training data (i.e., 14 headings were not present in the 101 expert-annotated training documents with 8,487 words in total), the system correctly extracts 2,375 out of 3,793 words in 50 test documents after calibration on 3,937 words in 50 validation documents. This translates to over 90% F1 in extracting information for the patient's age, current bed, current room, and given name and over 70% F1 for patient's admission reason/diagnosis and last name. F1 for filtering out irrelevant information is 78%. We have made the data publicly available for 201 handover cases together with processing results and code and proposed the extraction task for CLEF eHealth 2016.
AB - During clinical handover, clinicians exchange information about the patients and the state of clinical management. To improve care safety and quality, both handover and its documentation have been standardized. Speech recognition and entity extraction provide a way to help health service providers to follow these standards by implementing the handover process as a structured form, whose headings guide the handover narrative, and the documentation process as proofing and sign-off of the automatically filled-out form. In this paper, we evaluate such systems. The form considers the sections of Handover nurse, Patient introduction, My shift, Medication, Appointments, and Future care, divided into 49 mutually exclusive headings to fill out with speech-recognized and extracted entities. Our system correctly recognizes 10,244 out of 14,095 spoken words and, despite 6,692 erroneous words, its error percentage is significantly smaller than for systems submitted to the CLEF eHealth Evaluation Lab 2015. In the extraction of 35 entities with training data (i.e., 14 headings were not present in the 101 expert-annotated training documents with 8,487 words in total), the system correctly extracts 2,375 out of 3,793 words in 50 test documents after calibration on 3,937 words in 50 validation documents. This translates to over 90% F1 in extracting information for the patient's age, current bed, current room, and given name and over 70% F1 for patient's admission reason/diagnosis and last name. F1 for filtering out irrelevant information is 78%. We have made the data publicly available for 201 handover cases together with processing results and code and proposed the extraction task for CLEF eHealth 2016.
KW - Entity extraction
KW - Evaluation
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=84964330879&partnerID=8YFLogxK
U2 - 10.1145/2802558.281464
DO - 10.1145/2802558.281464
M3 - Conference contribution
T3 - SLAM 2015 - Proceedings of the 2015 Workshop on Speech, Language and Audio in Multimedia, co-located with ACM MM 2015
SP - 15
EP - 18
BT - SLAM 2015 - Proceedings of the 2015 Workshop on Speech, Language and Audio in Multimedia, co-located with ACM MM 2015
PB - Association for Computing Machinery, Inc
T2 - 3rd Workshop on Speech, Language and Audio in Multimedia, SLAM 2015
Y2 - 30 October 2015
ER -