TY - JOUR
T1 - Lightme
T2 - analysing language in internet support groups for mental health
AU - Ferraro, Gabriela
AU - Loo Gee, Brendan
AU - Ji, Shenjia
AU - Salvador-Carulla, Luis
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020/12/1
Y1 - 2020/12/1
N2 - Background: Assisting moderators to triage harmful posts in Internet Support Groups is important to ensure the groups' safe use. Automated text classification methods that analyse the language expressed in posts on online forums are a promising solution. Methods: Natural Language Processing and Machine Learning technologies were used to build a triage post classifier using a dataset from the Reachout.com mental health forum for young people. Results: Compared with the state of the art, a solution based mainly on features from lexical resources achieved the best classification performance for crisis posts (52%), the most severe class. Six salient linguistic characteristics were found when analysing the crisis posts: (1) posts expressing hopelessness, (2) short posts expressing concise negative emotional responses, (3) long posts expressing variations of emotions, (4) posts expressing dissatisfaction with available health services, (5) posts utilising storytelling, and (6) posts expressing users seeking advice from peers during a crisis. Conclusion: It is possible to build a competitive triage classifier using features derived only from the textual content of the post. Further research is needed to translate our quantitative and qualitative findings into features, as this may improve overall performance.
AB - Background: Assisting moderators to triage harmful posts in Internet Support Groups is important to ensure the groups' safe use. Automated text classification methods that analyse the language expressed in posts on online forums are a promising solution. Methods: Natural Language Processing and Machine Learning technologies were used to build a triage post classifier using a dataset from the Reachout.com mental health forum for young people. Results: Compared with the state of the art, a solution based mainly on features from lexical resources achieved the best classification performance for crisis posts (52%), the most severe class. Six salient linguistic characteristics were found when analysing the crisis posts: (1) posts expressing hopelessness, (2) short posts expressing concise negative emotional responses, (3) long posts expressing variations of emotions, (4) posts expressing dissatisfaction with available health services, (5) posts utilising storytelling, and (6) posts expressing users seeking advice from peers during a crisis. Conclusion: It is possible to build a competitive triage classifier using features derived only from the textual content of the post. Further research is needed to translate our quantitative and qualitative findings into features, as this may improve overall performance.
UR - http://www.scopus.com/inward/record.url?scp=85116479854&partnerID=8YFLogxK
U2 - 10.1007/s13755-020-00115-7
DO - 10.1007/s13755-020-00115-7
M3 - Article
SN - 2047-2501
VL - 8
JO - Health Information Science and Systems
JF - Health Information Science and Systems
IS - 1
M1 - 34
ER -