TY - JOUR
T1 - Multimodal Depression Detection
T2 - Fusion Analysis of Paralinguistic, Head Pose and Eye Gaze Behaviors
AU - Alghowinem, Sharifa
AU - Goecke, Roland
AU - Wagner, Michael
AU - Epps, Julien
AU - Hyett, Matthew
AU - Parker, Gordon
AU - Breakspear, Michael
N1 - Publisher Copyright:
© 2010-2012 IEEE.
PY - 2018/10/1
Y1 - 2018/10/1
AB - An estimated 350 million people worldwide are affected by depression. Using affective sensing technology, our long-term goal is to develop an objective multimodal system that augments clinical opinion during the diagnosis and monitoring of clinical depression. This paper steps towards developing a classification system-oriented approach, where feature selection, classification and fusion-based experiments are conducted to infer which types of behaviour (verbal and nonverbal) and which behaviour combinations can best discriminate between depression and non-depression. Using statistical features extracted from speaking behaviour, eye activity, and head pose, we characterise the behaviour associated with major depression and examine the classification performance of the individual modalities and of their fusion. Using a real-world, clinically validated dataset of 30 severely depressed patients and 30 healthy control subjects, a Support Vector Machine is used for classification with several feature selection techniques. Given the statistical nature of the extracted features, feature selection based on T-tests performed better than other methods. Individual modality classification results were considerably higher than chance level (83 percent for speech, 73 percent for eye, and 63 percent for head). Fusing all modalities shows a remarkable improvement over the unimodal systems, demonstrating the complementary nature of the modalities. Among the different fusion approaches used here, feature fusion performed best, with up to 88 percent average accuracy. We believe this is due to the compatible nature of the extracted statistical features.
KW - Depression detection
KW - eye activity
KW - head pose
KW - multimodal fusion
KW - speaking behaviour
UR - http://www.scopus.com/inward/record.url?scp=85058011712&partnerID=8YFLogxK
DO - 10.1109/TAFFC.2016.2634527
M3 - Article
SN - 1949-3045
VL - 9
SP - 478
EP - 490
JO - IEEE Transactions on Affective Computing
JF - IEEE Transactions on Affective Computing
IS - 4
M1 - 7763752
ER -