TY - GEN
T1 - Analyzing and Predicting Emoji Usages in Social Media
AU - Zhao, Peijun
AU - Jia, Jia
AU - An, Yongsheng
AU - Liang, Jie
AU - Xie, Lexing
AU - Luo, Jiebo
N1 - Publisher Copyright:
© 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License.
PY - 2018/4/23
Y1 - 2018/4/23
N2 - Emojis can be regarded as a language for graphical expression of emotions, and have been widely used in social media. They can express more delicate feelings beyond textual information and improve the effectiveness of computer-mediated communication. Recent advances in machine learning make it possible to automatic compose text messages with emojis. However, the usages of emojis can be complicated and subtle so that analyzing and predicting emojis is a challenging problem. In this paper, we first construct a benchmark dataset of emojis with tweets and systematically investigate emoji usages in terms of tweet content, tweet structure and user demographics. Inspired by the investigation results, we further propose a multitask multimodality gated recurrent unit (mmGRU) model to predict the categories and positions of emojis. The model leverages not only multimodality information such as text, image and user demographics, but also the strong correlations between emoji categories and their positions. Our experimental results show that the proposed method can significantly improve the accuracy for predicting emojis for tweets (+9.0% in F1-value for category and +4.6% in F1-value for position). Based on the experimental results, we further conduct a series of case studies to unveil how emojis are used in social media.
AB - Emojis can be regarded as a language for graphical expression of emotions, and have been widely used in social media. They can express more delicate feelings beyond textual information and improve the effectiveness of computer-mediated communication. Recent advances in machine learning make it possible to automatic compose text messages with emojis. However, the usages of emojis can be complicated and subtle so that analyzing and predicting emojis is a challenging problem. In this paper, we first construct a benchmark dataset of emojis with tweets and systematically investigate emoji usages in terms of tweet content, tweet structure and user demographics. Inspired by the investigation results, we further propose a multitask multimodality gated recurrent unit (mmGRU) model to predict the categories and positions of emojis. The model leverages not only multimodality information such as text, image and user demographics, but also the strong correlations between emoji categories and their positions. Our experimental results show that the proposed method can significantly improve the accuracy for predicting emojis for tweets (+9.0% in F1-value for category and +4.6% in F1-value for position). Based on the experimental results, we further conduct a series of case studies to unveil how emojis are used in social media.
KW - emoji
KW - gru
KW - multimodality
KW - multitask
UR - http://www.scopus.com/inward/record.url?scp=85057387630&partnerID=8YFLogxK
U2 - 10.1145/3184558.3186344
DO - 10.1145/3184558.3186344
M3 - Conference contribution
T3 - The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018
SP - 327
EP - 334
BT - The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018
PB - Association for Computing Machinery, Inc
T2 - 27th International World Wide Web, WWW 2018
Y2 - 23 April 2018 through 27 April 2018
ER -