TY - JOUR
T1 - Efficient eye typing with 9-direction gaze estimation
AU - Zhang, Chi
AU - Yao, Rui
AU - Cai, Jinpeng
N1 - Publisher Copyright: © 2017, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2018/8/1
Y1 - 2018/8/1
AB - Vision-based text entry systems aim to help disabled people communicate through text using eye movement. Most previous methods have employed an existing eye tracker to predict gaze direction and designed an input method based upon that. However, with these methods, eye tracking quality is easily affected by various factors and calibration takes a long time. Our paper presents a novel, efficient gaze-based text input method, which has the advantages of low cost and robustness. Users can type words by looking at an on-screen keyboard and blinking. Rather than estimating gaze angles directly to track the eyes, we introduce a method that divides the human gaze into nine directions. This method can effectively improve the accuracy of making a selection by gaze and blinks. We built a Convolutional Neural Network (CNN) model for 9-direction gaze estimation. On the basis of the 9-direction gaze, we used a nine-key T9 input method, which was widely used in candy bar phones. Candy bar phones were very popular worldwide decades ago and have cultivated strong user habits and language models. To train a robust gaze estimator, we created a large-scale dataset with images of eyes sourced from 25 people. According to the results of our experiments, our CNN model is able to accurately estimate different people’s gaze under various lighting conditions. In consideration of disabled people’s needs, we removed the complex calibration process. The input method can run in screen mode and portable off-screen mode. Moreover, the datasets used in our experiments are made available to the community to allow further research.
KW - Convolutional neural network
KW - Eye tracking
KW - Gaze estimation
KW - Human-computer interaction
UR - http://www.scopus.com/inward/record.url?scp=85034743223&partnerID=8YFLogxK
U2 - 10.1007/s11042-017-5426-y
DO - 10.1007/s11042-017-5426-y
M3 - Article
SN - 1380-7501
VL - 77
SP - 19679
EP - 19696
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 15
ER -