Abstract
This paper proposes a new face image-based eye gaze estimation network to address low generalization performance. Because facial appearance and environmental conditions vary widely, conventional gaze estimation methods generalize poorly and easily overfit to the training subjects. To solve this problem, we adopt a self-attention mechanism, which has better generalization performance. However, applying self-attention directly to an image incurs a high computational cost. We therefore introduce a new projection that applies convolution to the entire face image to accurately model the local context and reduce the computational cost of self-attention. The proposed model also includes a deconvolution that transforms the down-sampled global context back to the same size as the input so that spatial information is not lost. We confirmed experimentally that the new method achieves state-of-the-art performance on the EYEDIAP, MPIIFaceGaze, Gaze360, and RT-GENE datasets, with an improvement of 0.02° to 0.30° compared with the previous state-of-the-art model. In addition, we demonstrate the generalization performance of the proposed model through a cross-dataset evaluation.
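The cost argument in the abstract can be illustrated with a rough back-of-the-envelope sketch. The abstract does not state the image resolution, kernel size, or stride, so the 224x224 input and the 3x3, stride-2 convolution below are purely hypothetical values, and the helper names are invented for illustration. The sketch only counts query-key pairs; it is not the authors' model.

```python
def conv_output_size(size, kernel, stride, padding):
    """Output length of a standard convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_output_size(size, kernel, stride, padding, output_padding):
    """Output length of a transposed convolution (dilation 1) along one dimension."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def attention_pair_count(height, width):
    """Query-key pairs when every spatial position attends to every position."""
    tokens = height * width
    return tokens * tokens


# Assumed sizes (not taken from the paper): 224x224 input, 3x3 kernel, stride 2.
full_h = full_w = 224
down_h = conv_output_size(full_h, kernel=3, stride=2, padding=1)
down_w = conv_output_size(full_w, kernel=3, stride=2, padding=1)

# Self-attention cost grows with the square of the number of positions.
full_pairs = attention_pair_count(full_h, full_w)
down_pairs = attention_pair_count(down_h, down_w)
reduction = full_pairs // down_pairs

# A matching transposed convolution restores the original spatial size.
restored_h = deconv_output_size(down_h, kernel=3, stride=2, padding=1, output_padding=1)

print(down_h, down_w, reduction, restored_h)
```

Under these assumed settings, halving each spatial dimension cuts the number of attended pairs by a factor of 16, and a transposed convolution with matching parameters maps the down-sampled feature map back to the input resolution.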
Original language | English |
---|---|
Title of host publication | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) |
Publisher | IEEE |
Pages | 4988-4996 |
Number of pages | 9 |
ISBN (Electronic) | 9781665487399 |
ISBN (Print) | 9781665487405 |
DOIs | |
Publication status | Published - 23 Aug 2022 |
Event | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) - New Orleans, United States Duration: 19 Jun 2022 → 24 Jun 2022 |
Publication series
Name | IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops |
---|---|
Publisher | IEEE |
ISSN (Print) | 2160-7508 |
ISSN (Electronic) | 2160-7516 |
Conference
Conference | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
---|---|
Abbreviated title | CVPR 2022 |
Country/Territory | United States |
City | New Orleans |
Period | 19/06/22 → 24/06/22 |
Bibliographical note
Funding Information: This work was supported by the National Research Foundation of Korea through the Korean Government (MSIT) under Grant 2021R1A2B5B01001412, and by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-00034, Clustering technologies of fragmented data for time-based data analysis).
Publisher Copyright:
© 2022 IEEE.
Keywords
- Computational modeling
- Convolution
- Deconvolution
- Estimation
- Face recognition
- Training
- Transforms