G-DAIC: A Gaze Initialized Framework for Description and Aesthetic-Based Image Cropping

Nora Horanyi, Yuqi Hou, Ales Leonardis, Hyung Jin Chang

Research output: Contribution to journal › Article › peer-review


Abstract

We propose a new gaze-initialised optimisation framework that generates aesthetically pleasing image crops based on a user description. We extend an existing description-based image cropping dataset by collecting user eye movements corresponding to the image captions. To best leverage the contextual information in the collected gaze data when initialising the optimisation framework, we propose two gaze-based initialisation strategies, Fixed Grid and Region Proposal. In addition, we propose an adaptive Mixed scaling method that finds the optimal output regardless of the size of the generated initialisation region and of the described part of the image. We address the runtime limitation of the state-of-the-art method with an Early termination strategy that reduces the number of iterations required to produce the output. Our experiments show that G-DAIC reduces runtime by 92.11%, and quantitative and qualitative experiments demonstrate that the proposed framework produces higher-quality image crops that more accurately reflect user intention.
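The Early termination strategy mentioned in the abstract can be illustrated with a minimal sketch: stop an iterative crop search once the best score has not improved for a fixed number of iterations, rather than running all iterations. This is our illustrative interpretation, not the paper's implementation; all function and parameter names here are assumptions.

```python
def optimise_crop(score_fn, candidates, max_iters=1000, patience=20, tol=1e-4):
    """Search candidate crops, terminating early when the best score has not
    improved by more than `tol` for `patience` consecutive iterations.
    Illustrative only; names and defaults are not taken from the paper."""
    best_crop, best_score = None, float("-inf")
    stall = 0        # iterations since the last meaningful improvement
    iters_run = 0
    for i in range(max_iters):
        iters_run = i + 1
        crop = candidates[i % len(candidates)]
        score = score_fn(crop)
        if score > best_score + tol:
            best_crop, best_score = crop, score
            stall = 0
        else:
            stall += 1
        if stall >= patience:
            break    # early termination: no recent improvement
    return best_crop, best_score, iters_run
```

With a scoring function that peaks at one candidate, the loop stops `patience` iterations after the peak instead of exhausting the budget, which is the source of the runtime savings the abstract reports.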
Original language: English
Article number: 163
Pages (from-to): 1-19
Number of pages: 19
Journal: Proceedings of the ACM on Human-Computer Interaction
Volume: 7
Issue number: ETRA
DOIs
Publication status: Published - 18 May 2023

Keywords

  • Eye-tracking
  • Gaze-based image cropping
  • Aesthetics
  • Deep network re-purposing
  • Image captioning

