ISF-GAN: An Implicit Style Function for High-Resolution Image-to-Image Translation

Yahui Liu, Yajing Chen, Linchao Bao, Nicu Sebe, Bruno Lepri, Marco De Nadai

Research output: Contribution to journal › Article › peer-review

Abstract

Recently, there has been increasing interest in image editing methods that employ pre-trained unconditional image generators (e.g., StyleGAN). However, applying these methods to translate images across multiple visual domains remains challenging. Existing works often fail to preserve the domain-invariant part of the image (e.g., the identity in human face translations), to handle multiple domains, or to allow for multi-modal translations. This work proposes an implicit style function (ISF) to straightforwardly achieve multi-modal and multi-domain image-to-image translation from pre-trained unconditional generators. The ISF manipulates the semantics of a latent code to ensure that the image generated from the manipulated code lies in the desired visual domain. Our manipulations of human face and animal images show significantly improved results over the baselines. Our model enables cost-effective multi-modal unsupervised image-to-image translation at high resolution using pre-trained unconditional GANs. The code and data are available at: https://github.com/yhlleo/stylegan-mmuit.
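The abstract describes the ISF as a function that edits a pre-trained generator's latent code so the resulting image falls in a target domain, with a noise input enabling multi-modal outputs. The following is a minimal, hypothetical sketch of that idea: a small MLP mapping (latent code, domain label, style noise) to an edited latent code. The layer sizes, the residual formulation, and all names here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Hypothetical sketch of an "implicit style function" (ISF): a small MLP
# that maps a latent code w from a pre-trained GAN, a target-domain
# one-hot label d, and a noise vector z (for multi-modality) to an
# edited latent code w'. Dimensions and weights are illustrative only.

rng = np.random.default_rng(0)

W_DIM, D_DIM, Z_DIM, HIDDEN = 512, 3, 16, 256

# Randomly initialized weights stand in for trained parameters.
W1 = rng.standard_normal((W_DIM + D_DIM + Z_DIM, HIDDEN)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, W_DIM)) * 0.01
b2 = np.zeros(W_DIM)

def isf(w, d_onehot, z):
    """Map (latent, domain, noise) to an edited latent code."""
    x = np.concatenate([w, d_onehot, z])
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden layer
    return w + (h @ W2 + b2)           # residual edit keeps w's structure

w = rng.standard_normal(W_DIM)   # latent code from the generator's mapping net
d = np.eye(D_DIM)[1]             # one-hot target visual domain
z = rng.standard_normal(Z_DIM)   # style noise -> different z, different output

w_edited = isf(w, d, z)
print(w_edited.shape)            # (512,)
```

The edited code `w_edited` would then be fed back through the frozen pre-trained generator, which is what makes the approach cost-effective: only the small ISF network needs training.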
Original language: English
Article number: 9735294
Pages (from-to): 3343-3353
Number of pages: 11
Journal: IEEE Transactions on Multimedia
Volume: 25
Early online date: 15 Mar 2022
DOIs
Publication status: Published - 8 Aug 2023

Keywords

  • Codes
  • Semantics
  • Image resolution
  • Generators
  • Training
  • Task analysis
  • Visualization

