Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation

Hao Tang, Dan Xu, Nicu Sebe, Yanzhi Wang, Jason J. Corso, Yan Yan

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

183 Scopus citations

Abstract

Cross-view image translation is challenging because it involves images with drastically different views and severe deformation. In this paper, we propose a novel approach named Multi-Channel Attention SelectionGAN (SelectionGAN) that makes it possible to generate images of natural scenes in arbitrary viewpoints, based on an image of the scene and a novel semantic map. The proposed SelectionGAN explicitly utilizes the semantic information and consists of two stages. In the first stage, the condition image and the target semantic map are fed into a cycled semantic-guided generation network to produce initial coarse results. In the second stage, we refine the initial results by using a multi-channel attention selection mechanism. Moreover, uncertainty maps automatically learned from attentions are used to guide the pixel loss for better network optimization. Extensive experiments on Dayton, CVUSA and Ego2Top datasets show that our model is able to generate significantly better results than the state-of-the-art methods. The source code, data and trained models are available at https://github.com/Ha0Tang/SelectionGAN.

Original languageEnglish
Title of host publicationProceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Pages2412-2421
Number of pages10
ISBN (Electronic)9781728132938
DOIs
StatePublished - Jun 2019
Event32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
Duration: 16 Jun 201920 Jun 2019

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume2019-June
ISSN (Print)1063-6919

Conference

Conference32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Country/TerritoryUnited States
CityLong Beach
Period16/06/1920/06/19

Keywords

  • Image and Video Synthesis

Fingerprint

Dive into the research topics of 'Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation'. Together they form a unique fingerprint.

Cite this