Abstract
Recent StyleGAN-based face swapping methods can generate highly realistic high-resolution results, but they are often plagued by the challenge of maintaining various attributes (such as expression, pose, and illumination). One reason is that these methods usually focus on the latent codes of facial semantic features corresponding to the W/W+ space, and latent codes in these spaces are often highly entangled. To address this issue, we propose a new method, LDSwap, for disentangling and fusing latent codes in the StyleSpace. The semantic-related latent code disentangling module (SLDM) we propose can successfully achieve facial semantic feature exchange and reorganization by disentangling latent codes. In addition, we propose a channel-split adaptive feature fusion module (CAFF) that adaptively learns and fuses spatial information in the target image. This module can learn spatial features from the target image without interference from the features of the target face region. Through qualitative and quantitative evaluation, we demonstrate that LDSwap shows significant improvements over three state-of-the-art methods in maintaining the appearance of semantic features.
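To make the idea of channel-level latent code exchange concrete, the sketch below shows a generic channel-wise swap between two style codes, in the spirit of StyleSpace editing. This is an illustrative toy, not the paper's actual SLDM; the function name, the NumPy representation of style codes, and the choice of identity-related channel indices are all assumptions for demonstration.

```python
import numpy as np

def swap_style_channels(source_s, target_s, identity_channels):
    """Copy a chosen subset of channels from the source style code into
    the target style code, leaving all other channels untouched.
    (Hypothetical helper illustrating channel-level disentangled swapping;
    not the paper's actual SLDM.)"""
    swapped = target_s.copy()
    swapped[identity_channels] = source_s[identity_channels]
    return swapped

# Toy 8-channel style codes standing in for StyleSpace vectors.
source = np.arange(8, dtype=float)   # pretend source face's style code
target = np.zeros(8)                 # pretend target face's style code

# Suppose (for illustration only) channels 1, 4, and 6 encode identity.
result = swap_style_channels(source, target, [1, 4, 6])
# result now mixes source identity channels with target attribute channels
```

Because StyleSpace channels are more disentangled than W/W+ directions, swapping a small, well-chosen channel subset can in principle transfer identity while leaving attribute channels (expression, pose, illumination) untouched, which is the property the abstract's comparison targets.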
| Original language | English |
|---|---|
| Pages (from-to) | 1041-1058 |
| Number of pages | 18 |
| Journal | Computational Visual Media |
| Volume | 11 |
| Issue number | 5 |
| DOIs | |
| State | Published - 2025 |
Keywords
- disentangling
- face swapping
- latent code
- StyleGAN