Abstract
In this paper, a style synthesis network based on StyleGAN and a video synthesis network are trained together to generate style-synthesized video. To address the problem that gaze and expression do not transfer stably, 3D face reconstruction is applied so that key features of the head, such as pose, gaze, and expression, can be controlled using 3D face information. In addition, by training the Head2Head network's discriminators for dynamics, mouth shape, image, and gaze, a stable style-synthesized video with greater plausibility and consistency can be produced. Using the FaceForensics and MetFaces datasets, we confirmed improved performance: one video is converted into another while the target face's movement remains consistent, and natural results are generated through video synthesis using 3D face information from the source video's face.
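The abstract mentions separate discriminators for dynamics, mouth shape, image, and gaze. A common way to combine several discriminators is to sum their adversarial losses with per-discriminator weights. The sketch below illustrates that pattern only; the discriminator names are taken from the abstract, but the weights, the non-saturating loss form, and all function names are assumptions, not the paper's actual formulation.

```python
import math

# Hypothetical per-discriminator weights (not given in the abstract).
WEIGHTS = {"image": 1.0, "dynamics": 0.5, "mouth": 0.5, "gaze": 0.25}

def nonsat_g_loss(score: float) -> float:
    """Non-saturating generator loss -log(D(G(z))) for one
    discriminator score in (0, 1)."""
    return -math.log(score)

def total_generator_loss(scores: dict) -> float:
    """Weighted sum of adversarial losses over all discriminators."""
    return sum(WEIGHTS[name] * nonsat_g_loss(s) for name, s in scores.items())

# Example: scores each discriminator assigns to a generated frame.
scores = {"image": 0.8, "dynamics": 0.6, "mouth": 0.7, "gaze": 0.9}
loss = total_generator_loss(scores)
```

The weighted sum lets training emphasize per-frame image realism while still penalizing temporally inconsistent dynamics or mismatched gaze.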
| Translated title of the contribution | Style Synthesis of Speech Videos Through Generative Adversarial Neural Networks |
|---|---|
| Original language | English |
| Pages (from-to) | 465-472 |
| Number of pages | 8 |
| Journal | KIPS Transactions on Software and Data Engineering (정보처리학회논문지. 소프트웨어 및 데이터 공학) |
| Volume | 11 |
| Issue number | 11 |
| DOIs | |
| State | Published - Nov 2022 |