TY - GEN
T1 - Adversarial Training of Variational Auto-Encoders for High Fidelity Image Generation
AU - Khan, Salman H.
AU - Hayat, Munawar
AU - Barnes, Nick
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/5/3
Y1 - 2018/5/3
N2 - Variational auto-encoders (VAEs) provide an attractive solution to the image generation problem. However, they tend to produce blurred and over-smoothed images due to their dependence on pixel-wise reconstruction loss. This paper introduces a new approach to alleviate this problem in VAE-based generative models. Our model simultaneously learns to match the data, reconstruction loss and the latent distributions of real and fake images to improve the quality of generated samples. To compute the loss distributions, we introduce an auto-encoder-based discriminator model which allows an adversarial learning procedure. The discriminator in our model also provides perceptual guidance to the VAE by matching the learned similarity metric of the real and fake samples in the latent space. To stabilize the overall training process, our model uses an error feedback approach to maintain the equilibrium between competing networks in the model. Our experiments show that the generated samples from our proposed model exhibit a diverse set of attributes and facial expressions and scale up to high-resolution images very well.
AB - Variational auto-encoders (VAEs) provide an attractive solution to the image generation problem. However, they tend to produce blurred and over-smoothed images due to their dependence on pixel-wise reconstruction loss. This paper introduces a new approach to alleviate this problem in VAE-based generative models. Our model simultaneously learns to match the data, reconstruction loss and the latent distributions of real and fake images to improve the quality of generated samples. To compute the loss distributions, we introduce an auto-encoder-based discriminator model which allows an adversarial learning procedure. The discriminator in our model also provides perceptual guidance to the VAE by matching the learned similarity metric of the real and fake samples in the latent space. To stabilize the overall training process, our model uses an error feedback approach to maintain the equilibrium between competing networks in the model. Our experiments show that the generated samples from our proposed model exhibit a diverse set of attributes and facial expressions and scale up to high-resolution images very well.
UR - http://www.scopus.com/inward/record.url?scp=85050949653&partnerID=8YFLogxK
U2 - 10.1109/WACV.2018.00148
DO - 10.1109/WACV.2018.00148
M3 - Conference contribution
T3 - Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
SP - 1312
EP - 1320
BT - Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th IEEE Winter Conference on Applications of Computer Vision, WACV 2018
Y2 - 12 March 2018 through 15 March 2018
ER -