Vol. 2 No. 1 (2022): Advances in Deep Learning Techniques
Articles

Pushing Boundaries with Deep Generative Models: Innovations and Applications of VAEs and GANs

Prabu Ravichandran
Sr. Data Architect, Amazon Web Services Inc., Raleigh, NC, USA

Published 08-06-2022

Keywords

  • deep generative models,
  • variational autoencoders,
  • VAEs,
  • generative adversarial networks,
  • GANs,
  • conditional generation,
  • style transfer,
  • multimodal synthesis,
  • applications,
  • innovations

How to Cite

[1] P. Ravichandran, “Pushing Boundaries with Deep Generative Models: Innovations and Applications of VAEs and GANs”, Adv. in Deep Learning Techniques, vol. 2, no. 1, pp. 37–48, Jun. 2022.

Abstract

This paper delves into the cutting-edge realm of deep generative models, focusing on variational autoencoders (VAEs) and generative adversarial networks (GANs). We explore the innovations and applications that have pushed the boundaries of these models, enabling them to generate realistic data across various domains. Beginning with an overview of VAEs and GANs, we survey recent advancements such as conditional generation, style transfer, and multimodal synthesis. We discuss how these models have been applied in diverse fields, including image generation, text-to-image synthesis, and drug discovery. Furthermore, we examine challenges and future directions in the field, emphasizing the importance of ethical considerations and interpretability. Through this comprehensive analysis, we illustrate the immense potential of VAEs and GANs in driving innovation and fostering novel applications across disciplines.
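The two model families the abstract centers on rest on two well-known training objectives: the VAE's KL-regularized reconstruction loss, made trainable by the reparameterization trick (Kingma and Welling, 2014), and the GAN's adversarial discriminator/generator losses (Goodfellow et al., 2014). As a minimal sketch of those standard formulas (NumPy only; the function names here are illustrative, not taken from the paper):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    Writing the sample this way keeps it differentiable with respect to
    mu and log_var, which is what lets a VAE be trained end-to-end.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims.

    This is the regularization term in the VAE's evidence lower bound (ELBO).
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def gan_losses(d_real, d_fake):
    """Binary cross-entropy GAN losses, given discriminator outputs in (0, 1).

    d_real = D(x) on real data, d_fake = D(G(z)) on generated samples.
    Uses the non-saturating generator loss -log D(G(z)) from Goodfellow et al.
    """
    d_loss = -np.mean(np.log(d_real) + np.log(1.0 - d_fake))
    g_loss = -np.mean(np.log(d_fake))
    return d_loss, g_loss

rng = np.random.default_rng(0)
mu, log_var = np.zeros((1, 4)), np.zeros((1, 4))
z = reparameterize(mu, log_var, rng)       # a latent sample of shape (1, 4)
print(kl_to_standard_normal(mu, log_var))  # KL is 0 when q already equals the prior
print(gan_losses(np.array([0.9]), np.array([0.1])))
```

The KL term vanishes exactly when the approximate posterior matches the standard-normal prior, and grows as the encoder's outputs drift from it; the GAN losses pull the discriminator's outputs toward 1 on real data and 0 on fakes, while the generator is rewarded for fooling it.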

References

  1. Goodfellow, Ian, et al. "Generative Adversarial Nets." Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS'14), 2014, pp. 2672-2680.
  2. Kingma, Diederik P., and Max Welling. "Auto-Encoding Variational Bayes." Proceedings of the International Conference on Learning Representations (ICLR), 2014.
  3. Radford, Alec, et al. "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks." Proceedings of the International Conference on Learning Representations (ICLR), 2016.
  4. Larsen, Anders Boesen Lindbo, et al. "Autoencoding Beyond Pixels Using a Learned Similarity Metric." Proceedings of the 33rd International Conference on Machine Learning (ICML'16), 2016, pp. 1558-1566.
  5. Denton, Emily L., Soumith Chintala, and Rob Fergus. "Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks." Advances in Neural Information Processing Systems 28 (NIPS'15), 2015, pp. 1486-1494.
  6. Salimans, Tim, et al. "Improved Techniques for Training GANs." Advances in Neural Information Processing Systems 29 (NIPS'16), 2016, pp. 2234-2242.
  7. Kingma, Diederik P., and Jimmy Ba. "Adam: A Method for Stochastic Optimization." Proceedings of the 3rd International Conference on Learning Representations (ICLR), 2015.
  8. Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein GAN." Proceedings of the 34th International Conference on Machine Learning (ICML'17), 2017, pp. 214-223.
  9. Higgins, Irina, et al. "beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework." Proceedings of the 5th International Conference on Learning Representations (ICLR), 2017.
  10. Dumoulin, Vincent, et al. "Adversarially Learned Inference." Proceedings of the 5th International Conference on Learning Representations (ICLR), 2017.
  11. Chen, Xi, et al. "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets." Advances in Neural Information Processing Systems 29 (NIPS'16), 2016, pp. 2172-2180.
  12. Karras, Tero, et al. "Progressive Growing of GANs for Improved Quality, Stability, and Variation." Proceedings of the 6th International Conference on Learning Representations (ICLR), 2018.
  13. Bowman, Samuel R., et al. "Generating Sentences from a Continuous Space." Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning (CoNLL'16), 2016, pp. 10-21.
  14. Karras, Tero, et al. "A Style-Based Generator Architecture for Generative Adversarial Networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4401-4410.
  15. Che, Tong, et al. "Mode Regularized Generative Adversarial Networks." Proceedings of the 5th International Conference on Learning Representations (ICLR), 2017.
  16. Zhu, Jun-Yan, et al. "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks." Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2242-2251.
  17. Brock, Andrew, et al. "Large Scale GAN Training for High Fidelity Natural Image Synthesis." Proceedings of the 7th International Conference on Learning Representations (ICLR), 2019.
  18. Makhzani, Alireza, et al. "Adversarial Autoencoders." arXiv preprint arXiv:1511.05644, 2015.
  19. Isola, Phillip, et al. "Image-to-Image Translation with Conditional Adversarial Networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5967-5976.
  20. Dai, Bo, and Nevin L. Zhang. "DiVA: Diverse Visual Feature Attribution." Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021, pp. 872-883.