
Discriminator feature-based inference by recycling the discriminator of GANs. (English) Zbl 1483.68334

Summary: Generative adversarial networks (GANs) successfully generate high-quality data by learning a mapping from a latent vector to the data. Various studies assert that the latent space of a GAN is semantically meaningful and can be utilized for advanced data analysis and manipulation. To analyze real data in the latent space of a GAN, it is necessary to build an inference mapping from the data to the latent vector. This paper proposes an effective algorithm that accurately infers the latent vector by utilizing GAN discriminator features. Our primary goal is to increase inference mapping accuracy with minimal training overhead. Furthermore, using the proposed algorithm, we suggest a conditional image generation algorithm, namely a spatially conditioned GAN. Extensive evaluations confirm that the proposed inference algorithm achieves more semantically accurate inference mapping than existing methods and can be successfully applied to advanced conditional image generation tasks.

MSC:

68T07 Artificial neural networks and deep learning
68U10 Computing methodologies for image processing
