
Unsupervised domain adaptation with non-stochastic missing data. (English) Zbl 1485.68233

Summary: We consider unsupervised domain adaptation (UDA) for classification problems in the presence of missing data in the unlabelled target domain. More precisely, motivated by practical applications, we analyze situations where distribution shift exists between domains and where some components are systematically absent in the target domain, with no supervision available for imputing the missing target components. We propose a generative approach for imputation. Imputation is performed in a domain-invariant latent space and leverages indirect supervision from a complete source domain. We introduce a single model performing joint adaptation, imputation and classification which, under our assumptions, minimizes an upper bound of its target generalization error and performs well under various representative divergence families (\(\mathscr{H}\)-divergence, Optimal Transport). Moreover, we compare the target error of our adaptation-imputation framework with the “ideal” target error of a UDA classifier without missing target components. Our model is further improved with self-training, which brings the learned source and target class posterior distributions closer. We perform experiments on three families of datasets of different modalities: a classical digit classification benchmark and the Amazon product reviews dataset, both commonly used in UDA, as well as real-world digital advertising datasets. We show the benefits of jointly performing adaptation, classification and imputation on these datasets.
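For orientation, the kind of upper bound on the target generalization error referred to above follows the classical \(\mathscr{H}\)-divergence template of Ben-David et al. (Mach Learn, 2010); the statement below is only that standard template, not the authors' exact inequality for the adaptation-imputation setting:
\[
\varepsilon_T(h) \;\le\; \varepsilon_S(h) \;+\; \tfrac{1}{2}\, d_{\mathscr{H}\Delta\mathscr{H}}\bigl(\mathcal{D}_S, \mathcal{D}_T\bigr) \;+\; \lambda,
\qquad
\lambda \;=\; \min_{h' \in \mathscr{H}} \bigl[\varepsilon_S(h') + \varepsilon_T(h')\bigr],
\]
where \(\varepsilon_S(h)\) and \(\varepsilon_T(h)\) denote the source and target risks of a hypothesis \(h \in \mathscr{H}\), \(d_{\mathscr{H}\Delta\mathscr{H}}\) measures the discrepancy between the source and target marginal distributions \(\mathcal{D}_S\) and \(\mathcal{D}_T\), and \(\lambda\) is the error of the ideal joint hypothesis. The paper's model minimizes a bound of this form (with imputation performed in the shared latent space), and analogous bounds are considered for Optimal Transport based divergences.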

MSC:

68T07 Artificial neural networks and deep learning
62D10 Missing data
62H30 Classification and discrimination; cluster analysis (statistical aspects)
