Optimal detection of the feature matching map in presence of noise and outliers. (English) Zbl 07633946

Summary: We consider the problem of finding the matching map between two sets of \(d\)-dimensional vectors from noisy observations, where the second set contains outliers. The matching map is then an injection, which can be consistently detected only if the vectors of the second set are well separated. The main result shows that, in the high-dimensional setting, a detection region of unknown injection may be characterized by the sets of vectors for which the inlier-inlier distance is of order at least \(d^{1/4}\) and the inlier-outlier distance is of order at least \(d^{1/2}\). These rates are achieved using the matching minimizing the sum of logarithms of distances between matched pairs of points. We also prove lower bounds establishing optimality of these rates. Finally, we report the results of numerical experiments on both synthetic and real world data that illustrate our theoretical results and provide further insight into the properties of the algorithms studied in this work.


62H12 Estimation in multivariate analysis
62F35 Robustness and adaptive procedures (parametric inference)
Full Text: arXiv Link


[1] Azaïs, J. M. and de Castro, Y. (2020). Multiple testing and variable selec-tion along least angle regression’s path.
[2] Bai, X., Luo, Z., Zhou, L., Fu, H., Quan, L., and Tai, C.-L. (2020). D3feat: Joint learning of dense detection and description of 3d local features. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Blanchard, G., Carpentier, A., and Gutzeit, M. (2018). Minimax Euclidean separation rates for testing convex hypotheses in R d . Electron. J. Stat., 12(2):3713-3735. MR3873534 · Zbl 1402.60126
[4] Burnashev, M. V. (1979). On the minimax detection of an inaccurately known signal in a white Gaussian noise background. Theory Probab. Appl., 24:107-119. MR0522240 · Zbl 0433.60043
[5] Calonder, M., Lepetit, V., Strecha, C., and Fua, P. (2010). Brief: Binary robust independent elementary features. In Daniilidis, K., Maragos, P., and Paragios, N., editors, Computer Vision -ECCV 2010, pages 778-792, Berlin, Heidelberg. Springer Berlin Heidelberg.
[6] Carpentier, A., Collier, O., Comminges, L., Tsybakov, A. B., and Wang, Y. (2019). Minimax rate of testing in sparse linear regression. Automation and Remote Control, 80(10):1817-1834. MR4032408 · Zbl 1456.62083
[7] Chen, J., Tian, J., Lee, N., Zheng, J., Smith, R. T., and Laine, A. F. (2010). A partial intensity invariant feature descriptor for multimodal retinal image registration. IEEE Transactions on Biomedical Engineering, 57(7):1707-1718.
[8] Collier, O. (2012). Minimax hypothesis testing for curve registration. In AISTATS 2012, volume 22 of JMLR Proceedings, pages 236-245. MR2988441 · Zbl 1334.62077
[9] Collier, O. and Dalalyan, A. S. (2013). Permutation estimation and min-imax rates of identifiability. Journal of Machine Learning Research, W & CP 31 (AI-STATS 2013):10-19.
[10] Collier, O. and Dalalyan, A. S. (2016). Minimax rates in permutation esti-mation for feature matching. The Journal of Machine Learning Research, 17(1):162-192. MR3482926 · Zbl 1360.62262
[11] Comminges, L. and Dalalyan, A. S. (2012). Tight conditions for consis-tency of variable selection in the context of high dimensionality. Ann. Stat., 40(5):2667-2696. MR3097616 · Zbl 1373.62154
[12] Comminges, L. and Dalalyan, A. S. (2013). Minimax testing of a composite null hypothesis defined via a quadratic functional in the model of regression. Electron. J. Stat., 7:146-190. MR3020417 · Zbl 1337.62090
[13] Duff, I. S. and Koster, J. (2001). On algorithms for permuting large entries to the diagonal of a sparse matrix. SIAM Journal on Matrix Analysis and Applications, 22(4):973-996. MR1824052 · Zbl 0979.05087
[14] Ermakov, M. S. (1990). Minimax detection of a signal in Gaussian white noise. Teor. Veroyatnost. i Primenen., 35(4):704-715. MR1090496 · Zbl 0724.62082
[15] Flammarion, N., Mao, C., and Rigollet, P. (2019). Optimal rates of statis-tical seriation. Bernoulli, 25(1):623-653. MR3892331 · Zbl 1442.62084
[16] Gao, C. and Zhang, A. Y. (2019). Iterative algorithm for discrete structure recovery. arXiv preprint arXiv:1911.01018. MR4404929
[17] Harwood, B. and Drummond, T. (2016). Fanng: Fast approximate near-est neighbour graphs. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 5713-5722.
[18] Ingster, Y. I. (1982). Minimax nonparametric detection of signals in white Gaussian noise. Probl. Inf. Transm., 18:130-140. MR0689340 · Zbl 0499.94002
[19] Ingster, Y. I. and Suslina, I. A. (2003). Nonparametric goodness-of-fit test-ing under Gaussian models, volume 169 of Lecture Notes in Statistics. Springer-Verlag, New York. MR1991446 · Zbl 1013.62049
[20] Itseez (2015). Open source computer vision library. https://github.com/ itseez/opencv.
[21] Jiang, Z., Xie, L., Deng, X., Xu, W., and Wang, J. (2016). Fast near-est neighbor search in the hamming space. In Tian, Q., Sebe, N., Qi, G.-J., Huet, B., Hong, R., and Liu, X., editors, MultiMedia Modeling, pages 325-336, Cham. Springer International Publishing.
[22] Jin, Y., Mishkin, D., Mishchuk, A., Matas, J., Fua, P., Yi, K. M., and Trulls, E. (2020). Image Matching across Wide Baselines: From Paper to Practice. International Journal of Computer Vision.
[23] Juditsky, A. and Nemirovski, A. (2020). Statistical inference via convex optimization. Princeton, NJ: Princeton University Press. MR4264425 · Zbl 1433.62006
[24] Kuhn, H. W. (1955). The Hungarian Method for the Assignment Problem. Naval Research Logistics Quarterly, 2(1-2):83-97. MR0075510 · Zbl 0143.41905
[25] Kuhn, H. W. (2010). The Hungarian Method for the Assignment Problem, pages 29-47. Springer Berlin Heidelberg, Berlin, Heidelberg. · Zbl 1187.90015
[26] Laurent, B. and Massart, P. (2000). Adaptive estimation of a quadratic functional by model selection. Ann. Statist., 28(5):1302-1338. MR1805785 · Zbl 1105.62328
[27] Lepski, O. V. and Tsybakov, A. B. (2000). Asymptotically exact nonpara-metric hypothesis testing in sup-norm and at a fixed point. Probability Theory and Related Fields, 117(1):17-48. MR1759508 · Zbl 0971.62022
[28] Lowe, D. G. (2004). Distinctive image features from scale-invariant key-points. International journal of computer vision, 60(2):91-110.
[29] Ma, J., Jiang, X., Fan, A., Jiang, J., and Yan, J. (2021). Image match-ing from handcrafted to deep features: A survey. International Journal of Computer Vision, 129. MR4202233 · Zbl 1483.68439
[30] Ma, R., Tony Cai, T., and Li, H. (2020). Optimal permutation recovery in permuted monotone matrix model. Journal of the American Statistical Association, pages 1-15. MR4309278
[31] Malkov, Y. A. and Yashunin, D. A. (2020). Efficient and robust approx-imate nearest neighbor search using hierarchical navigable small world graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(4):824-836.
[32] Mao, C., Pananjady, A., and Wainwright, M. J. (2020). Towards optimal es-timation of bivariate isotonic matrices with unknown permutations. Annals of Statistics, 48(6):3183-3205. MR4185805 · Zbl 1490.62129
[33] Mao, C., Weed, J., and Rigollet, P. (2018). Minimax rates and efficient al-gorithms for noisy sorting. In Algorithmic Learning Theory, pages 821-847. PMLR. MR3857331 · Zbl 1407.62066
[34] Munkres, J. (1957). Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics, 5(1):32-38. MR0093429 · Zbl 0131.36604
[35] Ndaoud, M. and Tsybakov, A. B. (2020). Optimal variable selec-tion and adaptive noisy compressed sensing. IEEE Trans. Inf. Theory, 66(4):2517-2532. MR4087700 · Zbl 1448.94081
[36] Pananjady, A. and Samworth, R. J. (2020). Isotonic regression with unknown permutations: Statistics, computation, and adaptation. arXiv preprint arXiv:2009.02609. MR4382019 · Zbl 1486.62119
[37] Pananjady, A., Wainwright, M. J., and Courtade, T. A. (2017). Lin-ear regression with shuffled data: Statistical and computational lim-its of permutation recovery. IEEE Transactions on Information Theory, 64(5):3286-3300. MR3798377 · Zbl 1395.62204
[38] Ramdas, A., Isenberg, D., Singh, A., and Wasserman, L. A. (2016). Mini-max lower bounds for linear independence testing. In IEEE International Symposium on Information Theory, ISIT 2016, Barcelona, Spain, July 10-15, 2016, pages 965-969. IEEE.
[39] Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011). Orb: An efficient alternative to sift or surf. In 2011 International Conference on Computer Vision, pages 2564-2571.
[40] Shah, N. B., Balakrishnan, S., and Wainwright, M. J. (2021). A permutation-based model for crowd labeling: Optimal estimation and ro-bustness. IEEE Transactions on Information Theory, 67(6):4162-4184. MR4289370 · Zbl 1475.62170
[41] Slawski, M. and Ben-David, E. (2019). Linear regression with sparsely per-muted data. Electronic Journal of Statistics, 13(1):1-36. MR3896144 · Zbl 1416.62398
[42] Tian, Y., Balntas, V., Ng, T., Barroso-Laguna, A., Demiris, Y., and Mikolajczyk, K. (2020). D2d: Keypoint extraction with describe to de-tect approach. In Proceedings of the Asian Conference on Computer Vision (ACCV).
[43] Tsybakov, A. B. (2009). Introduction to nonparametric estimation. Springer Series in Statistics. Springer, New York. MR2724359 · Zbl 1176.62032
[44] Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C. J., Polat, İ., Feng, Y., Moore, E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa, F., van Mulbregt, P., and SciPy 1.0 Contributors (2020).
[45] SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python.
[46] T. Galstyan et al.
[47] Nature Methods, 17:261-272.
[48] Wang, K., Zhu, N., Cheng, Y., Li, R., Zhou, T., and Long, X. (2018). Fast feature matching based on r -nearest k -means searching. CAAI Transac-tions on Intelligence Technology, 3(4):198-207.
[49] Wang, X. (2011). A fast exact k-nearest neighbors algorithm for high di-mensional search using k-means clustering and triangle inequality. In The 2011 International Joint Conference on Neural Networks, pages 1293-1299.
[50] Wei, Y. and Wainwright, M. J. (2020). The local geometry of testing in ellipses: Tight control via localized kolmogorov widths. IEEE Transactions on Information Theory, 66(8):5110-5129. MR4130665 · Zbl 1446.62126
[51] Wei, Y., Wainwright, M. J., and Guntuboyina, A. (2019a). The geometry of hypothesis testing over convex cones: Generalized likelihood ratio tests and minimax radii. The Annals of Statistics, 47(2):994-1024. MR3909958 · Zbl 1415.62006
[52] Wei, Y., Wainwright, M. J., and Guntuboyina, A. (2019b). The geometry of hypothesis testing over convex cones: Generalized likelihood ratio tests and minimax radii. The Annals of Statistics, 47(2):994-1024. MR3909958 · Zbl 1415.62006
[53] Wolfer, G. and Kontorovich, A. (2020). Minimax testing of identity to a ref-erence ergodic markov chain. In AISTATS 2020, volume 108 of Proceedings of Machine Learning Research, pages 191-201. MR3932874
[54] Xing, X., Liu, M., Ma, P., and Zhong, W. (2020). Minimax nonparamet-ric parallelism test. Journal of Machine Learning Research, 21(94):1-47. MR4119162 · Zbl 1502.62079
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.