Maximal coordinate discrepancy as accuracy criterion of image projective normalization for optical recognition of documents.

*(English)* Zbl 07293401

Summary: The application of projective normalization (a special case of orthocorrection and perspective correction) to photographs of documents prior to optical recognition is generally accepted. Inaccuracies in normalization can lead to recognition errors. A number of normalization accuracy criteria have been presented in the literature, but their agreement with recognition quality has not been investigated. In this paper, for the case of a fixed structured document, we justify a uniform probabilistic model of recognition errors, according to which the probability of correct recognition of a character abruptly falls to zero as the coordinate discrepancy of this character increases. For this model, we prove that the image normalization accuracy criterion equal to the maximal coordinate discrepancy in the text fields of a document depends monotonically on the probability of correct recognition of the entire document. We also show that the problem of computing the maximal coordinate discrepancy does not reduce to the nearest known problem, namely the linear-fractional programming problem. Finally, for the first time, we obtain an analytical solution to the problem of computing the maximal coordinate discrepancy on a union of polygons.
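To make the criterion in the summary concrete, the following is a minimal numerical sketch, not the analytical solution obtained in the paper. It assumes that the coordinate discrepancy at a point is the Euclidean distance between its images under a reference projective transformation and an estimated one, and that the text fields are axis-aligned rectangles (the paper treats general unions of polygons). The names `apply_homography`, `rect_grid`, and `max_discrepancy` are hypothetical, and grid sampling only approximates the true maximum from below.

```python
from math import hypot


def apply_homography(H, x, y):
    """Map (x, y) through a 3x3 projective matrix H given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ZeroDivisionError("point is mapped to infinity")
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


def rect_grid(x0, y0, x1, y1, n=20):
    """Yield an (n + 1) x (n + 1) grid of points covering a rectangle."""
    for i in range(n + 1):
        for j in range(n + 1):
            yield x0 + (x1 - x0) * i / n, y0 + (y1 - y0) * j / n


def max_discrepancy(H_ref, H_est, fields, n=20):
    """Approximate the maximal coordinate discrepancy over a union of fields.

    fields: iterable of (x0, y0, x1, y1) rectangles (an assumed simplification).
    Returns the largest sampled distance between the two mapped positions.
    """
    worst = 0.0
    for x0, y0, x1, y1 in fields:
        for x, y in rect_grid(x0, y0, x1, y1, n):
            ax, ay = apply_homography(H_ref, x, y)
            bx, by = apply_homography(H_est, x, y)
            worst = max(worst, hypot(ax - bx, ay - by))
    return worst


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    shifted = [[1, 0, 3], [0, 1, 4], [0, 0, 1]]  # pure translation by (3, 4)
    fields = [(0, 0, 10, 5), (20, 0, 30, 5)]     # two hypothetical text fields
    print(max_discrepancy(identity, shifted, fields))
```

For a pure translation the discrepancy is the same at every point, so the sampled maximum coincides with the exact one; for a genuinely projective error the discrepancy varies across the fields, which is why the maximum over the whole union is the quantity of interest.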

##### MSC:

90C25 | Convex programming |

90C32 | Fractional programming |

90C90 | Applications of mathematical programming |

##### Keywords:

orthocorrection; perspective correction; image projective normalization; optical character recognition; accuracy criteria; coordinate discrepancy; nonlinear programming

##### References:

[1] | Kunina I. A., Terekhin A. P., Gladilin S. A., Nikolaev D. P., “Blind Radial Distortion Compensation from Video Using Fast Hough Transform”, ICRMV 2016, Proc. SPIE, 2017, 1025308, 1-7 |

[2] | Shapiro L., Stockman G., Boguslavskiy A. A., Sokolov S. M., Computer Vision, BINOM. Laboratoriya znaniy, M., 2013 (in Russian) |

[3] | Putjatin E. P., Prokopenko D. O., Pechenaja E. M., “Image Normalization Issues in Projective Transformations”, Electronics and Informatics, 2:3 (1998), 82-86 (in Russian) |

[4] | Zeynalov R., Velizhev A., Konushin A., “Recovering the Shape of a Page of Text for Correcting Geometric Distortions”, Proceedings of the 19th International Conference GraphiCon-2009 (Moscow, 2009), 125-128 (in Russian) |

[5] | A. Zhukovsky, D. Nikolaev, V. Arlazarov, V. Postnikov, D. Polevoy, N. Skoryukina, T. Chernov, J. Shemiakina, A. Mukovozov, I. Konovalenko, “Segments Graph-Based Approach for Document Capture in a Smartphone Video Stream”, ICDAR 2017, v. 1, IEEE Computer Society, 2018, 337-342 |

[6] | Bolotova J. A., Spicyn V. G., Osina P. M., “An Overview of the Algorithms for Detecting Text Areas in Images and Videos”, Computer Optics, 41:3 (2017), 441-452 (in Russian) |

[7] | Shemiakina J. A., Zhukovsky A. E., Faradjev I. A., “The Research of the Algorithms of a Projective Transformation Calculation in the Problem of Planar Object Targeting by Feature Points”, Artificial Intelligence and Decision Making, 2017:1 (2017), 43-49 (in Russian) |

[8] | N. Skoryukina, J. Shemiakina, V.L. Arlazarov, I. Faradjev, “Document Localization Algorithms Based on Feature Points and Straight Lines”, ICMV 2017, Proc. SPIE, 2018, 106961H, 1-8 |

[9] | M.A. Povolotskiy, E.G. Kuznetsova, T.M. Khanipov, “Russian License Plate Segmentation Based on Dynamic Time Warping”, Proceedings ECMS 2017, 2017, 285-291 |

[10] | N.S. Skoryukina, T.S. Chernov, K.B. Bulatov, D.P. Nikolaev, V.L. Arlazarov, “Snapscreen: TV-Stream Frame Search with Projectively Distorted and Noisy Query”, ICMV 2016, Proc. SPIE, 2017, 103410, 1-5 |

[11] | Youye Xie, Gongguo Tang, W. Hoff, “Geometry-Based Populated Chessboard Recognition”, Tenth International Conference on Machine Vision (ICMV 2017), Proc. SPIE, 2018, 1069603, 1-5 |

[12] | C.S. Arvind, R. Mishra, K. Vishal, V. Gundimeda, “Vision Based Speed Breaker Detection for Autonomous Vehicle”, Tenth International Conference on Machine Vision (ICMV 2017), Proc. SPIE, 2018, 106960E, 1-9 · Zbl 1420.91368 |

[13] | M.P. Dubuisson, A.K. Jain, “A Modified Hausdorff Distance for Object Matching”, Proceedings of 12th International Conference On Pattern Recognition, v. 1, 1994, 566-568 |

[14] | D.G. Sim, O.K. Kwon, R.H. Park, “Object Matching Algorithms Using Robust Hausdorff Distance Measures”, IEEE Transactions on Image Processing, 8:3 (1999), 425-429 |

[15] | C. Orrite, J.E. Herrero, “Shape Matching of Partially Occluded Curves Invariant Under Projective Transformation”, Computer Vision and Image Understanding, 93:1 (2004), 34-64 |

[16] | Nikolayev P. P., “Projectively Invariant Description of Non-Planar Smooth Figures. 1. Preliminary Analysis of the Problem”, Sensor System, 30:4 (2016), 290-311 (in Russian) |

[17] | Balickiy A. M., Savchik A. V., Gafarov R. F., Konovalenko I. A., “About Design-Invariant Points of an Oval with a Distinguished External Line”, Information Transfer Issues, 53:3 (2017), 84-89 (in Russian) · Zbl 1390.51007 |

[18] | Savchik A. V., Nikolaev P. P., “Projective Matching Method for Ovals with Two Marked Points”, Information Technology and Computing Systems, 2018:1 (2018), 60-67 (in Russian) |

[19] | Katamanov S. N., “MTSAT-1R Automatic Geostationary Satellite Image Linking”, Modern Problems of Remote Sensing of the Earth from Space, 1:4 (2007), 63-68 (in Russian) |

[20] | S. Karpenko, I. Konovalenko, A. Miller, B. Miller, D. Nikolaev, “UAV Control on the Basis of 3D Landmark Bearing-Only Observations”, Sensors, 15:12 (2015), 29802-29820 |

[21] | Holopov I. S., “Projection Distortion Correction Algorithm for Low-Altitude Shooting”, Computer Optics, 41:2 (2017), 284-290 (in Russian) |

[22] | G.E. Legge, D.G. Pelli, G.S. Rubin, M.M. Schleske, “Psychophysics of Reading. I. Normal Vision”, Vision Research, 25:2 (1985), 239-252 |

[23] | Kunina I. A., Gladilin S. A., Nikolaev D. P., “Blind Radial Distortion Compensation in a Single Image Using Fast Hough Transform”, Computer Optics, 40:3 (2016), 395-403 (in Russian) |

[24] | Arlazarov V. V., Slavin O. A., Uskov A. V., Janiszewski I. M., “Modelling the Flow of Character Recognition Results in Video Stream”, Bulletin of the South Ural State University. Series: Mathematical Modelling, Programming and Computer Software, 11:2 (2018), 14-28 · Zbl 1400.94007 |

[25] | M. Avriel, Nonlinear Programming: Analysis and Methods, Courier Corporation, North Chelmsford, 2003 |

[26] | A. Charnes, W. W. Cooper, “Programming with Linear Fractional Functionals”, Naval Research Logistics Quarterly, 9:3-4 (1962), 181-186 · Zbl 0127.36901 |

[27] | S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, 2004 · Zbl 1058.90049 |

[28] | A. Biswas, S. Verma, D.B. Ojha, “Optimality and Convexity Theorems for Linear Fractional Programming Problem”, International Journal of Computational and Applied Mathematics, 12:3 (2017), 911-916 |

[29] | Judin D. B., Mathematical Control Methods in Conditions of Incomplete Information, Izdatel’skaya gruppa URSS, M., 2010 (in Russian) |

[30] | Rockafellar R. T., Convex Analysis, Mir, M., 1973 (in Russian) |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.