Sketch recognition by fusion of temporal and image-based features. (English) Zbl 1209.68438
Summary: The increasing availability of pen-based hardware has recently resulted in a parallel growth in sketch-based user interfaces. Sketch-based user interfaces aim to combine the expressive power of free-hand sketching with the processing power of computers. Most sketch-based systems require intelligent ink processing capabilities, which makes the development of robust sketch recognition algorithms a primary concern in the field. So far, research in sketch recognition has produced various independent approaches to recognition, each of which uses a particular kind of information (e.g., geometric and spatial constraints, image-based features, temporal stroke-ordering patterns). These methods were designed in isolation as stand-alone algorithms, and there has been little work treating various recognition methods as alternative sources of information that can be combined to increase sketch recognition accuracy. In this paper, we focus on two such methods and fuse an image-based method with a time-based method in an attempt to combine the knowledge of how objects look (image data) with the knowledge of how they are drawn (temporal data). In the course of combining spatial and temporal information, we also introduce a mathematically well-founded fusion method for combining recognizers. Our combination method can be used for isolated sketch recognition as well as full diagram recognition. Our evaluation with two databases shows that fusing image-based and temporal features yields higher recognition rates. These results are the first to confirm the complementary nature of image-based and temporal recognition methods for full sketch recognition, which has long been suggested, but never supported by data.
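
The summary does not state which fusion rule the paper uses, so the following is only an illustrative sketch of the general idea of combining two recognizers. It assumes that each recognizer (image-based and temporal) returns a posterior probability per symbol class, and it combines them with the classical product rule for classifier combination. The function name `fuse_posteriors`, the class labels, and the probability values are hypothetical and are not taken from the paper.

```python
import math


def fuse_posteriors(image_probs, temporal_probs, eps=1e-12):
    """Product-rule fusion of two class-posterior dicts.

    Illustrative assumption only: this is NOT the paper's fusion method.
    """
    classes = image_probs.keys() & temporal_probs.keys()
    if not classes:
        raise ValueError("the two recognizers share no class labels")
    # Adding log-probabilities is equivalent to multiplying the posteriors;
    # eps guards against log(0).
    log_scores = {
        c: math.log(image_probs[c] + eps) + math.log(temporal_probs[c] + eps)
        for c in classes
    }
    # Normalize with the log-sum-exp trick so the fused scores sum to 1.
    top = max(log_scores.values())
    z = sum(math.exp(s - top) for s in log_scores.values())
    return {c: math.exp(s - top) / z for c, s in log_scores.items()}


# Hypothetical outputs of the two recognizers for one sketched symbol.
image_probs = {"arrow": 0.5, "box": 0.4, "circle": 0.1}
temporal_probs = {"arrow": 0.2, "box": 0.7, "circle": 0.1}

fused = fuse_posteriors(image_probs, temporal_probs)
best = max(fused, key=fused.get)
print(best, round(fused[best], 3))
```

In this made-up example the image-based recognizer slightly prefers "arrow" while the temporal recognizer strongly prefers "box"; the fused posterior favors "box", which illustrates how complementary evidence can change the final decision.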

MSC:
 68T10 Pattern recognition, speech recognition