
Cross-media retrieval using query dependent search methods. (English) Zbl 1210.68058

Summary: Content-based cross-media retrieval is a new type of multimedia retrieval in which the media types of the query examples and of the returned results can differ. To learn the semantic correlations among multimedia objects of different modalities, the heterogeneous multimedia objects are analyzed in the form of a MultiMedia Document (MMD), which is a set of multimedia objects that are of different media types but carry the same semantics. We first construct an MMD semi-semantic graph by jointly analyzing the heterogeneous multimedia data. After that, a Cross-Media Indexing Space (CMIS) is constructed. For each query, the optimal dimension of the CMIS is automatically determined, and cross-media retrieval is performed on a per-query basis. In this way, the most appropriate retrieval approach is selected for each query, i.e., different search methods are used for different queries. The query-dependent search methods make cross-media retrieval performance not only accurate but also stable. We also propose different relevance feedback learning methods to improve performance. The experimental results are encouraging and validate the proposed methods.

MSC:

68P20 Information storage and retrieval of data
68P10 Searching and sorting
68T10 Pattern recognition, speech recognition

Software:

Isomap

References:

[1] Lew, M.; Sebe, N.; Djeraba, C.; Jain, R., Content-based multimedia information retrieval: state-of-the-art and challenges, ACM Transactions on Multimedia Computing, Communications, and Applications, 2, 1, 1-19 (2006)
[2] Zhang, R.; Zhang, Z., Effective image retrieval based on hidden concept discovery in image database, IEEE Transactions on Image Processing, 16, 2, 562-572 (2006)
[3] X. He, W.Y. Ma, H.J. Zhang, Learning an image manifold for retrieval, in: Proceedings of ACM Multimedia Conference, 2004.
[4] Guo, G.; Li, S. Z., Content-based audio classification and retrieval by support vector machines, IEEE Transactions on Neural Networks, 14, 1, 209-215 (2003)
[5] Greenspan, H.; Goldberger, J.; Mayer, A., Probabilistic space-time video modeling via piecewise GMM, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 3, 384-396 (2004)
[6] Fan, J.; Elmagarmid, A. K.; Zhu, X.; Aref, W. G.; Wu, L., ClassView: hierarchical video shot classification, indexing, and accessing, IEEE Transactions on Multimedia, 6, 1, 70-86 (2004)
[7] M. Müller, T. Röder, M. Clausen, Efficient content-based retrieval of motion capture data, in: Proceedings of ACM SIGGRAPH Conference, 2005.
[8] Bimbo, A. D.; Pala, P., Content-based retrieval of 3D models, ACM Transactions on Multimedia Computing, Communications, and Applications, 2, 1, 20-43 (2006)
[9] J. Assfalg, A.D. Bimbo, P. Pala, Retrieval of 3D objects by visual similarity, in: Proceedings of the 6th International Workshop on Multimedia Information Retrieval, 2004.
[10] M. Haas, J. Rijsdam, B. Thomee, M. Lew, Relevance feedback: perceptual learning and retrieval in bio-computing, photos, and video, in: Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2004.
[11] Y. Chen, D. Che and K. Aberer, On the efficient evaluation of relaxed queries in biological databases, in: Proceedings of the 11th International Conference on Information and Knowledge Management, 2002.
[12] Zhuang, Y.; Yang, Y.; Wu, F., Mining semantic correlation of heterogeneous multimedia data for cross-media retrieval, IEEE Transactions on Multimedia, 10, 2, 221-229 (2008)
[13] Wu, P.; Choi, Y.; Ro, Y. M.; Won, C. S., MPEG-7 texture descriptors, International Journal of Image and Graphics, 1, 3, 547-563 (2001) · Zbl 0998.68610
[14] J. Xu, Query expansion using local and global document analysis, in: Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1996.
[15] Efthimiadis, N. E., Query expansion, Annual Review of Information Systems and Technology, 31, 121-187 (1996)
[16] Seung, H. S.; Lee, D., The manifold ways of perception, Science, 290, 22 (2000)
[17] Tenenbaum, J. B.; Silva, V. D.; Langford, J. C., A global geometric framework for nonlinear dimensionality reduction, Science, 290, 22 (2000)
[18] Belkin, M.; Niyogi, P., Laplacian eigenmaps and spectral techniques for embedding and clustering, Neural Computation, 1373-1396 (2003) · Zbl 1085.68119
[19] Kruskal, J. B.; Wish, M., Multidimensional Scaling (1977), Sage Publications, Beverly Hills, CA
[20] Balasubramanian, M.; Schwartz, E. L.; Tenenbaum, J. B.; de Silva, V.; Langford, J. C., The Isomap algorithm and topological stability, Science, 295, 5552 (2002)
[21] Lin, J., Divergence measures based on the Shannon entropy, IEEE Transactions on Information Theory, 37, 145-151 (1991) · Zbl 0712.94004
[22] Y. Rui, T.S. Huang, Optimizing learning in image retrieval, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2000.
[23] Zhuang, Y.; Yang, Y.; Wu, F.; Pan, Y., Manifold learning based cross-media retrieval: a solution to media object complementary nature, Journal of VLSI Signal Process, 46, 153-164 (2007)
[24] R. Typke, F. Wiering, R.C. Veltkamp, A survey of music information retrieval systems, in: Proceedings of ISMIR, 2005.
[25] Yang, Y.; Zhuang, Y.; Wu, F.; Pan, Y., Harmonizing hierarchical manifolds for multimedia document semantics understanding and cross-media retrieval, IEEE Transactions on Multimedia, 10, 3, 437-446 (2008)
[26] Fisher, R. A., The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7, 179-188 (1936)
[27] Nie, F.; Xiang, S.; Song, Y.; Zhang, C., Extracting the optimal dimensionality for local tensor discriminant analysis, Pattern Recognition, 42, 1, 105-114 (2009) · Zbl 1159.68545
[28] http://encarta.msn.com
[29] Roweis, S.; Saul, L., Nonlinear dimensionality reduction by locally linear embedding, Science, 290, 5500, 2323-2326 (2000)
[30] Y. Yang, Y. Zhuang, D. Xu, Y. Pan, D. Tao, S. Maybank, Retrieval based interactive cartoon synthesis via unsupervised bi-distance metric learning, ACM MM, 2009.
[31] Y. Yang, D. Xu, F. Nie, J. Luo, Y. Zhuang, Ranking with local regression and global alignment for cross-media retrieval, ACM MM, 2009.