
Matrix text models. Interpretation and experimental verification of models. (Russian. English summary) Zbl 07277755
Summary: The paper considers the interpretation of matrix models of texts and text collections. Examples of computationally constructed models of text collections are presented; they demonstrate the richness of the modeling results and the possibilities for practical use of the proposed approaches. An original method for experimentally verifying the suitability of text models for semantic search and for the analysis of unstructured text information is described, and the results of the corresponding large-scale experiment are presented.
68 Computer science
91 Game theory, economics, finance, and other social and behavioral sciences
[1] D. Mimno, H. Wallach, E. Talley, M. Leenders, A. McCallum, “Optimizing semantic coherence in topic models”, Proc. of Conf. on Empirical Methods in Natural Language Processing (Edinburgh, Scotland, UK, July 27-31, 2011), 262-272
[2] D. Newman, J. H. Lau, K. Grieser, T. Baldwin, “Automatic evaluation of topic coherence”, Human Language Technologies, Annual Conf. of the North American Chapter of the ACL (Los Angeles, California, 2010), 100-108
[3] D. Newman, Y. Noh, E. Talley, S. Karimi, T. Baldwin, “Evaluating topic models for digital libraries”, Proc. of 10th Annual Joint Conf. on Digital Libraries, ACM, New York, 2010, 215-224
[4] P. Bojanowski, E. Grave, A. Joulin, T. Mikolov, Enriching word vectors with subword information, 2016, 12 pp., arXiv:
[5] M. J. Kusner, Y. Sun, N. I. Kolkin, K. Q. Weinberger, “From Word Embeddings To Document Distances”, Proc. of 32nd Inter. Conf. on Machine Learning (Lille, France, 2015), W&CP, 37, 957-966
[6] G. Huang, Ch. Guo, M. J. Kusner, Y. Sun, K. Q. Weinberger, F. Sha, “Supervised Word Mover’s Distance”, 30th Conf. on Neural Inform. Proc. Syst. (Barcelona, Spain, 2016), 9 pp.
[7] T. Saracevič, “Effects of inconsistent relevance judgments on information retrieval test results: A historical perspective”, Library Trends, 56:4 (2008), 763-783
[8] K. V. Vorontsov, “Additive Regularization for Topic Models of Text Collections”, Doklady Mathematics, 89:3 (2014), 301-304 · Zbl 1358.68242
[9] K. V. Vorontsov, A. A. Potapenko, Additivnaia reguliarizatsiia tematicheskikh modelei [Additive regularization of topic models], 2014, 22 pp.
[10] M. G. Kreines, E. M. Kreines, “Matrix text models. Text models and similarity of text contents”, MM&CS, 12:5 (2020) · Zbl 1444.68266
[11] M. G. Kreines, E. M. Kreines, “Matrix text models. Text corpora models”, MM&CS, 12:5 (2020) · Zbl 1444.68267
[12] D. Blei, J. Lafferty, “A correlated topic model of Science”, Annals of Applied Statistics, 1 (2007), 17-35 · Zbl 1129.62122
[13] M. G. Kreines, E. M. Kreines, “The control model for the selection of reference collections providing the impartial assessment of the quality of scientific and technological publications by using bibliometric and scientometric indicators”, Journal of Computer and Systems Sciences International, 55:5, 750-766 · Zbl 1384.93018
[14] W. B. Frakes, R. Baeza-Yates, Information Retrieval: Data Structures and Algorithms, Prentice Hall, Englewood Cliffs, New Jersey, 1992, 630 pp.
[15] G. Salton, C. Buckley, “Term-weighting approaches in automatic text retrieval”, Inform. Processing & Management, 24:5 (1988), 513-523
[16] S. E. Robertson, S. Walker, M. Beaulieu, “Experimentation as a way of life: Okapi at TREC”, Inform. Processing & Management, 36 (2000), 95-108
[17] S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, R. Harshman, “Indexing by Latent Semantic Analysis”, J. of American Soc. for Inform. Sci., 41:6 (1990), 391-407
[18] D. M. Blei, “Probabilistic topic models”, Communications of the ACM, 55:4 (2012), 77-84
[19] M. Chen, Z. Xu, K. Q. Weinberger, F. Sha, ICML 2012, 2012, 8 pp., arXiv:
[20] A. Perina, N. Jojic, M. Bicego, A. Truski, “Documents as multiple overlapping windows into grids of counts”, NIPS 2013, 10-18