×

Large alphabets and incompressibility. (English) Zbl 1185.68367

Summary: We briefly survey some concepts related to empirical entropy-normal numbers, de Bruijn sequences and Markov processes-and investigate how well it approximates Kolmogorov complexity. Our results suggest \(\ell \)th-order empirical entropy stops being a reasonable complexity metric for almost all strings of length \(m\) over alphabets of size \(n\) about when \(n^{\ell }\) surpasses \(m\).

MSC:

68Q30 Algorithmic information theory (Kolmogorov complexity, etc.)
68Q45 Formal languages and automata
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Becher, V.; Figueira, S., An example of a computable absolutely normal number, Theoretical Computer Science, 270, 947-958 (2002) · Zbl 1002.11064
[2] Borel, É., Les probabilités dénombrables et leur applications arithmétiques, Rendiconti del Circolo Matematico di Palermo, 27, 247-271 (1909) · JFM 40.0283.01
[3] M. Burrows, D.J. Wheeler, A block-sorting lossless data compression algorithm, Technical Report 24, Digital Equipment Corporation, 1994; M. Burrows, D.J. Wheeler, A block-sorting lossless data compression algorithm, Technical Report 24, Digital Equipment Corporation, 1994
[4] Chaitin, G. J., On the length of programs for computing finite binary sequences: Statistical considerations, Journal of the ACM, 16, 145-159 (1969) · Zbl 0187.28303
[5] Champernowne, D. G., The construction of decimals normal in the scale of 10, Journal of the London Mathematical Society, 8, 254-260 (1933) · Zbl 0007.33701
[6] Copeland, A. H.; Erdös, P., Note on normal numbers, Bulletin of the American Mathematical Society, 52, 857-860 (1946) · Zbl 0063.00962
[7] de Bruijn, N. G., A combinatorial problem, Koninklijke Nederlandse Akademie van Wetenschappen, 49, 758-764 (1946) · Zbl 0060.02701
[8] P. Ferragina, G. Manzini, V. Mäkinen, G. Navarro, Compressed representations of sequences and full-text indexes, ACM Transactions on Algorithms, submitted for publication; P. Ferragina, G. Manzini, V. Mäkinen, G. Navarro, Compressed representations of sequences and full-text indexes, ACM Transactions on Algorithms, submitted for publication · Zbl 1321.68263
[9] Gagie, T., Compressing probability distributions, Information Processing Letters, 97, 133-136 (2006) · Zbl 1184.68249
[10] R. Grossi, A. Gupta, J.S. Vitter, An algorithmic framework for compression and text indexing, submitted for publication; R. Grossi, A. Gupta, J.S. Vitter, An algorithmic framework for compression and text indexing, submitted for publication
[11] Hagerup, T.; Rüb, C., A guided tour of Chernoff bounds, Information Processing Letters, 33, 305-308 (1990) · Zbl 0702.60021
[12] Kalai, G.; Safra, S., Threshold phenomena and influence, (Percus, A. G.; Istrate, G.; Moore, C., Computational Complexity and Statistical Physics (2006), Oxford University Press: Oxford University Press Oxford) · Zbl 1156.82317
[13] Kolmogorov, A. N., Three approaches to the quantitative definition of information, Problems in Information Transmission, 1, 1-7 (1965) · Zbl 0271.94018
[14] Kullback, S.; Leibler, R. A., On information and sufficiency, Annals of Mathematical Statistics, 22, 79-86 (1951) · Zbl 0042.38403
[15] Li, M.; Vitányi, P., An Introduction to Kolmogorov Complexity and its Applications (1997), Springer-Verlag: Springer-Verlag Berlin · Zbl 0866.68051
[16] Manzini, G., An analysis of the Burrows-Wheeler Transform, Journal of the ACM, 48, 407-430 (2001) · Zbl 1323.68262
[17] Motwani, R.; Raghavan, P., Randomized Algorithms (1995), Cambridge University Press: Cambridge University Press Cambridge · Zbl 0849.68039
[18] Munro, J. I.; Spira, P. M., Sorting and searching in multisets, SIAM Journal on Computing, 5, 1-8 (1976) · Zbl 0324.68018
[19] Rosenfeld, V. R., Enumerating De Bruijn sequences, MATCH Communications in Mathematical and in Computer Chemistry, 45, 71-83 (2002) · Zbl 1030.05006
[20] Shannon, C. E., A mathematical theory of communication, Bell System Technical Journal, 27, 379-423 (1948), 623-656 · Zbl 1154.94303
[21] Sierpinski, M. W., Démonstration élémentaire du théorème de M. Borel sur les nombres absolument normaux et détermination d’un tel nombre, Bulletin de la Société Mathématiques de France, 45, 127-132 (1917) · JFM 46.0276.02
[22] Sleator, D. D.; Tarjan, R. E., Self-adjusting binary search trees, Journal of the ACM, 32, 652-686 (1985) · Zbl 0631.68060
[23] Solomonoff, R. J., A formal theory of inductive inference, Information and Control, 7, 1-22 (1964), 224-254 · Zbl 0258.68045
[24] Turing, A. M., A note on normal numbers, (Britton, J. L., Collected Works of A.M. Turing: Pure Mathematics (1992), North-Holland: North-Holland Amsterdam), 117-119
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.