Arratia, R.; Gordon, L.; Waterman, M. S.
The Erdős-Rényi law in distribution, for coin tossing and sequence matching. (English) Zbl 0712.92016
Ann. Stat. 18, No. 2, 539-570 (1990).

Starting from an analysis of DNA sequences in molecular biology, the authors are interested in approximations to the distributions of unusually rich matches between two independent sequences of independent, identically distributed letters from finite alphabets. These approximations are described in terms of corresponding distributional results for unusually head-rich regions found in a single random sequence of i.i.d. p-coin tosses. Bounds on the respective total variation distances are derived which converge to zero faster than some negative power of mn, the total number of pairs taken from the two alphabets. The key tools used are large deviation inequalities and the Chen-Stein method of Poisson approximation [L. H. Y. Chen, Ann. Probab. 3, 534-545 (1975; Zbl 0335.60016), and C. M. Stein, Approximate computation of expectations (1986)].

Reviewer: J. Steinebach

Cited in 1 Review
Cited in 39 Documents

MSC:
92D20 Protein sequences, DNA sequences
62E17 Approximations to statistical distributions (nonasymptotic)
60F10 Large deviations
60C05 Combinatorial probability
60F99 Limit theorems in probability theory

Keywords: head-runs; sequence matching; Erdős-Rényi law in distribution; distribution of counts of matches; moving average; scan statistics; analysis of DNA sequences; molecular biology; approximations; independent sequences of independent, identically distributed letters from finite alphabets; coin tosses; total variation distances; Chen-Stein method of Poisson approximation

Citations: Zbl 0335.60016
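
The simplest special case of the head-rich regions mentioned in the review is the pure head run. As a rough illustration of the kind of Poisson approximation involved, the following Python sketch compares a simulated probability that n tosses of a p-coin contain no run of t heads with the textbook Chen-Stein style value exp(-λ), where λ = ((n - t)(1 - p) + 1) p^t is the expected number of "declumped" run starts. This is an assumption-laden sketch and not the paper's theorem: the function names, the choice of λ, and the simulation setup are illustrative only.

```python
import math
import random


def longest_head_run(tosses):
    """Length of the longest run of consecutive heads (1s) in a 0/1 sequence."""
    best = current = 0
    for x in tosses:
        current = current + 1 if x else 0
        best = max(best, current)
    return best


def max_heads_in_window(tosses, t):
    """Largest number of heads in any window of t consecutive tosses
    (a simple scan statistic for head-rich regions)."""
    if t <= 0 or t > len(tosses):
        raise ValueError("window length t must satisfy 1 <= t <= len(tosses)")
    count = sum(tosses[:t])
    best = count
    for i in range(t, len(tosses)):
        # Slide the window one step: add the new toss, drop the oldest one.
        count += tosses[i] - tosses[i - t]
        best = max(best, count)
    return best


def poisson_no_run_probability(n, p, t):
    """Poisson approximation exp(-lam) to P(no run of t heads in n tosses).

    lam counts run starts that are preceded by a tail (or sit at position 1):
    lam = p**t + (n - t) * (1 - p) * p**t.  Illustrative assumption, not the
    paper's exact statement.
    """
    lam = ((n - t) * (1 - p) + 1) * p ** t
    return math.exp(-lam)


def empirical_no_run_probability(n, p, t, trials, seed=0):
    """Monte Carlo estimate of P(longest head run < t) for n p-coin tosses."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        tosses = [1 if rng.random() < p else 0 for _ in range(n)]
        if longest_head_run(tosses) < t:
            hits += 1
    return hits / trials
```

For example, calling `empirical_no_run_probability(2000, 0.5, 12, trials=2000)` and `poisson_no_run_probability(2000, 0.5, 12)` should give values that agree up to Monte Carlo noise. The helper `max_heads_in_window` covers the more general case where a region only needs to be rich in heads rather than consist entirely of heads; no approximation for that case is attempted here.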