An upper bound on the convergence rate of a second functional in optimal sequence alignment. (English) Zbl 1429.60017

Summary: Consider finite sequences \(X_{[1,n]}=X_{1}\dots,X_{n}\) and \(Y_{[1,n]}=Y_{1}\dots,Y_{n}\) of length \(n\), consisting of i.i.d. samples of random letters from a finite alphabet, and let \(S\) and \(T\) be chosen i.i.d. randomly from the unit ball in the space of symmetric scoring functions over this alphabet augmented by a gap symbol. We prove a probabilistic upper bound of linear order in \((\ln(n))^{1/4}n^{3/4}\) for the deviation of the score relative to \(T\) of optimal alignments with gaps of \(X_{[1,n]}\) and \(Y_{[1,n]}\) relative to \(S\). It remains an open problem to prove a lower bound. Our result contributes to the understanding of the microstructure of optimal alignments relative to one given scoring function, extending a theory begun in [the first two authors, J. Stat. Phys. 153, No. 3, 512–529 (2013; Zbl 1315.60011)].


60C05 Combinatorial probability
60F05 Central limit and other weak theorems
60F10 Large deviations
60D99 Geometric probability and stochastic geometry


Zbl 1315.60011
Full Text: DOI arXiv Euclid