×

Non-asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm. (English) Zbl 1403.60026

Summary: Given a non-negative random variable \(W\) and \(\theta >0\), let the generalized Dickman transformation map the distribution of \(W\) to that of \[ W^*=_d U^{1/\theta}(W+1), \] where \(U \sim{\mathcal U} [0,1]\), a uniformly distributed variable on the unit interval, independent of \(W\), and where \(=_d\) denotes equality in distribution. It is well known that \(W^*\) and \(W\) are equal in distribution if and only if \(W\) has the generalized Dickman distribution \({\mathcal D}_\theta \). We demonstrate that the Wasserstein distance \(d_1\) between \(W\), a non-negative random variable with finite mean, and \(D_\theta\) having distribution \({\mathcal D}_\theta \) obeys the inequality \[ d_1(W,D_\theta) \leq (1+\theta)d_1(W,W^*). \] The specialization of this bound to the case \(\theta =1\) and coupling constructions yield \( d_1(W_{n,1},D_1) \leq \frac{8\log (n/2)+10} {n}\) for all \(n \geq 1\), where for \(m \geq 1 W_{n,m}=\frac{1} {n}C_{n,m}-1,\) and \(C_{n,m}\) is the number of comparisons made by the Quickselect algorithm to find the \(m^{th}\) smallest element of a list of \(n\) distinct numbers. A similar bound holds for \(W_{n,m}\) for \(m \geq 2\), and together recover and quantify the results of H.-K. Hwang and T.-H. Tsai [Comb. Probab. Comput. 11, No. 4, 353–371 (2002; Zbl 1008.68044)] that show distributional convergence of \(W_{n,m}\) to the standard Dickman distribution \({\mathcal D}_1\) in the asymptotic regime \(m=o(n)\). By comparison to an exact expression for the expected running time \(E[C_{n,m}]\), lower bounds are provided that show the rate is not improvable for \(m \neq 2\).

MSC:

60F05 Central limit and other weak theorems
68P10 Searching and sorting
68Q25 Analysis of algorithms and problem complexity

Citations:

Zbl 1008.68044

Software:

Find
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] Richard Arratia, A. D. Barbour, and Simon Tavaré, Logarithmic combinatorial structures: a probabilistic approach, EMS Monographs in Mathematics, European Mathematical Society (EMS), Zürich, 2003. · Zbl 1040.60001
[2] Ehsan Azmoodeh, Benjamin Arras, Guillaume Poly, and Yvik Swan, Distances between probability distributions via characteristic functions and biasing, arXiv preprint:1605.06819 (2016).
[3] A. D. Barbour and Bruno Nietlispach, Approximation by the Dickman distribution and quasi-logarithmic combinatorial structures, Electron. J. Probab. 16 (2011), no. 29, 880–902. · Zbl 1225.60018 · doi:10.1214/EJP.v16-881
[4] Chinmoy Bhattacharjee and Larry Goldstein, Dickman approximation in simulation, summations and perpetuities, to appear in: Bernoulli, arXiv preprint:1706.08192 (2018).
[5] Benjamin Dadoun and Ralph Neininger, A statistical view on exchanges in Quickselect, ANALCO14—Meeting on Analytic Algorithmics and Combinatorics, SIAM, Philadelphia, PA, 2014, pp. 40–51.
[6] Luc Devroye and Omar Fawzi, Simulating the Dickman distribution, Statist. Probab. Lett. 80 (2010), no. 3-4, 242–247. · Zbl 1180.62028 · doi:10.1016/j.spl.2009.10.013
[7] Karl Dickman, On the frequency of numbers containing prime factors of a certain relative magnitude, Arkiv for matematik, astronomi och fysik 22 (1930), no. 10, 1–14. · JFM 56.0178.04
[8] James Allen Fill and Svante Janson, Quicksort asymptotics, J. Algorithms 44 (2002), no. 1, 4–28, Analysis of algorithms.
[9] James Allen Fill and Takéhiko Nakama, Analysis of the expected number of bit comparisons required by Quickselect, Algorithmica 58 (2010), no. 3, 730–769. · Zbl 1202.68128
[10] Larry Goldstein, Non asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm, arXiv preprint:1703.00505v1 (2017). · Zbl 1403.60026 · doi:10.1214/18-EJP227
[11] Charles AR Hoare, Algorithm 65: find, Communications of the ACM 4 (1961), no. 7, 321–322.
[12] Hsien-Kuei Hwang and Tsung-Hsi Tsai, Quickselect and the Dickman function, Combin. Probab. Comput. 11 (2002), no. 4, 353–371. · Zbl 1008.68044 · doi:10.1017/S0963548302005138
[13] Donald E. Knuth, Mathematical analysis of algorithms, in Information processing 71 (Proc. IFIP Congress, Ljubljana, 1971), Vol. 1: Foundations and systems, (1972), 19–27.
[14] Ralph Neininger, Private communication, (2017).
[15] Mathew D. Penrose and Andrew R. Wade, Random minimal directed spanning trees and Dickman-type distributions, Adv. in Appl. Probab. 36 (2004), no. 3, 691–714. · Zbl 1068.60023 · doi:10.1239/aap/1093962229
[16] Ross G. Pinsky, On the strange domain of attraction to generalized Dickman distribution for sums of independent random variables, Electron. J. Probab. 23 (2018), Paper No. 3, 17. · Zbl 1390.60092 · doi:10.1214/17-EJP126
[17] Ross G Pinsky, A natural probabilistic model on the integers and its relation to Dickman-type distributions and Buchstab’s function, to appear in: Probability and Analysis in Interacting Physical Systems - in Honor of S.R.S. Varadhan”, Springer, arXiv preprint:1606.02965 (2019).
[18] Svetlozar T. Rachev, Probability metrics and the stability of stochastic models, Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics, John Wiley & Sons, Ltd., Chichester, 1991. · Zbl 0744.60004
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.