# zbMATH — the first resource for mathematics

Computation of exact bootstrap confidence intervals: complexity and deterministic algorithms. (English) Zbl 1445.90025
Summary: The bootstrap is a nonparametric approach for calculating quantities, such as confidence intervals, directly from data. Since calculating exact bootstrap quantities is believed to be intractable, randomized resampling algorithms are traditionally used. In this paper, we present a new perspective on the bootstrap method through the lens of counting integer points in polyhedra. Through this new perspective, we make several advances for the bootstrap method, both theoretically and algorithmically. First, we establish several computational complexity results for the exact bootstrap method in the case of the sample mean. Second, we present the first efficient deterministic approximation algorithm (fully polynomial time approximation scheme) for producing exact bootstrap confidence intervals which, unlike traditional methods, has guaranteed bounds on the approximation error. Third, we develop a simple exact algorithm for exact bootstrap confidence intervals based on polynomial multiplication. We provide empirical evidence on real and synthetic data sets with several hundreds of data points that the proposed deterministic algorithms can quickly produce confidence intervals that are substantially more accurate than those from randomized methods, and thus are practical alternatives in applications such as clinical trials.
##### MSC:
 90B25 Reliability, availability, maintenance, inspection in operations research 62M20 Inference from stochastic processes and prediction 90C59 Approximation methods and heuristics in mathematical programming
gmp; LattE; MPFR
Full Text:
##### References:
  Arora S, Barak B (2009) Computational Complexity: A Modern Approach, 1st ed. (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar · Zbl 1193.68112  Barvinok AI (1994) A polynomial time algorithm for counting integral points in polyhedra when the dimension is fixed. Math. Oper. Res. 19(4):769-779.Link, Google Scholar · Zbl 0821.90085  Bertsimas D, Weismantel R (2005) Optimization Over Integers (Dynamic Ideas, Nashville, TN).Google Scholar  Bertsimas D, Allison K, Pulleyblank WR (2016) The Analytics Edge (Dynamic Ideas, Nashville, TN).Google Scholar  Bertsimas D, Johnson M, Kallus N (2015) The power of optimization over randomization in designing experiments involving small samples. Oper. Res. 63(4):868-876.Link, Google Scholar · Zbl 1329.90078  Chan KY, Lee SM (2001) An exact iterated bootstrap algorithm for small-sample bias reduction. Comput. Statist. Data Anal. 36(1):1-13.Crossref, Google Scholar · Zbl 1080.62516  Cormen TH, Leiserson CE, Rivest RL, Stein C (2009) Introduction to Algorithms, vol. 3 (MIT Press, Cambridge, MA).Google Scholar  De Loera JA, Hemmecke R, Tauzer J, Yoshida R (2004) Effective lattice point counting in rational convex polytopes. J. Symbolic Comput. 38(4):1273-1302.Crossref, Google Scholar · Zbl 1137.52303  DiCiccio T, Efron B (1992) More accurate confidence limits in exponential families. Biometrika 79(2):231-245.Crossref, Google Scholar · Zbl 0752.62027  DiCiccio T, Efron B (1996) Bootstrap confidence intervals. Statist. Sci. 11(3):189-212.Crossref, Google Scholar · Zbl 0955.62574  Durrett R (2010) Probability: Theory and Examples (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar · Zbl 1202.60001  Dyer M (2003) Approximate counting by dynamic programming. Proc. 35th Annual ACM Sympos. Theory Comput. (ACM, New York), 693-699.Google Scholar · Zbl 1192.90242  Dyer M, Stougie L (2006) Computational complexity of stochastic programming problems. Math. Programming 106(3):423-432.Crossref, Google Scholar · Zbl 1134.90027  Dyer M, Kannan R, Mount J (1997) Sampling contingency tables. Random Structures Algorithms 10(4):487-506.Crossref, Google Scholar · Zbl 0884.62065  Dyer M, Frieze A, Kannan R, Kapoor A, Perkovic L, Vazirani U (1993) A mildly exponential time algorithm for approximating the number of solutions to a multidimensional knapsack problem. Combinatorics Probab. Comput. 2(3):271-284.Crossref, Google Scholar · Zbl 0819.90094  Dyer ME, Frieze AM (1988) On the complexity of computing the volume of a polyhedron. SIAM J. Comput. 17(5):967-974.Crossref, Google Scholar · Zbl 0668.68049  Efron B (1979) Bootstrap methods: Another look at the jackknife. Ann. Statist. 7(1):1-26.Crossref, Google Scholar · Zbl 0406.62024  Efron B (1982) The Jackknife, the Bootstrap and Other Resampling Plans (SIAM, Philadelphia).Crossref, Google Scholar  Efron B (1987) Better bootstrap confidence intervals. J. Amer. Statist. Assoc. 82(397):171-185.Crossref, Google Scholar · Zbl 0622.62039  Efron B, Tibshirani RJ (1994) An Introduction to the Bootstrap (CRC Press, New York).Crossref, Google Scholar  Evans DL, Leemis LM, Drew JH (2006) The distribution of order statistics for discrete random variables with applications to bootstrapping. INFORMS J. Comput. 18(1):19-30.Link, Google Scholar  Fisher NI, Hall P (1991) Boostrap algorithms for small samples. J. Statist. Planning Inference 27(2):157-169.Crossref, Google Scholar  Fousse L, Hanrot G, Lefèvre V, Pélissier P, Zimmermann P (2007) MPFR: A multiple-precision binary floating-point library with correct rounding. ACM Trans. Math. Software 33(2):13-15.Crossref, Google Scholar · Zbl 1365.65302  Fürer M (2009) Faster integer multiplication. SIAM J. Comput. 39(3):979-1005.Crossref, Google Scholar · Zbl 1192.68926  Garey MR, Johnson DS (1990) Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman & Co., New York).Google Scholar  Gathen JVZ, Gerhard J (2013) Modern Computer Algebra, 3rd ed. (Cambridge University Press, New York).Crossref, Google Scholar · Zbl 1277.68002  Gopalan P, Klivans A, Meka R, Stefankovic D, Vempala S, Vigoda E (2011) An FPTAS for #knapsack and related counting problems. Proc. IEEE 52nd Annual Sympos. Foundations Comput. Sci. (FOCS) (IEEE, Piscataway, NJ), 817-826.Google Scholar · Zbl 1292.68167  Granlund T (2017) The GNU multiple precision arithmetic library. Accessed August 1, 2017, http://gmplib.org.Google Scholar  Hall P (1988) Theoretical comparison of bootstrap confidence intervals. Ann. Statist. 16(3):927-953.Crossref, Google Scholar · Zbl 0663.62046  Halman N (2016) A deterministic fully polynomial time approximation scheme for counting integer knapsack solutions made easy. Theoret. Comput. Sci. 645:41-47.Crossref, Google Scholar · Zbl 1348.68295  Halman N, Klabjan D, Mostagir M, Orlin J, Simchi-Levi D (2009) A fully polynomial-time approximation scheme for single-item stochastic inventory control with discrete demand. Math. Oper. Res. 34(3):674-685.Link, Google Scholar · Zbl 1231.90030  Hanasusanto GA, Kuhn D, Wiesemann W (2016) A comment on “Computational complexity of stochastic programming problems.” Math. Programming 159(1-2):557-569.Crossref, Google Scholar · Zbl 1345.90063  Harris JM, Hirst JL, Mossinghoff MJ (2008) Combinatorics and Graph Theory, 2nd ed. (Springer-Verlag, New York).Crossref, Google Scholar  Harvey D (2009) Faster polynomial multiplication via multipoint Kronecker substitution. J. Symbolic Comput. 44(10):1502-1510.Crossref, Google Scholar · Zbl 1194.68265  Huang J (1991) Efficient computation of the performance of bootstrap and jackknife estimators of the variance of L-statistics. J. Statist. Comput. Simulation 38(1-4):45-56.Crossref, Google Scholar · Zbl 0800.62232  Hutson AD, Ernst MD (2000) The exact bootstrap mean and variance of an L-estimator. J. Royal Statist. Soc. B Statist. Methodology 62(1):89-94.Crossref, Google Scholar · Zbl 0941.62048  Jerrum M, Sinclair A (1996) The Markov chain Monte Carlo method: An approach to approximate counting and integration. Hochbaum DS, ed., Approximation Algorithms for NP-Hard Problems (PWS Publishing, Boston), 482-520.Google Scholar  Kisielinska J (2013) The exact bootstrap method shown on the example of the mean and variance estimation. Comput. Statist. 28(3):1061-1077.Crossref, Google Scholar · Zbl 1305.65050  Kleinberg J, Rabani Y, Tardos É (1997) Allocating bandwidth for bursty connections. SIAM J. Comput. 30(1):191-215.Crossref, Google Scholar · Zbl 0979.05098  Kronecker L (1882) Grundzuge einer arithmetischen Theorie der algebraischen Grössen. J. Reine Angewandte Math. 1(92):1-122.Google Scholar · JFM 14.0038.02  Lasserre JB (2009) Linear and Integer Programming vs Linear Integration and Counting (Springer-Verlag, New York).Crossref, Google Scholar · Zbl 1188.90170  Li J, Shi T (2014) A fully polynomial-time approximation scheme for approximating a sum of random variables. Oper. Res. Lett. 42(3):197-202.Crossref, Google Scholar · Zbl 1408.90259  Linial N (1986) Hard enumeration problems in geometry and combinatorics. SIAM J. Algebraic Discrete Methods 7(2):331-335.Crossref, Google Scholar · Zbl 0596.68041  Nesterov Y (2004) Fast Fourier transform and its applications to integer knapsack problems. Discussion Paper 2004/64, CORE, Louvain-la-Neuve, Belgium.Google Scholar  Schönhage A, Strassen V (1971) Schnelle multiplikation großer zahlen. Comput. 7(3-4):281-292.Crossref, Google Scholar · Zbl 0223.68007  Štefankovic D, Vempala S, Vigoda E (2012) A deterministic polynomial-time approximation scheme for counting knapsack solutions. SIAM J. Comput. 41(2):356-366.Crossref, Google Scholar · Zbl 1253.68182  Valiant LG (1979a) The complexity of computing the permanent. Theoret. Comput. Sci. 8(2):189-201.Crossref, Google Scholar · Zbl 0415.68008  Valiant LG (1979b) The complexity of enumeration and reliability problems. SIAM J. Comput. 8(3):410-421.Crossref, · Zbl 0419.68082
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.