
Sequential change point detection in high dimensional time series. (English) Zbl 1493.62524

Summary: Change point detection in high dimensional data has attracted considerable interest in recent years. Most of the literature either designs methodology for a retrospective analysis, where the whole sample is already available when the statistical inference begins, or considers online detection schemes that control the average time until a false alarm. This paper takes a different point of view and develops monitoring schemes for the online scenario, where high dimensional data arrive successively and the goal is to detect changes as fast as possible while controlling the probability of a type I error (a false alarm). We develop a sequential procedure capable of detecting changes in the mean vector of a successively observed high dimensional time series with spatial and temporal dependence. The statistical properties of the method are analyzed in the case where both the sample size and the dimension tend to infinity. In this scenario, it is shown that the new monitoring scheme has asymptotic level \(\alpha\) under the null hypothesis of no change and is consistent under the alternative of a change in at least one component of the high dimensional mean vector. The approach is based on a new type of monitoring scheme for one-dimensional data, which often turns out to be more powerful than the commonly used CUSUM and Page-CUSUM methods; the component-wise statistics are aggregated by taking their maximum. For the analysis of the asymptotic properties of our monitoring scheme, we prove that the range of a Brownian motion on a given interval is in the domain of attraction of the Gumbel distribution, a result of independent interest in extreme value theory. The finite sample properties of the new methodology are illustrated by means of a simulation study and the analysis of a data example.
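The idea of aggregating component-wise monitoring statistics by their maximum can be illustrated with a minimal Python sketch. It is not the scheme proposed in the paper: it uses the classical CUSUM detector as a stand-in for the new one-dimensional statistic, estimates the scale by the plain sample standard deviation (which ignores temporal dependence), and takes the critical value as a user-supplied number instead of deriving it from the Gumbel limit. The function name `monitor_max_cusum` and the parameters `train`, `stream`, and `threshold` are assumptions made for this illustration only.

```python
import numpy as np


def monitor_max_cusum(train, stream, threshold):
    """Illustrative max-aggregated CUSUM monitoring (not the paper's detector).

    train:     (m, d) array, historical sample assumed to be free of changes
    stream:    (T, d) array, observations arriving after the historical sample
    threshold: hypothetical critical value supplied by the user
    Returns (first alarm time in 1..T or None, array of max statistics).
    """
    train = np.asarray(train, dtype=float)
    stream = np.asarray(stream, dtype=float)
    m = train.shape[0]

    # Component-wise location and scale estimated from the stable sample.
    # Assumption: the plain standard deviation is used, which is only
    # appropriate for temporally uncorrelated data.
    mu_hat = train.mean(axis=0)
    sigma_hat = train.std(axis=0, ddof=1)

    # Partial sums of the centred monitoring data, one column per component.
    partial_sums = np.cumsum(stream - mu_hat, axis=0)

    # Classical CUSUM weighting sqrt(m) * (1 + k / m) at monitoring time k.
    k = np.arange(1, stream.shape[0] + 1)[:, None]
    weights = np.sqrt(m) * (1.0 + k / m)

    # Component-wise statistics, aggregated by the maximum over components.
    component_stats = np.abs(partial_sums) / (sigma_hat * weights)
    max_stats = component_stats.max(axis=1)

    # Stop at the first time the aggregated statistic exceeds the threshold.
    alarms = np.nonzero(max_stats > threshold)[0]
    alarm_time = int(alarms[0]) + 1 if alarms.size else None
    return alarm_time, max_stats
```

Calling `monitor_max_cusum(train, stream, threshold)` returns the first monitoring time at which the aggregated statistic crosses `threshold`, or `None` if it never does. In practice the threshold would have to be calibrated so that the false alarm probability is controlled; the sketch leaves that step out.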

MSC:

62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62G20 Asymptotic properties of nonparametric inference
62H15 Hypothesis testing in multivariate analysis
60G70 Extreme value theory; extremal stochastic processes
