Hoeltgebaum, Henrique; Adams, Niall; Lau, F. Din-Houn
Unsupervised streaming anomaly detection for instrumented infrastructure. (English) Zbl 1478.62328
Ann. Appl. Stat. 15, No. 3, 1101-1125 (2021).
Summary: Structural health monitoring (SHM) often involves instrumenting structures with distributed sensor networks. These networks typically provide high frequency data describing the spatiotemporal behaviour of the assets. A main objective of SHM is to reason about changes in structures’ behaviour using sensor data. We construct a streaming anomaly detection method for data from a railway bridge instrumented with a fibre-optic sensor network. The data exhibits trend over time, which may be partially attributable to environmental factors, calling for temporally adaptive estimation. Exploiting a latent structure present in the data motivates a quantity of interest for anomaly detection. This quantity is estimated, sequentially and adaptively, using a new formulation of streaming principal component analysis. Anomaly detection for this quantity is then provided using conformal prediction. Like all streaming methods, the proposed method has free control parameters which are set using simulations based on bridge data. Experiments demonstrate that this method can operate at the sampling frequency of the data while providing accurate tracking of the target quantity. Further, the anomaly detection is able to detect train passage events. Finally, the method reveals a previously unreported cyclic structure present in the data.
MSC:
62P10 Applications of statistics to biology and medical sciences; meta analysis
62L20 Stochastic approximation
Keywords: structural health monitoring; streaming PCA; stochastic gradient descent; adaptive estimation; conformal prediction
Software: PredictiveRegression; conformalInference
References:
[1] Aggarwal, C. C. (2007). Data Streams: Models and Algorithms 31. Springer, Berlin.
· Zbl 1126.68033 [2] Anagnostopoulos, C., Tasoulis, D. K., Adams, N. M., Pavlidis, N. G. and Hand, D. J. (2012). Online linear and quadratic discriminant analysis with adaptive forgetting for streaming classification. Stat. Anal. Data Min. 5 139-166. · Zbl 07260320 · doi:10.1002/sam.10151 [3] Balasubramanian, V., Ho, S.-S. and Vovk, V. (2014). Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications. Newnes, London. · Zbl 1290.68003 [4] Balzano, L., Chi, Y. and Lu, Y. M. (2018). Streaming PCA and subspace tracking: The missing data case. Proc. IEEE 106 1293-1310. [5] Benczúr, A. A., Kocsis, L. and Pálovics, R. (2018). Online machine learning in big data streams. Preprint. Available at arXiv:1802.05872. [6] Bodenham, D. A. and Adams, N. M. (2017). Continuous monitoring for changepoints in data streams using adaptive estimation. Stat. Comput. 27 1257-1270. · Zbl 1505.62072 · doi:10.1007/s11222-016-9684-8 [7] Boutsidis, C., Garber, D., Karnin, Z. and Liberty, E. (2015). Online principal components analysis. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms 887-901. SIAM, Philadelphia, PA. · Zbl 1373.62291 · doi:10.1137/1.9781611973730.61 [8] Bowers, K., Buscher, V., Dentten, R., Edwards, M., England, J., Enzer, M., Parlikad, A. K. and Schooling, J. (2016). Smart infrastructure: Getting more from strategic assets. Centre for Smart Infrastructure and Construction. [9] Burnaev, E. and Vovk, V. (2014). Efficiency of conformalized ridge regression. In Conference on Learning Theory 605-622. [10] Butler, L. J., Gibbons, N., He, P., Middleton, C. and Elshafie, M. Z. (2016a). Evaluating the early-age behaviour of full-scale prestressed concrete beams using distributed and discrete fibre optic sensors. Construction and Building Materials 126 894-912. [11] Butler, L. J., Gibbons, N., He, P., Middleton, C. and Elshafie, M. Z. (2016b). 
Evaluating the early-age behaviour of full-scale prestressed concrete beams using distributed and discrete fibre optic sensors. Construction and Building Materials 126 (Supplement C) 894-912. [12] Butler, L. J., Xu, J., He, P., Gibbons, N., Dirar, S., Middleton, C. R. and Elshafie, M. Z. (2018). Robust fibre optic sensor arrays for monitoring early-age performance of mass-produced concrete sleepers. Structural Health Monitoring 17 635-653. [13] Cardot, H. and Degras, D. (2018). Online principal component analysis in high dimension: Which algorithm to choose? Int. Stat. Rev. 86 29-50. · Zbl 07763574 · doi:10.1111/insr.12220 [14] Champ, C. W. and Woodall, W. H. (1987). Exact results for Shewhart control charts with supplementary runs rules. Technometrics 29 393-399. · Zbl 0633.62100 [15] Chernozhukov, V., Wuthrich, K. and Zhu, Y. (2018). Exact and robust conformal inference methods for predictive machine learning with dependent data. Preprint. Available at arXiv:1802.06300. [16] Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. J. Amer. Statist. Assoc. 74 829-836. · Zbl 0423.62029 [17] Das, S., Saha, P. and Patro, S. (2016). Vibration-based damage detection techniques used for health monitoring of structures: A review. Journal of Civil Structural Health Monitoring 6 477-507. [18] Domingos, P. and Hulten, G. (2003). A general framework for mining massive data streams. J. Comput. Graph. Statist. 12 945-949. [19] Farrar, C. R. and Worden, K. (2006). An introduction to structural health monitoring. Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 365 303-315. [20] Gama, J. (2010). Knowledge Discovery from Data Streams. Chapman & Hall/CRC Data Mining and Knowledge Discovery Series. CRC Press, Boca Raton, FL. · Zbl 1230.68017 [21] Giraud, L., Langou, J. and Rozloznik, M. (2005). The loss of orthogonality in the Gram-Schmidt orthogonalization process. Comput. Math. Appl. 50 1069-1075.
· Zbl 1085.65037 · doi:10.1016/j.camwa.2005.08.009 [22] Glisic, B., Inaudi, D., Lau, J. M., Mok, Y. C. and Ng, C. T. (2005). Long-term monitoring of high-rise buildings using long-gauge fibre optic sensors. In 7th International Conference on Multi-Purpose High-Rise Towers and Tall Buildings, Dubai, UAE, 10-11 December (on Conference CD, Paper #0416). [23] Goodfellow, I., Bengio, Y. and Courville, A. (2016). Deep Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge, MA. · Zbl 1373.68009 [24] Haykin, S. S. (2008). Adaptive Filter Theory. Pearson, Upper Saddle River. [25] Hernandez-Garcia, M. R. and Masri, S. F. (2014). Application of statistical monitoring using latent-variable techniques for detection of faults in sensor networks. Journal of Intelligent Material Systems and Structures 25 121-136. [26] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295-327. · Zbl 1016.62078 · doi:10.1214/aos/1009210544 [27] Jolliffe, I. T. (2011). Principal Component Analysis. Springer. [28] Kim, A. Y., Marzban, C., Percival, D. B. and Stuetzle, W. (2009). Using labeled data to evaluate change detectors in a multivariate streaming environment. Signal Process. 89 2529-2536. · Zbl 1197.94068 [29] Lau, F. D.-H., Adams, N. M., Girolami, M. A., Butler, L. J. and Elshafie, M. Z. E. B. (2018a). The role of statistics in data-centric engineering. Statist. Probab. Lett. 136 58-62. · Zbl 1463.62377 · doi:10.1016/j.spl.2018.02.035 [30] Lau, F. D.-H., Butler, L. J., Adams, N. M., Elshafie, M. Z. E. B. and Girolami, M. A. (2018b). Real-time statistical modelling of data generated from self-sensing bridges. Proceedings of the Institution of Civil Engineers: Smart Infrastructure and Construction 171 3-13. [31] Laxhammar, R. and Falkman, G. (2015). Inductive conformal anomaly detection for sequential detection of anomalous sub-trajectories. Ann. Math. Artif. Intell. 74 67-94.
· Zbl 1331.68186 · doi:10.1007/s10472-013-9381-7 [32] Lei, J., Rinaldo, A. and Wasserman, L. (2015). A conformal prediction approach to explore functional data. Ann. Math. Artif. Intell. 74 29-43. · Zbl 1317.62039 · doi:10.1007/s10472-013-9366-6 [33] Lei, J., Robins, J. and Wasserman, L. (2013). Distribution-free prediction sets. J. Amer. Statist. Assoc. 108 278-287. · Zbl 06158342 · doi:10.1080/01621459.2012.751873 [34] Lei, J. and Wasserman, L. (2014). Distribution-free prediction bands for non-parametric regression. J. R. Stat. Soc. Ser. B. Stat. Methodol. 76 71-96. · Zbl 1411.62103 · doi:10.1111/rssb.12021 [35] Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J. and Wasserman, L. (2018). Distribution-free predictive inference for regression. J. Amer. Statist. Assoc. 113 1094-1111. · Zbl 1402.62155 · doi:10.1080/01621459.2017.1307116 [36] Measures, R. M., LeBlanc, M., Liu, K., Ferguson, S., Valis, T., Hogg, D., Turner, R. and McEwen, K. (1992). Fiber optic sensors for smart structures. Optics and Lasers in Engineering 16 127-152. [37] Mezzadri, F. (2007). How to generate random matrices from the classical compact groups. Notices Amer. Math. Soc. 54 592-604. · Zbl 1156.22004 [38] Micron Optics (2013). ENLIGHT User Guide. Available at http://www.micronoptics.com/download/enlight-user-guide-revision-1-138/#. Accessed: 2019-04-06. [39] Mitliagkas, I., Caramanis, C. and Jain, P. (2013). Memory limited, streaming PCA. In Advances in Neural Information Processing Systems 2886-2894. [40] Nadler, B. (2011). On the distribution of the ratio of the largest eigenvalue to the trace of a Wishart matrix. J. Multivariate Anal. 102 363-371. · Zbl 1327.62331 · doi:10.1016/j.jmva.2010.10.005 [41] Novembre, J. and Stephens, M. (2008). Interpreting principal component analyses of spatial population genetic variation. Nat. Genet. 40 646-649. [42] Oja, E. (1992). Principal components, minor components, and linear neural networks. Neural Netw. 5 927-935. [43] Oja, E. and Karhunen, J. 
(1985). On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix. J. Math. Anal. Appl. 106 69-84. · Zbl 0583.62077 · doi:10.1016/0022-247X(85)90131-3 [44] Sanger, T. D. (1989). Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Netw. 2 459-473. [45] Scholz, M. (2007). Analysing periodic phenomena by circular PCA. In International Conference on Bioinformatics Research and Development 38-47. Springer, Berlin. [46] Stewart, G. W. (1980). The efficient generation of random orthogonal matrices with an application to condition estimators. SIAM J. Numer. Anal. 17 403-409. · Zbl 0443.65027 · doi:10.1137/0717034 [47] Todd, M. D., Nichols, J. M., Trickey, S. T., Seaver, M., Nichols, C. J. and Virgin, L. N. (2006). Bragg grating-based fibre optic sensors in structural health monitoring. Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 365 317-343. [48] Vovk, V. (2013). Conditional validity of inductive conformal predictors. Mach. Learn. 92 349-376. · Zbl 1273.68307 · doi:10.1007/s10994-013-5355-6 [49] Vovk, V., Gammerman, A. and Shafer, G. (2005). Conformal Prediction. Springer, Berlin. [50] Vovk, V., Nouretdinov, I. and Gammerman, A. (2009). On-line predictive linear regression. Ann. Statist. 37 1566-1590. · Zbl 1160.62065 · doi:10.1214/08-AOS622 [51] Warmuth, M. K. and Kuzmin, D. (2008). Randomized online PCA algorithms with regret bounds that are logarithmic in the dimension. J. Mach. Learn. Res. 9 2287-2320. · Zbl 1225.68273 [52] Weng, J., Zhang, Y. and Hwang, W.-S. (2003). Candid covariance-free incremental principal component analysis. IEEE Trans. Pattern Anal. Mach. Intell. 25 1034-1040.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.