Modeling binary time series using Gaussian processes with application to predicting sleep states. (English) Zbl 1422.62219

Summary: Motivated by the problem of predicting sleep states, we develop a mixed effects model for binary time series with a stochastic component represented by a Gaussian process. The fixed component captures the effects of covariates on the binary-valued response. The Gaussian process captures the residual variation in the binary response that is not explained by covariates and past realizations. We develop a frequentist modeling framework that provides efficient inference and more accurate predictions. Results demonstrate improved prediction rates over existing approaches such as logistic regression, generalized additive mixed models, models for ordinal data, gradient boosting, decision trees and random forests. Using our proposed model, we show that the previous sleep state and heart rate are significant predictors of future sleep states. Simulation studies also show that our proposed method is promising and robust. To handle computational complexity, we use the Laplace approximation, golden section search and successive parabolic interpolation. With this paper, we also submit an R package (HIBITS) that implements the proposed procedure.
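The summary names golden section search as one of the numerical tools used to keep the computation manageable. The following is a minimal Python sketch of that technique alone, assuming a unimodal one-dimensional objective on a bracketing interval. It is not taken from the HIBITS package; the function name `golden_section_minimize` and the toy objective are hypothetical, and the paper's actual objective (presumably a likelihood in some model parameter) is not reproduced here.

```python
import math


def golden_section_minimize(f, lo, hi, tol=1e-8, max_iter=200):
    """Minimize a unimodal function f on [lo, hi] by golden section search.

    Hypothetical helper for illustration; not part of HIBITS.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, about 0.618
    a, b = lo, hi
    # Two interior probe points placed symmetrically in [a, b].
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if b - a < tol:
            break
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


# Toy objective standing in for a one-parameter criterion (an assumption).
def toy_objective(x):
    return (x - 2.0) ** 2 + 1.0


x_min = golden_section_minimize(toy_objective, 0.0, 5.0)
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation of the objective is needed per step. That matters when every evaluation is itself expensive, as it would be if each one required an inner approximation.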

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
60G15 Gaussian processes
62M20 Inference from stochastic processes and prediction
62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

HIBITS; spBayes
