Dynamical functional prediction and classification, with application to traffic flow prediction. (English) Zbl 1257.62090

Summary: Motivated by the need for accurate traffic flow prediction in transportation management, we propose a functional data method to analyze traffic flow patterns and predict future traffic flow. We approach the problem by treating traffic flow trajectories as samples from a mixture of stochastic processes. The proposed functional mixture prediction approach combines functional prediction with probabilistic functional classification to take distinct traffic flow patterns into account. The probabilistic classification procedure, which incorporates functional clustering and discrimination, hinges on subspace projection. The proposed methods not only assist in predicting traffic flow trajectories, but also identify distinct daily traffic flow patterns, each with its typical temporal trend and variability. The methodology is widely applicable to the analysis and prediction of longitudinally recorded functional data.
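
The idea in the summary (classify a partially observed day by projecting it onto cluster subspaces, then mix the cluster-wise predictions) can be illustrated with a minimal sketch. This is not the paper's estimator: the function names, the synthetic data, the least-squares scoring, and the Gaussian-style weighting rule with `noise_var` are all assumptions made for illustration, and cluster labels are taken as known rather than obtained by functional clustering.

```python
import numpy as np

def fit_cluster_subspace(curves, n_components=2):
    """Return the mean curve and leading principal directions of one cluster."""
    mean = curves.mean(axis=0)
    # Rows of vt are orthonormal directions on the time grid.
    _, _, vt = np.linalg.svd(curves - mean, full_matrices=False)
    return mean, vt[:n_components].T  # basis shape: (n_grid, n_components)

def mixture_predict(observed, subspaces, noise_var):
    """Forecast the rest of a day from its first len(observed) grid values."""
    m = len(observed)
    sq_dists, forecasts = [], []
    for mean, basis in subspaces:
        # Scores by least squares on the observed part (assumption of this sketch).
        scores, *_ = np.linalg.lstsq(basis[:m], observed - mean[:m], rcond=None)
        fitted = mean + basis @ scores
        sq_dists.append(np.mean((observed - fitted[:m]) ** 2))
        forecasts.append(fitted[m:])
    sq_dists = np.array(sq_dists)
    # Gaussian-style membership weights from projection residuals (assumed rule).
    logits = -m * (sq_dists - sq_dists.min()) / (2.0 * noise_var)
    weights = np.exp(logits) / np.exp(logits).sum()
    # Mixture forecast: weighted average of the cluster-wise forecasts.
    return weights, weights @ np.array(forecasts)

def simulate(rng, grid, peak, n_days, noise_sd=0.02):
    """Synthetic daily curves: one bump with random height and vertical shift."""
    height = 1.0 + 0.2 * rng.standard_normal((n_days, 1))
    shift = 0.1 * rng.standard_normal((n_days, 1))
    bump = np.exp(-0.5 * ((grid - peak) / 0.1) ** 2)
    return height * bump + shift + noise_sd * rng.standard_normal((n_days, grid.size))

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 48)
# Two hypothetical patterns: an early-peak cluster and a late-peak cluster.
subspaces = [fit_cluster_subspace(simulate(rng, grid, peak, 100)) for peak in (0.3, 0.7)]

new_day = simulate(rng, grid, 0.3, 1)[0]  # a new early-peak day
m = 24                                    # number of grid values seen so far
weights, forecast = mixture_predict(new_day[:m], subspaces, noise_var=0.02 ** 2)
print("membership weights:", np.round(weights, 3))
print("forecast RMSE:", np.sqrt(np.mean((forecast - new_day[m:]) ** 2)))
```

In this sketch the weights play the role of predicted membership probabilities, so a day that clearly follows one pattern is forecast almost entirely from that cluster, while an ambiguous day receives a blend of the cluster-wise forecasts.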


MSC:
62M20 Inference from stochastic processes and prediction
90B06 Transportation, logistics and supply chain management
62H30 Classification and discrimination; cluster analysis (statistical aspects)
90B20 Traffic problems in operations research
62M99 Inference from stochastic processes


Software: fda (R)

