zbMATH — the first resource for mathematics

Baseline drift estimation for air quality data using quantile trend filtering. (English) Zbl 1446.62115
Summary: We address the problem of estimating smoothly varying baseline trends in time series data. This problem arises in a wide range of fields, including chemistry, macroeconomics and medicine; however, our study is motivated by the analysis of data from low-cost air quality sensors. Our methods extend the quantile trend filtering framework to enable the estimation of multiple quantile trends simultaneously while ensuring that the quantiles do not cross. To handle the computational challenge posed by very long time series, we propose a parallelizable alternating direction method of multipliers (ADMM) algorithm. The ADMM algorithm enables the estimation of trends in a piecewise manner, both reducing the computation time and extending the limits of the method to larger data sizes. We also address smoothing parameter selection and propose a modified criterion based on the extended Bayesian information criterion. Through simulation studies and our motivating application to low-cost air quality sensor data, we demonstrate that our model provides better quantile trend estimates than existing methods and improves signal classification of low-cost air quality sensor output.
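As a rough illustration of the idea behind quantile trend filtering, the sketch below fits a single quantile trend by minimizing the check (pinball) loss plus an \(\ell_1\) penalty on second differences, written as a linear program. It is not the authors' method: it handles one quantile only (so no non-crossing constraints), uses a generic LP solver from SciPy rather than the ADMM algorithm described in the summary, and builds dense matrices, so it is suitable only for short series. The function names `second_difference_matrix` and `quantile_trend_filter`, and the parameters `tau` (quantile level) and `lam` (smoothing parameter), are hypothetical names chosen for this example.

```python
import numpy as np
from scipy.optimize import linprog


def second_difference_matrix(n):
    """Return the (n-2) x n matrix D with (D @ x)[i] = x[i] - 2*x[i+1] + x[i+2]."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D


def quantile_trend_filter(y, tau, lam):
    """Illustrative single-quantile trend filter (hypothetical helper).

    Minimizes  sum_i rho_tau(y_i - theta_i) + lam * ||D theta||_1,
    where rho_tau is the check loss, by solving an equivalent LP.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    D = second_difference_matrix(n)
    m = D.shape[0]

    # Decision variables, stacked: theta (n), u_plus (n), u_minus (n), v (m).
    # u_plus / u_minus are the positive / negative parts of the residual,
    # and v bounds |D theta| elementwise.
    c = np.concatenate([
        np.zeros(n),
        tau * np.ones(n),
        (1.0 - tau) * np.ones(n),
        lam * np.ones(m),
    ])

    # Equality constraint: theta + u_plus - u_minus = y.
    I_n = np.eye(n)
    A_eq = np.hstack([I_n, I_n, -I_n, np.zeros((n, m))])

    # Inequality constraints: D theta - v <= 0 and -D theta - v <= 0.
    Z = np.zeros((m, n))
    I_m = np.eye(m)
    A_ub = np.vstack([
        np.hstack([D, Z, Z, -I_m]),
        np.hstack([-D, Z, Z, -I_m]),
    ])
    b_ub = np.zeros(2 * m)

    # theta is free; the auxiliary variables are nonnegative.
    bounds = [(None, None)] * n + [(0, None)] * (2 * n + m)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]
```

For example, `quantile_trend_filter(y, tau=0.1, lam=5.0)` returns an estimate of the 10th-percentile trend of `y`; larger values of `lam` push the estimate toward a straight line, and a low `tau` makes the trend track the lower envelope of the signal, which is the sense in which it can serve as a baseline.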
MSC:
62G08 Nonparametric regression and quantile regression
62M20 Inference from stochastic processes and prediction
62P12 Applications of statistics to environmental and related topics
References:
[1] Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J. et al. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3 1-122. · Zbl 1229.90122
[2] Brantley, H. L., Guinness, J. and Chi, E. C. (2020). Supplement to “Baseline drift estimation for air quality data using quantile trend filtering.” https://doi.org/10.1214/19-AOAS1318SUPPA, https://doi.org/10.1214/19-AOAS1318SUPPB.
[3] Brantley, H., Hagler, G., Kimbrough, E., Williams, R., Mukerjee, S. and Neas, L. (2014). Mobile air monitoring data-processing strategies and effects on spatial air pollution trends. Atmos. Meas. Tech. 7 2169-2183.
[4] Chen, J. and Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95 759-771. · Zbl 1437.62415
[5] Coombes, K. R., Baggerly, K. A. and Morris, J. S. (2007). Pre-processing mass spectrometry data. In Fundamentals of Data Mining in Genomics and Proteomics 79-102. Springer, Berlin.
[6] Du, P., Kibbe, W. A. and Lin, S. M. (2006). Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching. Bioinformatics 22 2059-2065.
[7] Faulkner, J. R. and Minin, V. N. (2018). Locally adaptive smoothing with Markov random fields and shrinkage priors. Bayesian Anal. 13 225-252. · Zbl 06873725
[8] Franco, M. A. d. M., Milori, D. M. B. P. and Boas, P. R. V. (2018). Comparison of algorithms for baseline correction of LIBS spectra for quantifying total carbon in Brazilian soils. ArXiv Preprint. Available at arXiv:1805.03695.
[9] Gabay, D. and Mercier, B. (1975). A Dual Algorithm for the Solution of Non Linear Variational Problems via Finite Element Approximation. Institut de recherche d’informatique et d’automatique. · Zbl 0352.65034
[10] Glowinski, R. and Marrocco, A. (1975). Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-dualité, d’une classe de problèmes de Dirichlet non linéaires. Rev. Française Automat. Informat. Recherche Opérationnelle Sér. Rouge Anal. Numér. 9 41-76. · Zbl 0368.65053
[11] Gurobi Optimization, LLC (2018). Gurobi Optimizer Reference Manual.
[12] Kim, S.-J., Koh, K., Boyd, S. and Gorinevsky, D. (2009). \(l_1\) trend filtering. SIAM Rev. 51 339-360. · Zbl 1171.37033
[13] Koenker, R. (2018). quantreg: Quantile Regression. R package version 5.36.
[14] Koenker, R. and Bassett, G. Jr. (1978). Regression quantiles. Econometrica 46 33-50. · Zbl 0373.62038
[15] Koenker, R., Ng, P. and Portnoy, S. (1994). Quantile smoothing splines. Biometrika 81 673-680. · Zbl 0810.62040
[16] Luo, Y., Hargraves, R. H., Belle, A., Bai, O., Qi, X., Ward, K. R., Pfaffenberger, M. P. and Najarian, K. (2013). A hierarchical method for removal of baseline drift from biomedical signals: Application in ECG analysis. Sci. World J. 2013.
[17] Marandi, R. Z. and Sabzpoushan, S. (2015). Qualitative modeling of the decision-making process using electrooculography. Behav. Res. Methods 47 1404-1412.
[18] Mecozzi, M. (2014). A polynomial curve fitting method for baseline drift correction in the chromatographic analysis of hydrocarbons in environmental samples. APCBEE Proc. 10 2-6.
[19] Ning, X., Selesnick, I. W. and Duval, L. (2014). Chromatogram baseline estimation and denoising using sparsity (BEADS). Chemom. Intell. Lab. Syst. 139 156-167.
[20] Nychka, D., Gray, G., Haaland, P., Martin, D. and O’Connell, M. (1995). A nonparametric regression approach to syringe grading for quality improvement. J. Amer. Statist. Assoc. 90 1171-1178. · Zbl 0864.62066
[21] Nychka, D., Furrer, R., Paige, J. and Sain, S. (2017). fields: Tools for Spatial Data. R package version 9.6.
[22] Oh, H.-S., Lee, T. C. M. and Nychka, D. W. (2011). Fast nonparametric quantile regression with arbitrary smoothing methods. J. Comput. Graph. Statist. 20 510-526.
[23] Oh, H.-S., Nychka, D., Brown, T. and Charbonneau, P. (2004). Period analysis of variable stars by robust smoothing. J. Roy. Statist. Soc. Ser. C 53 15-30. · Zbl 1111.85302
[24] Pettersson, K., Jagadeesan, S., Lukander, K., Henelius, A., Hæggström, E. and Müller, K. (2013). Algorithm for automatic analysis of electro-oculographic data. Biomed. Eng. Online 12 110.
[25] Racine, J. S. and Li, K. (2017). Nonparametric conditional quantile estimation: A locally weighted quantile kernel approach. J. Econometrics 201 72-94. · Zbl 1391.62061
[26] Rudin, L. I., Osher, S. and Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Phys. D, Nonlinear Phenom. 60 259-268. · Zbl 0780.49028
[27] Snyder, E., Watkins, T., Solomon, P., Thoma, E., Williams, R., Hagler, G., Shelow, D., Hindin, D., Kilaru, V. et al. (2013). The changing paradigm of air pollution monitoring. Environ. Sci. Technol. 47 11369.
[28] Theussl, S. and Hornik, K. (2017). Rglpk: R/GNU Linear Programming Kit Interface. R package version 0.6-3.
[29] Thoma, E. D., Brantley, H. L., Oliver, K. D., Whitaker, D. A., Mukerjee, S., Mitchell, B., Wu, T., Squier, B., Escobar, E. et al. (2016). South Philadelphia passive sampler and sensor study. J. Air Waste Manage. Assoc. 66 959-970.
[30] Tibshirani, R. J. (2014). Adaptive piecewise polynomial estimation via trend filtering. Ann. Statist. 42 285-323. · Zbl 1307.62118
[31] Tibshirani, R., Saunders, M., Rosset, S., Zhu, J. and Knight, K. (2005). Sparsity and smoothness via the fused lasso. J. R. Stat. Soc. Ser. B. Stat. Methodol. 67 91-108. · Zbl 1060.62049
[32] Weingessel, A. and Turlach, B. A. (2013). quadprog: Functions to solve quadratic programming problems. R package version 1.5-5.
[33] Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer, New York. · Zbl 1397.62006
[34] Yamada, H. (2017). Estimating the trend in US real GDP using the \(\ell_1\) trend filtering. Appl. Econ. Lett. 24 713-716.
[35] Yu, K. · Zbl 0983.62017
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.