## Minimal \(\sigma\)-field for flexible sufficient dimension reduction.
*(English)*
Zbl 1493.62017

Summary: Sufficient Dimension Reduction (SDR) has become an important tool for mitigating the curse of dimensionality in high dimensional regression analysis. Recently, Flexible SDR (FSDR) has been proposed to extend SDR by finding lower dimensional projections of transformed explanatory variables. The dimensions of the projections, however, cannot fully represent the extent of data reduction FSDR can achieve. As a consequence, optimality and other theoretical properties of FSDR are currently not well understood. In this article, we propose to use the \(\sigma\)-field associated with the projections, together with their dimensions, to fully characterize FSDR, and refer to this \(\sigma\)-field as the FSDR \(\sigma\)-field. We further introduce the concept of the minimal FSDR \(\sigma\)-field and regard FSDR projections with the minimal \(\sigma\)-field as optimal. Under some mild conditions, we show that the minimal FSDR \(\sigma\)-field exists and simultaneously attains the lowest dimensionality. To estimate the minimal FSDR \(\sigma\)-field, we propose a two-stage procedure called the Generalized Kernel Dimension Reduction (GKDR) method and partially establish its consistency under weak conditions. Extensive simulation experiments demonstrate that the GKDR method can effectively find the minimal FSDR \(\sigma\)-field and outperforms other existing methods. The application of GKDR to a real-life air pollution data set sheds new light on the connections between atmospheric conditions and air quality.
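The summary does not spell out the two stages of GKDR, so the sketch below does not attempt to reproduce them. As background only, it illustrates the classical kernel dimension reduction (KDR) criterion of Fukumizu, Bach and Jordan, which the method's name suggests it generalizes: the empirical objective \(\operatorname{Tr}[G_Y (G_Z + n\varepsilon I_n)^{-1}]\) with centered Gram matrices \(G_Y\) and \(G_Z\), where \(Z = B^\top X\). The function names, the Gaussian kernel, the bandwidths and the value of `eps` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def centered_gaussian_gram(Z, bandwidth=1.0):
    """Centered Gaussian-kernel Gram matrix H K H (hypothetical helper)."""
    Z = np.asarray(Z, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]
    sq_norms = np.sum(Z ** 2, axis=1)
    # Pairwise squared distances, clipped at zero to absorb rounding error.
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T, 0.0)
    K = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H


def kdr_objective(X, y, B, eps=1e-3, bw_z=1.0, bw_y=1.0):
    """Classical KDR criterion Tr[G_Y (G_Z + n*eps*I)^{-1}] with Z = X B.

    Smaller values indicate that the projection X B retains more of the
    information about y. Kernel choice and tuning values are assumptions.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    G_z = centered_gaussian_gram(X @ B, bw_z)
    G_y = centered_gaussian_gram(y, bw_y)
    # trace(A^{-1} G_y) equals trace(G_y A^{-1}); solve avoids an explicit inverse.
    return float(np.trace(np.linalg.solve(G_z + n * eps * np.eye(n), G_y)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 2))
    y = X[:, 0] + 0.1 * rng.standard_normal(200)  # y depends on the first coordinate only
    print("informative direction:", kdr_objective(X, y, np.array([[1.0], [0.0]])))
    print("uninformative direction:", kdr_objective(X, y, np.array([[0.0], [1.0]])))
```

In this toy setting the criterion should be smaller for the direction that carries the signal. A full estimator would minimize it over matrices \(B\) with orthonormal columns, and an FSDR-type method would additionally have to estimate the univariate transformations of the predictors, which this sketch omits.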

### Keywords:

high dimensional regression analysis; univariate transformation; sufficient predictor; reproducing kernel Hilbert space; conditional entropy

H. Guo et al., Electron. J. Stat. 16, No. 1, 1997–2032 (2022; Zbl 1493.62017)
