## Spike and slab variable selection: frequentist and Bayesian strategies

*(English)* Zbl 1068.62079

Summary: Variable selection in the linear regression model takes many apparent faces from both frequentist and Bayesian standpoints. We introduce a variable selection method referred to as a rescaled spike and slab model. We study the importance of prior hierarchical specifications and draw connections to frequentist generalized ridge regression estimation. Specifically, we study the usefulness of continuous bimodal priors to model hypervariance parameters, and the effect scaling has on the posterior mean through its relationship to penalization.
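
The ridge connection mentioned above can be made concrete: under $y \sim N(X\beta, \sigma^2 I)$ with independent priors $\beta_k \sim N(0, v_k)$ on the coefficients, the posterior mean of $\beta$ is exactly a generalized ridge estimator with per-coordinate penalties $\sigma^2/v_k$. The following numpy sketch (not code from the paper; the function name and test values are illustrative) shows how coefficient-specific hypervariances translate into selective penalization:

```python
import numpy as np

def generalized_ridge_posterior_mean(X, y, hypervars, sigma2=1.0):
    """Posterior mean of beta when y ~ N(X beta, sigma2 * I) and
    beta_k ~ N(0, hypervars[k]); algebraically this is the generalized
    ridge estimator (X'X + sigma2 * diag(1/hypervars))^{-1} X'y."""
    penalties = sigma2 / np.asarray(hypervars, dtype=float)
    A = X.T @ X + np.diag(penalties)
    return np.linalg.solve(A, X.T @ y)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ np.array([2.0, 0.0, -1.5]) + 0.1 * rng.standard_normal(50)

# A tiny hypervariance acts like a huge ridge penalty (the "spike"),
# shrinking that coordinate selectively toward zero, while large
# hypervariances (the "slab") leave coordinates close to OLS.
b = generalized_ridge_posterior_mean(X, y, hypervars=[1e3, 1e-4, 1e3])
```

This is the sense in which scaling the hypervariances controls the posterior mean through penalization: one prior hierarchy induces a whole family of ridge-type shrinkage rules.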

Several model selection strategies, some frequentist and some Bayesian in nature, are developed and studied theoretically. We demonstrate the importance of selective shrinkage for effective variable selection in terms of risk misclassification, and show this is achieved using the posterior from a rescaled spike and slab model. We also show how to verify a procedure’s ability to reduce model uncertainty in finite samples using a specialized forward selection strategy. Using this tool, we illustrate the effectiveness of rescaled spike and slab models in reducing model uncertainty.
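
The paper's rescaled model places a continuous bimodal prior on the hypervariances; a simpler and widely used two-point variant (stochastic search variable selection in the style of George and McCulloch, one of the keywords above) conveys the same selective-shrinkage idea. The sketch below is that simpler variant, not the authors' implementation, and all function names and tuning values are illustrative. Each coefficient's hypervariance alternates between a spike and a slab value, and posterior inclusion frequencies are read off the Gibbs chain:

```python
import numpy as np

def norm_pdf(b, var):
    """Density of N(0, var) at b."""
    return np.exp(-0.5 * b * b / var) / np.sqrt(2.0 * np.pi * var)

def spike_slab_gibbs(X, y, n_iter=2000, v_spike=1e-4, v_slab=10.0,
                     sigma2=1.0, w=0.5, seed=0):
    """Toy two-point spike-and-slab Gibbs sampler: each coefficient's
    hypervariance is v_spike (effectively excluded) or v_slab (included),
    with prior inclusion probability w.  Returns posterior inclusion
    frequencies computed over the second half of the chain."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    gamma = np.ones(p, dtype=int)      # inclusion indicators
    incl = np.zeros(p)
    for t in range(n_iter):
        # Draw beta | gamma, y: a generalized-ridge posterior draw.
        v = np.where(gamma == 1, v_slab, v_spike)
        prec = X.T @ X / sigma2 + np.diag(1.0 / v)
        cov = np.linalg.inv(prec)
        mean = cov @ (X.T @ y) / sigma2
        beta = rng.multivariate_normal(mean, cov)
        # Draw gamma_k | beta_k: spike vs slab density of current beta_k.
        for k in range(p):
            p1 = w * norm_pdf(beta[k], v_slab)
            p0 = (1.0 - w) * norm_pdf(beta[k], v_spike)
            gamma[k] = rng.random() < p1 / (p0 + p1)
        if t >= n_iter // 2:
            incl += gamma
    return incl / (n_iter - n_iter // 2)

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 4))
y = X @ np.array([3.0, 0.0, -2.0, 0.0]) + 0.5 * rng.standard_normal(100)
probs = spike_slab_gibbs(X, y)  # high for variables 0 and 2, low for 1 and 3
```

The shrinkage here is selective in exactly the sense the summary describes: coordinates visited mostly in the spike state are shrunk hard toward zero, while slab-state coordinates are left nearly unpenalized, which is what keeps the risk of misclassifying variables low.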

### MSC:

- 62J07 Ridge regression; shrinkage estimators (Lasso)
- 62J05 Linear regression; mixed models
- 62P10 Applications of statistics to biology and medical sciences; meta analysis
- 62F15 Bayesian inference

### Keywords:

generalized ridge regression; model averaging; model uncertainty; ordinary least squares; penalization; rescaling; shrinkage; stochastic variable selection; Zcut; diabetes; hypervariance

\textit{H. Ishwaran} and \textit{J. S. Rao}, Ann. Stat. 33, No. 2, 730--773 (2005; Zbl 1068.62079)
