Bayesian model averaging for linear regression models. (English) Zbl 0888.62026
Summary: We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem involves averaging over all possible models (i.e., combinations of predictors) when making inferences about quantities of interest. This approach is often not practical.
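To make the averaging step concrete, the following is a minimal sketch rather than the authors' procedure. It assumes a uniform prior over models and approximates each model's posterior probability by a weight proportional to exp(-BIC/2); both choices, and the function name `bma_bic`, are illustrative assumptions not taken from the summary. The loop visits all 2^p predictor subsets, which shows why exhaustive averaging becomes impractical as the number of candidate predictors p grows.

```python
import itertools

import numpy as np


def bma_bic(X, y):
    """Enumerate all 2^p predictor subsets and weight each by exp(-BIC/2).

    Assumptions (not from the source): uniform model prior, BIC-based
    approximation to posterior model probabilities.
    """
    n, p = X.shape
    models, bics = [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            # Design matrix: intercept plus the chosen predictor columns.
            design = np.column_stack([np.ones(n), X[:, list(subset)]])
            beta, *_ = np.linalg.lstsq(design, y, rcond=None)
            rss = float(np.sum((y - design @ beta) ** 2))
            # BIC for a Gaussian linear model, up to an additive constant.
            bics.append(n * np.log(rss / n) + design.shape[1] * np.log(n))
            models.append(subset)
    bics = np.array(bics)
    # Subtract the minimum before exponentiating for numerical stability.
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    # Posterior inclusion probability of each predictor: total weight of
    # the models that contain it.
    inclusion = np.zeros(p)
    for subset, w in zip(models, weights):
        inclusion[list(subset)] += w
    return models, weights, inclusion
```

With p candidate predictors the function fits 2^p regressions, so it is only usable for small p; the returned weights can be used to average predictions or coefficients across models.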
We offer two alternative approaches. First, we describe an ad hoc procedure, “Occam’s window”, which indicates a small set of models over which a model average can be computed. Second, we describe a Markov chain Monte Carlo approach that directly approximates the exact solution. In the presence of model uncertainty, both of these model averaging procedures provide better predictive performance than any single model that might reasonably have been selected. In the extreme case where there are many candidate predictors but no relationship between any of them and the response, standard variable selection procedures often choose some subset of variables that yields a high $R^2$ and a highly significant overall $F$ value. In this situation, Occam’s window usually indicates the null model (or a small number of models including the null model) as the only one (or ones) to be considered, thus largely resolving the problem of selecting significant models when there is no signal in the data. Software to implement our methods is available from StatLib.
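The summary does not spell out the selection rules behind Occam's window, so the sketch below is a simplified illustration under stated assumptions, not the authors' algorithm. It assumes two rules as the procedure is commonly described: discard models whose posterior probability is far below that of the best model, and discard a model when one of its strict submodels is more probable. The function name `occams_window` and the cutoff `ratio=20.0` are hypothetical choices made for this example.

```python
def occams_window(post, ratio=20.0):
    """Return the models kept by a simplified Occam's window.

    post:  dict mapping a model (frozenset of predictor indices) to its
           posterior probability.
    ratio: hypothetical cutoff (an assumption, not from the source).
    """
    best = max(post.values())
    # Rule 1: drop models far less probable than the best model.
    kept = {m: p for m, p in post.items() if p >= best / ratio}
    # Rule 2: drop a model when a strict submodel among the retained
    # models has higher posterior probability.
    kept = {
        m: p
        for m, p in kept.items()
        if not any(s < m and p_s > p for s, p_s in kept.items())
    }
    # Renormalise so the retained probabilities sum to one.
    total = sum(kept.values())
    return {m: p / total for m, p in kept.items()}
```

When the null model (the empty set of predictors) is more probable than every larger model, the second rule removes all of the larger models and only the null model remains, which mirrors the no-signal behaviour described above.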