## Regression models for mixed discrete and continuous responses with potentially missing values.
*(English)*
Zbl 0904.62082

Summary: A likelihood-based method for analyzing mixed discrete and continuous regression models is proposed. We focus on marginal regression models, that is, models in which the marginal expectation of the response vector is related to covariates by known link functions. The proposed model is based on an extension of the general location model of I. Olkin and R. F. Tate [Ann. Math. Stat. 32, 448-465 (1961; Zbl 0113.35101)] and can accommodate missing responses. When there are no missing data, our particular choice of parameterization yields maximum likelihood estimates of the marginal mean parameters that are robust to misspecification of the association between the responses. This robustness property does not, in general, hold for the case of incomplete data.
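To make the idea of a marginal parameterization concrete, here is a minimal sketch for one binary and one continuous response. The summary does not state the model's exact form, so everything below is an assumption made for illustration: a logit link for the binary response, an identity link for the continuous one, a single covariate `x`, and a conditional normal distribution for the continuous response whose mean is shifted by `gamma * (y1 - mu1)`. The names `simulate`, `row_loglik`, `beta1`, `beta2`, `gamma` and `sigma` are hypothetical.

```python
import math
import random


def expit(t):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-t))


def normal_logpdf(y, mean, sigma):
    """Log density of a normal distribution with the given mean and sd."""
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (y - mean) ** 2 / (2.0 * sigma ** 2))


def simulate(n, beta1, beta2, gamma, sigma, seed=0):
    """Draw n rows (x, y1, y2) from the assumed mixed-response model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        mu1 = expit(beta1[0] + beta1[1] * x)   # marginal mean of binary y1
        mu2 = beta2[0] + beta2[1] * x          # marginal mean of continuous y2
        y1 = 1 if rng.random() < mu1 else 0
        # Centering y1 at mu1 keeps E[y2 | x] = mu2 for any value of gamma.
        y2 = rng.gauss(mu2 + gamma * (y1 - mu1), sigma)
        rows.append((x, y1, y2))
    return rows


def row_loglik(x, y1, y2, beta1, beta2, gamma, sigma):
    """Observed-data log-likelihood of one row; None marks a missing response."""
    mu1 = expit(beta1[0] + beta1[1] * x)
    mu2 = beta2[0] + beta2[1] * x

    def joint(b):
        # Log of P(y1 = b) times the density of y2 given y1 = b (if y2 is observed).
        lp = math.log(mu1 if b == 1 else 1.0 - mu1)
        if y2 is not None:
            lp += normal_logpdf(y2, mu2 + gamma * (b - mu1), sigma)
        return lp

    if y1 is not None:
        return joint(y1)
    # y1 missing: sum the joint density over both possible values of y1.
    return math.log(math.exp(joint(0)) + math.exp(joint(1)))
```

In this sketch the association parameter `gamma` enters only through the centered term `y1 - mu1`, so the marginal mean of `y2` does not depend on it. Rows with a missing response contribute through the likelihood of the observed part alone, which is the usual likelihood-based treatment when the missingness mechanism is ignorable.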

There are a number of potential benefits of a multivariate approach over separate analyses of the distinct responses. First, a multivariate analysis can exploit the correlation structure of the response vector to address intrinsically multivariate questions. Second, multivariate test statistics allow for control over the inflation of the type I error that results when separate analyses of the distinct responses are performed without accounting for multiple comparisons. Third, it is generally possible to obtain more precise parameter estimates by accounting for the association between the responses. Finally, separate analyses of the distinct responses may be difficult to interpret when there is nonresponse because different sets of individuals contribute to each analysis. Furthermore, separate analyses can introduce bias when the missing responses are missing at random (MAR). A multivariate analysis can circumvent both of these problems. The proposed methods are applied to two biomedical datasets.
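The type I error inflation mentioned above can be illustrated with a short calculation. The sketch assumes the separate tests are independent, which is a simplification (correlated responses give a smaller inflation than this bound suggests); the function names are hypothetical.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false rejection across k independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_level(alpha, k):
    """Per-test level that keeps the familywise rate at or below alpha."""
    return alpha / k
```

For two responses each tested at the 5% level, `familywise_error_rate(0.05, 2)` evaluates to 0.0975, nearly double the nominal level; a single multivariate test avoids this inflation without a per-test correction.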

### MSC:

| Code | Classification |
|---|---|
| 62J12 | Generalized linear models (logistic models) |
| 62P10 | Applications of statistics to biology and medical sciences; meta analysis |
| 62J99 | Linear inference, regression |