Directional penalties for optimal matching in observational studies. (English) Zbl 1448.62080
Summary: Multivariate matching in observational studies tends to view covariate differences symmetrically: a difference in age of 10 years is thought equally problematic whether the treated subject is older or younger than the matched control. If matching is correcting an imbalance in age, such that treated subjects are typically older than controls, then the situation in need of correction is asymmetric: a matched pair with a difference in age of 10 years is much more likely to have an older treated subject and a younger control than the opposite. Correcting the bias may be easier if matching tries to avoid the typical case that creates the bias. We describe several easily used, asymmetric, directional penalties and illustrate how they can improve covariate balance in a matched sample. The investigator starts with a matched sample built in a conventional way, then diagnoses residual covariate imbalances in need of reduction, and achieves the needed reduction by slightly altering the distance matrix with directional penalties, creating a new matched sample. Unlike penalties commonly used in matching, a directional penalty can go too far, reversing the direction of the bias rather than reducing the bias, so the magnitude of the directional penalty matters and may need adjustment. Our experience is that two or three adjustments, guided by balance diagnostics, can substantially improve covariate balance, perhaps requiring fifteen minutes of effort sitting at the computer. We also explore the connection between directional penalties and a widely used technique in integer programming, namely Lagrangian relaxation of problematic linear side constraints in a minimum cost flow problem. In effect, many directional penalties are Lagrange multipliers, pushing a matched sample in the direction of satisfying a linear constraint that would not be satisfied without penalization. The method and example are in an R package DiPs at CRAN.
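
The idea in the summary can be illustrated with a minimal sketch in Python. It is not the DiPs package and makes no claim about its interface. Everything here is an assumption made for illustration: the toy ages, the penalty form `lam * max(0, treated_age - control_age)`, the function names, and the brute-force search that stands in for the minimum cost flow algorithm mentioned in the summary. The sketch matches each treated subject to a distinct control, first with a symmetric absolute-difference distance and then with a directional penalty of increasing size.

```python
from itertools import permutations


def penalized_distance(t_age, c_age, lam):
    """Absolute age difference plus a directional penalty (assumed form).

    The penalty is charged only when the treated subject is older than the
    control, which is the direction of the imbalance in this toy example.
    """
    diff = t_age - c_age
    return abs(diff) + lam * max(0.0, diff)


def optimal_pair_match(treated, controls, lam=0.0):
    """Return the control age matched to each treated subject.

    Brute force over all assignments of distinct controls; this is only
    feasible for tiny examples and replaces a proper network flow solver.
    """
    best_cost, best = float("inf"), None
    for chosen in permutations(range(len(controls)), len(treated)):
        cost = sum(
            penalized_distance(t, controls[j], lam)
            for t, j in zip(treated, chosen)
        )
        if cost < best_cost:
            best_cost, best = cost, chosen
    return [controls[j] for j in best]


def mean_difference(treated, matched):
    """Balance diagnostic: mean treated-minus-control age in matched pairs."""
    return sum(t - c for t, c in zip(treated, matched)) / len(treated)


# Hypothetical ages, chosen so that close controls tend to be younger.
treated = [60, 64, 68]
controls = [52, 57, 59, 63, 66, 71]

results = {}
for lam in (0.0, 0.75, 3.0):
    matched = optimal_pair_match(treated, controls, lam)
    diff = mean_difference(treated, matched)
    results[lam] = (matched, diff)
    print(f"lambda={lam}: matched controls {matched}, mean difference {diff:+.2f}")
```

On this toy data the unpenalized match leaves treated subjects older by about 1.33 years on average, a moderate penalty (0.75) shrinks the mean difference to about -0.33, and a large penalty (3.0) overshoots to about -2.67, reversing the direction of the imbalance. This mirrors the summary's remark that the magnitude of a directional penalty matters and may need adjustment guided by balance diagnostics.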

62H12 Estimation in multivariate analysis
62P10 Applications of statistics to biology and medical sciences; meta analysis
62D20 Causal inference from observational studies