Using mixed integer programming for matching in an observational study of kidney failure after surgery. (English) Zbl 1258.62119
Summary: This article presents a new method for optimal matching in observational studies based on mixed integer programming. Unlike widely used matching methods based on network algorithms, which attempt to achieve covariate balance by minimizing the total sum of distances between treated units and matched controls, this new method achieves covariate balance directly, either by minimizing both the total sum of distances and a weighted sum of specific measures of covariate imbalance, or by minimizing the total sum of distances while constraining the measures of imbalance to be less than or equal to certain tolerances. The inclusion of these extra terms in the objective function or the use of these additional constraints explicitly optimizes or constrains the criteria that will be used to evaluate the quality of the match. For example, the method minimizes or constrains differences in univariate moments, such as means, variances, and skewness; differences in multivariate moments, such as correlations between covariates; differences in quantiles; and differences in statistics, such as the Kolmogorov-Smirnov statistic, to minimize the differences in both location and shape of the empirical distributions of the treated units and matched controls. While balancing several of these measures, it is also possible to impose constraints for exact and near-exact matching, and fine and near-fine balance for more than one nominal covariate, whereas network algorithms can finely or near-finely balance only a single nominal covariate.
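The two variants described in the summary can be written as a single optimization problem. The following is only a sketch: the notation ($T$ for treated units, $C$ for controls, $\delta_{tc}$ for distances, $a_{tc}$ for assignment indicators, $m$ for the number of controls per treated unit, $\omega_i$ for weights, $\varepsilon_i$ for tolerances) is assumed here for illustration and is not taken from the record, and the mean-difference term stands in for any of the imbalance measures listed above.

```latex
% Notation is assumed for illustration, not taken from the paper.
% Variant 1: distances plus a weighted sum of imbalance measures.
\begin{align*}
\min_{a}\quad & \sum_{t \in T} \sum_{c \in C} \delta_{tc}\, a_{tc}
              + \sum_{i} \omega_i\, \mu_i(a) \\
\text{s.t.}\quad & \sum_{c \in C} a_{tc} = m \quad \forall t \in T, \\
& \sum_{t \in T} a_{tc} \le 1 \quad \forall c \in C, \\
& a_{tc} \in \{0, 1\} \quad \forall t \in T,\ c \in C.
\end{align*}
% Example imbalance measure: absolute difference in means of covariate p.
\[
\mu_p(a) = \left| \frac{1}{m\,|T|} \sum_{t \in T} \sum_{c \in C} x_{cp}\, a_{tc}
           - \bar{x}_{T,p} \right|
\]
% Variant 2: drop the weighted term from the objective and add
% the constraints \mu_i(a) \le \varepsilon_i for each measure i.
```

In the constrained variant, each absolute-value condition $\mu_p(a) \le \varepsilon_p$ is equivalent to a pair of linear inequalities in $a$, so the problem remains a linear program with binary variables.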
From a practical standpoint, this method eliminates the guesswork involved in current optimal matching methods, and offers a controlled and systematic way of improving covariate balance by focusing the matching efforts on certain measures of covariate imbalance and their corresponding weights or tolerances. A matched case-control study of acute kidney injury after surgery among Medicare patients illustrates these features in detail. A new R package called mipmatch implements the method.
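To make the constrained variant concrete, here is a minimal Python sketch on toy data. It is not the mipmatch implementation and does not call a mixed integer programming solver: for a problem this small, exhaustive enumeration over assignments finds the same optimum. The function name `constrained_match`, the two-covariate units, and the data values are all hypothetical.

```python
from itertools import permutations

def constrained_match(treated, controls, tol):
    """1:1 matching without replacement by exhaustive enumeration.

    Each unit is a tuple (x1, x2). The distance between a treated unit
    and a control is |x1_t - x1_c|; covariate balance is enforced on x2
    by requiring the absolute difference in means to be <= tol.
    Returns (total_distance, control_indices, imbalance), or None if no
    assignment satisfies the tolerance.
    """
    n_t = len(treated)
    mean_t = sum(t[1] for t in treated) / n_t
    best = None
    # Each permutation assigns control combo[i] to treated unit i.
    for combo in permutations(range(len(controls)), n_t):
        matched = [controls[j] for j in combo]
        imbalance = abs(sum(c[1] for c in matched) / n_t - mean_t)
        if imbalance > tol:
            continue  # violates the balance constraint
        cost = sum(abs(t[0] - c[0]) for t, c in zip(treated, matched))
        if best is None or cost < best[0]:
            best = (cost, combo, imbalance)
    return best

# Hypothetical toy data: (x1, x2) for each unit.
treated = [(1.0, 10.0), (2.0, 20.0)]
controls = [(1.1, 30.0), (2.1, 40.0), (0.5, 12.0), (3.0, 18.0)]

unconstrained = constrained_match(treated, controls, tol=float("inf"))
balanced = constrained_match(treated, controls, tol=1.0)
```

On this toy data, matching on distance alone selects controls 0 and 1, which are closest on x1 but leave a mean difference of 20 on x2. With a tolerance of 1.0 the search selects controls 2 and 3 instead, accepting a larger total distance in exchange for balance on x2. Enumeration scales factorially, which is why realistic problems need an integer programming solver.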

62P10 Applications of statistics to biology and medical sciences; meta analysis
92C50 Medical applications (general)
90C11 Mixed integer programming
62-04 Software, source code, etc. for problems pertaining to statistics
65C60 Computational problems in statistics (MSC2010)
Software: mipmatch; R; Rcplex