Estimating promotion effects in email marketing using a large-scale cross-classified Bayesian joint model for nested imbalanced data. (English) Zbl 07656985

Summary: We consider a large-scale cross-classified nested (CRON) joint model for customer responses to promotion emails: opening, clicking, and purchasing. The logistic regression-based joint model crosses promotion and customer effects and allows estimation of the heterogeneous effects of different promotion emails after adjusting for customer preferences, attributes, and historical behaviors. Using data from an email marketing campaign of an apparel retailer, we show that promotion effects vary not only with the content of the email but also across the three stages of the conversion funnel: open, click, and purchase. We estimate the parameters of the joint model in a Bayesian framework using a block Metropolis-Hastings algorithm that incorporates nested subsampling to tackle the severe imbalance between conversions and nonconversions and uses additive transformation-based modifications of random walk Metropolis to scale estimation to large numbers of customers. We extend the approach to a segmented cross-classified nested (SCRON) joint model that allows promotion effects to vary across customer segments. The resulting high-dimensional model is estimated using spike-and-slab priors on the promotion and customer segment interactions. The nested joint model accounts for the correlations in customer preferences across the conversion funnel. Based on the promotion estimates from the model, we demonstrate how marketers can use price promotions, nonprice promotions, and combinations of the two to increase brand awareness or purchases. Comparing estimates from the CRON and SCRON models, we show the benefits of targeted marketing using email promotion lists that are separately optimized for the different customer segments.
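
To illustrate the "additive transformation-based modification of random walk Metropolis" named in the summary, here is a minimal Python sketch of an additive transformation-based MCMC step in the style of Dutta and Bhattacharya (2014), applied to a toy logistic regression. It is not the authors' algorithm: the blocking, nested subsampling, crossed promotion and customer effects, and spike-and-slab priors are all omitted. The function names, the simulated data, the Gaussian prior, and the tuning values are assumptions made for illustration. The idea shown is that every coordinate moves by the same positive random step size, with an independent random sign per coordinate, so a single scalar draw drives a high-dimensional proposal.

```python
import numpy as np


def log_posterior(beta, X, y, prior_sd=10.0):
    """Log posterior (up to a constant) of a toy logistic regression
    with independent N(0, prior_sd^2) priors; the prior is an assumption."""
    eta = X @ beta
    # Bernoulli log-likelihood, with log(1 + exp(eta)) computed stably.
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    logprior = -0.5 * np.sum((beta / prior_sd) ** 2)
    return loglik + logprior


def additive_tmcmc(X, y, n_iter=1000, scale=0.1, seed=0):
    """Additive transformation-based random walk (illustrative sketch).

    Each iteration draws ONE positive step size shared by all coordinates
    and an independent +/-1 sign per coordinate (probability 1/2 each).
    With equal sign probabilities the move is symmetric, so the acceptance
    ratio reduces to the ratio of posterior densities.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    beta = np.zeros(d)
    current = log_posterior(beta, X, y)
    draws = np.empty((n_iter, d))
    accepted = 0
    for t in range(n_iter):
        eps = abs(rng.normal())                  # shared positive step size
        signs = rng.choice([-1.0, 1.0], size=d)  # independent directions
        proposal = beta + scale * signs * eps
        candidate = log_posterior(proposal, X, y)
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
            accepted += 1
        draws[t] = beta
    return draws, accepted / n_iter


# Simulated data, for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_beta = np.array([1.0, -1.0, 0.5])
prob = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = (rng.uniform(size=200) < prob).astype(float)

draws, rate = additive_tmcmc(X, y, n_iter=1000)
print("posterior mean after burn-in:", draws[500:].mean(axis=0))
print("acceptance rate:", rate)
```

The value of `scale` would need tuning in practice; Dey and Bhattacharya (2019) in the reference list review optimal scaling for this family of samplers.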

MSC:

62Pxx Applications of statistics

References:

[1] ALDIGHIERI, R. (2019). Marketer email tracker 2019. https://dma.org.uk/uploads/misc/marketers-email-tracker-2019.pdf.
[2] ALQUIER, P., FRIEL, N., EVERITT, R. and BOLAND, A. (2016). Noisy Monte Carlo: Convergence of Markov chains with approximate transition kernels. Stat. Comput. 26 29-47. · Zbl 1342.60122 · doi:10.1007/s11222-014-9521-x
[3] ANDERSON, J. A. (1972). Separate sample logistic discrimination. Biometrika 59 19-35. · Zbl 0231.62080 · doi:10.1093/biomet/59.1.19
[4] BAI, R., ROCKOVA, V. and GEORGE, E. I. (2021). Spike-and-slab meets LASSO: A review of the spike-and-slab LASSO. In Handbook of Bayesian Variable Selection 81-108.
[5] BISSIRI, P. G., HOLMES, C. C. and WALKER, S. G. (2016). A general framework for updating belief distributions. J. R. Stat. Soc. Ser. B. Stat. Methodol. 78 1103-1130. · Zbl 1414.62039 · doi:10.1111/rssb.12158
[6] BULT, J. R. and WANSBEEK, T. (1995). Optimal selection for direct mail. Mark. Sci. 14 378-394.
[7] Castillo, I., Schmidt-Hieber, J. and van der Vaart, A. (2015). Bayesian linear regression with sparse priors. Ann. Statist. 43 1986-2018. · Zbl 1486.62197 · doi:10.1214/15-AOS1334
[8] CHAWLA, N. V., JAPKOWICZ, N. and KOTCZ, A. (2004). Special issue on learning from imbalanced data sets. ACM SIGKDD Explor. Newsl. 6 1-6.
[9] DEY, K. K. and BHATTACHARYA, S. (2019). A brief review of optimal scaling of the main MCMC approaches and optimal scaling of additive TMCMC under non-regular cases. Braz. J. Probab. Stat. 33 222-266. · Zbl 1478.60208 · doi:10.1214/17-bjps386
[10] DEY, K. KR. and BHATTACHARYA, S. (2016). On geometric ergodicity of additive and multiplicative transformation-based Markov Chain Monte Carlo in high dimensions. Braz. J. Probab. Stat. 30 570-613. · Zbl 1359.60095 · doi:10.1214/15-BJPS295
[11] Dutta, S. and Bhattacharya, S. (2014). Markov chain Monte Carlo based on deterministic transformations. Stat. Methodol. 16 100-116. · Zbl 1486.62004 · doi:10.1016/j.stamet.2013.08.006
[12] FADER, P. S., HARDIE, B. G. and LEE, K. L. (2005). Counting your customers the easy way: An alternative to the Pareto/NBD model. Mark. Sci. 24 275-284.
[13] FITHIAN, W. and HASTIE, T. (2014). Local case-control sampling: Efficient subsampling in imbalanced data sets. Ann. Statist. 42 1693-1724. · Zbl 1305.62096 · doi:10.1214/14-AOS1220
[14] FORTE, D. (2019). Consumers want more personalized emails, but retailers aren’t delivering. Available at https://multichannelmerchant.com/marketing/consumers-want-personalized-emails-retailers-arent-delivering/.
[15] GAO, K. (2017). Scalable Estimation and Inference for Massive Linear Mixed Models with Crossed Random Effects. ProQuest LLC, Ann Arbor, MI. Thesis (Ph.D.)-Stanford University.
[16] GAO, K. and OWEN, A. (2017). Efficient moment calculations for variance components in large unbalanced crossed random effects models. Electron. J. Stat. 11 1235-1296. · Zbl 1362.62044 · doi:10.1214/17-EJS1236
[17] GEISSER, S. (1993). Predictive Inference: An Introduction. Monographs on Statistics and Applied Probability 55. CRC Press, New York. · doi:10.1007/978-1-4899-4467-2
[18] GOPALAKRISHNAN, A. and PARK, Y.-H. (2021). The impact of coupons on the visit-to-purchase funnel: Theory and empirical evidence. Mark. Sci.
[19] HUANG, Z. and GELMAN, A. (2005). Sampling for Bayesian computation with large datasets. Available at SSRN 1010107.
[20] Ishwaran, H. and Rao, J. S. (2005). Spike and slab variable selection: Frequentist and Bayesian strategies. Ann. Statist. 33 730-773. · Zbl 1068.62079 · doi:10.1214/009053604000001147
[21] JAMES, G., WITTEN, D., HASTIE, T. and TIBSHIRANI, R. (2013). An Introduction to Statistical Learning. Springer Texts in Statistics 103. Springer, New York. With applications in R. · Zbl 1281.62147 · doi:10.1007/978-1-4614-7138-7
[22] JOHNDROW, J. E., SMITH, A., PILLAI, N. and DUNSON, D. B. (2019). MCMC for imbalanced categorical data. J. Amer. Statist. Assoc. 114 1394-1403. · Zbl 1428.62114 · doi:10.1080/01621459.2018.1505626
[23] KAMAKURA, W. A. and KANG, W. (2007). Chain-wide and store-level analysis for cross-category management. J. Retail. 83 159-170.
[24] KAPNER, S. (2017). Retailers’ emails are misfires for many holiday shoppers.
[25] Kleijn, B. J. K. and Van der Vaart, A. W. (2012). The Bernstein-von-Mises theorem under misspecification. Electron. J. Stat. 6 354-381. · Zbl 1274.62203 · doi:10.1214/12-EJS675
[26] KUHN, M. and JOHNSON, K. (2013). Applied Predictive Modeling. Springer, New York. · Zbl 1306.62014 · doi:10.1007/978-1-4614-6849-3
[27] LEE (2019). Digital vs traditional media—where should you invest your marketing dollars? Available at https://hookdpromotions.com/digital-vs-traditional-media-where-should-you-invest-your-marketing-dollars/.
[28] LEONE, C. (2020). How much should you budget for marketing in 2020. Available at https://www.webstrategiesinc.com/blog/how-much-budget-for-misc-marketing-in-2014.
[29] MALSINER-WALLI, G. and WAGNER, H. (2011). Comparing spike and slab priors for Bayesian variable selection. Aust. J. Stat.
[30] MCCAFFREY, D. F., GRIFFIN, B. A., ALMIRALL, D., SLAUGHTER, M. E., RAMCHAND, R. and BURGETTE, L. F. (2013). A tutorial on propensity score estimation for multiple treatments using generalized boosted models. Stat. Med. 32 3388-3414. · doi:10.1002/sim.5753
[31] MCCULLOCH, C. (2008). Joint modelling of mixed outcome types using latent variables. Stat. Methods Med. Res. 17 53-73. · Zbl 1154.62339 · doi:10.1177/0962280207081240
[32] MOORE, D. J. and LEE, S. P. (2012). How advertising influences consumption impulses. J. Advert. 41 107-120.
[33] MUKHOPADHYAY, S., KAR, W. and MUKHERJEE, G. (2023). Supplement to “Estimating promotion effects in email marketing using a large-scale cross-classified Bayesian joint model for nested imbalanced data.” https://doi.org/10.1214/22-AOAS1638SUPP
[34] OSINGA, E. C., LEEFLANG, P. S. and WIERINGA, J. E. (2010). Early marketing matters: A time-varying parameter approach to persistence modeling. J. Mark. Res. 47 173-185.
[35] OWEN, A. B. (2007). Infinitely imbalanced logistic regression. J. Mach. Learn. Res. 8 761-773. · Zbl 1222.62094
[36] PAPASPILIOPOULOS, O., ROBERTS, G. O. and ZANELLA, G. (2020). Scalable inference for crossed random effects models. Biometrika 107 25-40. · Zbl 1435.62283 · doi:10.1093/biomet/asz058
[37] PARK, C. H., PARK, Y.-H. and SCHWEIDEL, D. A. (2018). The effects of mobile promotions on customer purchase dynamics. Int. J. Res. Mark. 35 453-470.
[38] PERRIN, N. (2019). Email marketing 2019: Still a leading touchpoint for marketers and consumers alike. Available at: https://content-na1.emarketer.com/email-marketing-2019.
[39] Polson, N. G., Scott, J. G. and Windle, J. (2013). Bayesian inference for logistic models using Pólya-Gamma latent variables. J. Amer. Statist. Assoc. 108 1339-1349. · Zbl 1283.62055 · doi:10.1080/01621459.2013.829001
[40] Prentice, R. L. and Pyke, R. (1979). Logistic disease incidence models and case-control studies. Biometrika 66 403-411. · Zbl 0428.62078 · doi:10.1093/biomet/66.3.403
[41] RIZOPOULOS, D. (2012). Joint Models for Longitudinal and Time-to-Event Data: With Applications in R. CRC Press, Boca Raton. · Zbl 1284.62032
[42] RIZOPOULOS, D. and GHOSH, P. (2011). A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a time-to-event. Stat. Med. 30 1366-1380. · doi:10.1002/sim.4205
[43] RIZOPOULOS, D. and LESAFFRE, E. (2014). Introduction to the special issue on joint modelling techniques. Stat. Methods Med. Res. 23 3-10. · doi:10.1177/0962280212445800
[44] Ročková, V. and George, E. I. (2018). The spike-and-slab LASSO. J. Amer. Statist. Assoc. 113 431-444. · Zbl 1398.62186 · doi:10.1080/01621459.2016.1260469
[45] ROSSI, P. E., MCCULLOCH, R. E. and ALLENBY, G. M. (1996). The value of purchase history data in target marketing. Mark. Sci. 15 321-340.
[46] RUPPERT, D. (2002). Selecting the number of knots for penalized splines. J. Comput. Graph. Statist. 11 735-757. · doi:10.1198/106186002321018768
[47] SACHS, M., SEN, D., LU, J. and DUNSON, D. (2020). Posterior computation with the Gibbs zig-zag sampler. Preprint. Available at arXiv:2004.04254.
[48] SAHNI, N. S., WHEELER, S. C. and CHINTAGUNTA, P. (2018). Personalization in email marketing: The role of noninformative advertising content. Mark. Sci. 37 236-258.
[49] SAHNI, N. S., ZOU, D. and CHINTAGUNTA, P. K. (2017). Do targeted discount offers serve as advertising? Evidence from 70 field experiments. Manage. Sci. 63 2688-2705.
[50] SCOTT, S. L., BLOCKER, A. W., BONASSI, F. V., CHIPMAN, H. A., GEORGE, E. I. and MCCULLOCH, R. E. (2016). Bayes and big data: The consensus Monte Carlo algorithm. Int. J. Manag. Sci. Eng. Manag. 11 78-88.
[51] SEETHARAMAN, P. B. and CHINTAGUNTA, P. K. (2003). The proportional hazard model for purchase timing: A comparison of alternative specifications. J. Bus. Econom. Statist. 21 368-382. · doi:10.1198/073500103288619025
[52] SEN, D., SACHS, M., LU, J. and DUNSON, D. B. (2020). Efficient posterior sampling for high-dimensional imbalanced logistic regression. Biometrika 107 1005-1012. · Zbl 1457.62221 · doi:10.1093/biomet/asaa035
[53] STATISTA (2019). Dossier on e-mail marketing in the U.S. and worldwide.
[54] WANG, H., ZHU, R. and MA, P. (2018). Optimal subsampling for large sample logistic regression. J. Amer. Statist. Assoc. 113 829-844. · Zbl 1398.62196 · doi:10.1080/01621459.2017.1292914
[55] WU, J., LI, K. J. and LIU, J. S. (2018). Bayesian inference for assessing effects of email marketing campaigns. J. Bus. Econom. Statist. 36 253-266. · doi:10.1080/07350015.2016.1141096
[56] ZHANG, X., KUMAR, V. and COSGUNER, K. (2017). Dynamically managing a profitable email marketing program. J. Mark. Res. 54 851-866.
[57] ZHANG, X., ZHOU, Y., MA, Y., CHEN, B.-C., ZHANG, L. and AGARWAL, D. (2016). Glmix: Generalized linear mixed models for large-scale response prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 363-372.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.