Spline based survival model for credit risk modeling. (English) Zbl 1346.91252

Summary: Survival modeling has been adopted in retail banking because of its ability to analyze censored data. It is an important tool for credit risk scoring, stress testing, and credit asset evaluation. In this paper, we introduce a regression spline based discrete time survival model. The flexibility of the spline function allows us to model the nonlinear and irregular shapes of hazard functions. By incorporating the regression spline into multinomial logistic regression, this approach complements the existing Cox model. From a practical perspective, logistic regression is relatively easy to understand and implement, and its simple parametric form is especially advantageous for predictive scoring. Using a credit card dataset, we demonstrate how to build a cubic regression spline based survival model. We also compare the performance of the spline based discrete time survival model with that of the classical Cox model. Our results show that the spline based survival model provides similar statistical explanatory power and improves prediction accuracy for the attrition model, which has a low event rate.
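The idea of a spline based discrete time hazard can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes a single event type with a binary logit link (the paper uses multinomial logistic regression), a truncated-power cubic spline basis in account age, and hypothetical knots and coefficients chosen only for illustration. The function names `cubic_spline_basis`, `hazard`, `survival_curve`, and `expand_person_period` are likewise made up for this example.

```python
import math


def cubic_spline_basis(t, knots):
    """Truncated-power cubic regression spline basis evaluated at time t."""
    basis = [1.0, t, t ** 2, t ** 3]
    basis += [max(t - k, 0.0) ** 3 for k in knots]
    return basis


def hazard(t, beta, knots):
    """Discrete-time hazard h(t) = P(event at t | survived to t), logit link."""
    eta = sum(b * x for b, x in zip(beta, cubic_spline_basis(t, knots)))
    return 1.0 / (1.0 + math.exp(-eta))


def survival_curve(beta, knots, horizon):
    """S(t) = product over s <= t of (1 - h(s)), for t = 1..horizon."""
    surv, s = [], 1.0
    for t in range(1, horizon + 1):
        s *= 1.0 - hazard(t, beta, knots)
        surv.append(s)
    return surv


def expand_person_period(duration, event):
    """One row (t, y) per observed period; y = 1 only in the event period.

    A censored account (event=False) contributes only y = 0 rows, which is
    how the logistic formulation handles censoring.
    """
    return [(t, int(event and t == duration)) for t in range(1, duration + 1)]


# Hypothetical knots (months on book) and coefficients, for illustration only.
knots = [6.0, 12.0]
beta = [-4.0, 0.3, -0.02, 0.0004, 0.001, -0.001]

curve = survival_curve(beta, knots, horizon=24)
rows = expand_person_period(duration=3, event=True)
```

In practice the coefficients would be estimated by fitting a logistic regression to the person-period rows, with the spline basis columns and any account-level covariates as predictors.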


91G40 Credit risk
91G70 Statistical methods; risk measures