Bayesian structured additive distributional regression with an application to regional income inequality in Germany. (English) Zbl 1454.62485

Ann. Appl. Stat. 9, No. 2, 1024-1052 (2015); correction ibid. 10, No. 2, 1135-1136 (2016).
Summary: We propose a generic Bayesian framework for inference in distributional regression models in which each parameter of a potentially complex response distribution, and not only the mean, is related to a structured additive predictor. The latter is composed additively of a variety of different functional effect types such as nonlinear effects, spatial effects, random coefficients, interaction surfaces or other (possibly nonstandard) basis function representations. To enforce specific properties of the functional effects such as smoothness, informative multivariate Gaussian priors are assigned to the basis function coefficients. Inference can then be based on computationally efficient Markov chain Monte Carlo simulation techniques where a generic procedure makes use of distribution-specific iteratively weighted least squares approximations to the full conditionals. The framework of distributional regression encompasses many special cases relevant for treating nonstandard response structures such as highly skewed nonnegative responses, overdispersed and zero-inflated counts, or shares including the possibility for zero- and one-inflation. We illustrate distributional regression through a study on determinants of labour incomes for full-time working males in Germany, with a particular focus on regional differences after the German reunification. Controlling for age, education, work experience and local disparities, we estimate full conditional income distributions, allowing us to study various distributional quantities such as moments, quantiles or inequality measures in a consistent manner in one joint model. Detailed guidance on practical aspects of model choice, including the selection among several competing distributions for labour incomes and the consideration of different covariate effects on the income distribution, completes the distributional regression analysis.
We find that next to a lower expected income, full-time working men in East Germany also face a more unequal income distribution than men in the West, ceteris paribus.
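
To make the idea of reading several distributional quantities off one fitted model concrete, the following is a minimal sketch rather than the authors' implementation. It assumes, purely for illustration, a log-normal income distribution whose parameters `mu` and `sigma` would each come from their own structured additive predictor in the framework described above. The function name `lognormal_summaries` and all numerical values are hypothetical and are not estimates from the paper.

```python
from math import exp, sqrt
from statistics import NormalDist


def lognormal_summaries(mu: float, sigma: float) -> dict:
    """Mean, median, 10%/90% quantiles and Gini coefficient of LogNormal(mu, sigma^2)."""
    std_normal = NormalDist()  # standard normal, used for quantiles and the Gini formula

    def quantile(p: float) -> float:
        # Quantile function of the log-normal distribution.
        return exp(mu + sigma * std_normal.inv_cdf(p))

    return {
        "mean": exp(mu + 0.5 * sigma ** 2),
        "median": exp(mu),
        "q10": quantile(0.10),
        "q90": quantile(0.90),
        # Closed-form Gini coefficient of the log-normal: 2 * Phi(sigma / sqrt(2)) - 1.
        "gini": 2.0 * std_normal.cdf(sigma / sqrt(2.0)) - 1.0,
    }


# Hypothetical parameter values for two covariate profiles (made-up numbers).
west = lognormal_summaries(mu=10.6, sigma=0.45)
east = lognormal_summaries(mu=10.4, sigma=0.55)

for label, summary in (("West", west), ("East", east)):
    print(label, {name: round(value, 3) for name, value in summary.items()})
```

Because every quantity is derived from the same pair of parameters, the mean, the quantiles and the inequality measure are mutually consistent by construction. In a Bayesian fit, the same function could be applied to each posterior draw of `mu` and `sigma` to obtain posterior summaries of these quantities.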


62P20 Applications of statistics to economics

