zbMATH — the first resource for mathematics

Dependence modelling in ultra high dimensions with vine copulas and the graphical lasso. (English) Zbl 07058816
Summary: To model high dimensional data, Gaussian methods are widely used since they remain tractable and yield parsimonious models by imposing strong assumptions on the data. Vine copulas are more flexible by combining arbitrary marginal distributions and (conditional) bivariate copulas. Yet, this adaptability is accompanied by sharply increasing computational effort as the dimension increases. The proposed approach overcomes this burden and makes the first step into ultra high dimensional non-Gaussian dependence modelling by using a divide-and-conquer approach. First, Gaussian methods are applied to split datasets into feasibly small subsets and second, parsimonious and flexible vine copulas are applied thereon. Finally, these sub-models are reconciled into one joint model. Numerical results demonstrating the feasibility of the novel approach in moderate dimensions are provided. The ability of the approach to estimate ultra high dimensional non-Gaussian dependence models in thousands of dimensions is presented.
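
The splitting step of the divide-and-conquer idea can be illustrated with a minimal sketch. It rests on an assumption that the summary does not confirm as the paper's exact procedure: the covariance thresholding result of references [31] and [46], by which the connected components of a graphical lasso solution at penalty lambda coincide with those of the graph that joins variables i and j whenever the absolute sample covariance exceeds lambda. The function name `split_by_threshold`, the toy correlation matrix and the value of `lam` are hypothetical. Fitting a vine copula on each block and reconciling the sub-models are not shown.

```python
from typing import List, Sequence


def split_by_threshold(corr: Sequence[Sequence[float]], lam: float) -> List[List[int]]:
    """Return the connected components of the graph with an edge i-j
    whenever |corr[i][j]| > lam for i != j (hypothetical helper)."""
    d = len(corr)
    seen = [False] * d
    components: List[List[int]] = []
    for start in range(d):
        if seen[start]:
            continue
        # Depth-first search collects every variable reachable from `start`.
        seen[start] = True
        stack = [start]
        block: List[int] = []
        while stack:
            i = stack.pop()
            block.append(i)
            for j in range(d):
                if not seen[j] and j != i and abs(corr[i][j]) > lam:
                    seen[j] = True
                    stack.append(j)
        components.append(sorted(block))
    return components


# Toy 5 x 5 correlation matrix with made-up entries: variables 0-2 and 3-4
# are strongly related within their group and only weakly across groups.
toy_corr = [
    [1.00, 0.80, 0.70, 0.05, 0.00],
    [0.80, 1.00, 0.60, 0.00, 0.02],
    [0.70, 0.60, 1.00, 0.01, 0.00],
    [0.05, 0.00, 0.01, 1.00, 0.50],
    [0.00, 0.02, 0.00, 0.50, 1.00],
]

blocks = split_by_threshold(toy_corr, lam=0.1)
print(blocks)  # [[0, 1, 2], [3, 4]]
```

Each returned block is small enough to be handled by a separate, lower-dimensional dependence model. A larger `lam` yields more and smaller blocks, a smaller `lam` fewer and larger ones.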

62-XX Statistics
[1] Aas, K., Pair-copula constructions for financial applications: A review, Econometrics, 4, 4, (2016)
[2] Aas, K.; Czado, C.; Frigessi, A.; Bakken, H., Pair-copula constructions of multiple dependence, Insurance Math. Econom., 44, 182-198, (2009) · Zbl 1165.60009
[3] Arnborg, S.; Corneil, D. G.; Proskurowski, A., Complexity of finding embeddings in a \(k\)-tree, SIAM J. Algebr. Discrete Methods, 8, 2, 277-284, (1987) · Zbl 0611.05022
[4] Bedford, T.; Cooke, R., Probability density decomposition for conditionally dependent random variables modeled by vines, Ann. Math. Artif. Intell., 32, 245-268, (2001) · Zbl 1314.62040
[5] Bedford, T.; Cooke, R., Vines - a new graphical model for dependent random variables, Ann. Statist., 30, 4, 1031-1068, (2002) · Zbl 1101.62339
[6] Brechmann, E.; Czado, C.; Aas, K., Truncated regular vines in high dimensions with application to financial data, Canad. J. Statist., 40, 68-85, (2012) · Zbl 1274.62381
[7] Brechmann, E. C.; Schepsmeier, U., Modeling dependence with C- and D-Vine copulas: The R package CDVine, J. Stat. Softw., 52, 3, 1-27, (2013)
[8] Cooke, R. M., Markov and entropy properties of tree- and vine-dependent variables, (Proceedings of the ASA Section of Bayesian Statistical Science, Vol. 27, (1997))
[9] Czado, C.; Jeske, S.; Hofmann, M., Selection strategies for regular vine copulae, J. Soc. Fr. Statist., 154, 174-191, (2013) · Zbl 1316.62030
[10] Dempster, A. P., Covariance selection, Biometrics, 28, 1, 157-175, (1972)
[11] Dißmann, J.; Brechmann, E.; Czado, C.; Kurowicka, D., Selecting and estimating regular vine copulae and application to financial returns, Comput. Statist. Data Anal., 59, 52-69, (2013) · Zbl 1400.62114
[12] Fan, Y.; Tang, C. Y., Tuning parameter selection in high dimensional penalized likelihood, J. R. Stat. Soc. Ser. B Stat. Methodol., 75, 3, 531-552, (2013)
[13] Friedman, J.; Hastie, T.; Tibshirani, R., Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9, 3, 432-441, (2008) · Zbl 1143.62076
[14] Ghalanos, A., 2015. rugarch: Univariate GARCH models. R package version 1.3-6.
[15] Gruber, L., Czado, C., 2015a. Bayesian model selection of regular vine copulas. Preprint. · Zbl 1335.62048
[16] Gruber, L.; Czado, C., Sequential Bayesian model selection of regular vine copulas, Bayesian Anal., 10, 937-963, (2015) · Zbl 1335.62048
[17] Harman, H. H., Modern Factor Analysis, (1976), University of Chicago press · Zbl 0161.39805
[18] Hastie, T.; Tibshirani, R.; Friedman, J., The Elements of Statistical Learning, Springer Series in Statistics, (2001), Springer New York Inc.: New York, NY, USA · Zbl 0973.62007
[19] Hobæk Haff, I.; Aas, K.; Frigessi, A.; Graziani, V. L., Structure learning in Bayesian networks using regular vines, Comput. Statist. Data Anal., 101, 186-208, (2016) · Zbl 1466.62097
[20] Joe, H., Multivariate extreme-value distributions with applications to environmental data, Canad. J. Statist., 22, 1, 47-64, (1994) · Zbl 0804.62052
[21] Joe, H., Families of \(m\)-variate distributions with given margins and \(m(m - 1) / 2\) bivariate dependence parameters, (Rüschendorf, L.; Schweizer, B.; Taylor, M. D., Distributions with Fixed Marginals and Related Topics, (1996), Institute of Mathematical Statistics: Hayward), 120-141
[22] Kovács, E., Szántai, T., 2016. On the connection between cherry-tree copulas and truncated R-vine copulas. arXiv preprint arXiv:1604.03269.
[23] Kraus, D., Czado, C., 2017. Growing simplified vine copula trees: improving Dißmann’s algorithm. arXiv preprint arXiv:1703.05203.
[24] Krumsiek, J.; Suhre, K.; Illig, T.; Adamski, J.; Theis, F. J., Gaussian graphical modeling reconstructs pathway reactions from high-throughput metabolomics data, BMC Syst. Biol., 5, 1, 21, (2011)
[25] Krupskii, P.; Joe, H., Factor copula models for multivariate data, J. Multivariate Anal., 120, 85-101, (2013) · Zbl 1280.62070
[26] Kurowicka, D., Optimal truncation of vines, (Dependence Modeling: Vine Copula Handbook, (2010), World Scientific), 233-247
[27] Kurowicka, D.; Joe, H., Dependence Modeling - Handbook on Vine Copulae, (2011), World Scientific Publishing Co.: Singapore
[28] Lauritzen, S. L., Graphical Models, (1996), Oxford University Press: Oxford, England · Zbl 0907.62001
[29] Liu, H.; Han, F.; Yuan, M.; Lafferty, J.; Wasserman, L., High-dimensional semiparametric Gaussian copula graphical models, Ann. Statist., 40, 4, 2293-2326, (2012) · Zbl 1297.62073
[30] Liu, H.; Roeder, K.; Wasserman, L., Stability approach to regularization selection (StARS) for high dimensional graphical models, (Proceedings of the 23rd International Conference on Neural Information Processing Systems, NIPS’10, (2010), Curran Associates Inc.: USA), 1432-1440
[31] Mazumder, R.; Hastie, T., Exact covariance thresholding into connected components for large scale graphical lasso, J. Mach. Learn. Res., 13, 723-736, (2012) · Zbl 1283.62148
[32] McNeil, A. J.; Frey, R.; Embrechts, P., Quantitative Risk Management: Concepts, Techniques and Tools, (2006), Princeton University Press · Zbl 1089.91037
[33] Meinshausen, N.; Bühlmann, P., High-dimensional graphs and variable selection with the Lasso, Ann. Statist., 34, 3, 1436-1462, (2006) · Zbl 1113.62082
[34] Müller, D.; Czado, C., Representing sparse Gaussian DAGs as sparse R-Vines allowing for non-Gaussian dependence, J. Comput. Graph. Statist., 27, 2, 334-344, (2018)
[35] Müller, D.; Czado, C., Selection of sparse vine copulas in high dimensions with the lasso, Stat. Comput., 29, 269-287, (2019)
[36] Murray, J. S.; Dunson, D. B.; Carin, L.; Lucas, J. E., Bayesian Gaussian copula factor models for mixed data, J. Amer. Statist. Assoc., 108, 502, 656-665, (2013) · Zbl 06195968
[37] Oh, D. H.; Patton, A. J., Modeling dependence in high dimensions with factor copulas, J. Bus. Econom. Statist., 35, 1, 139-154, (2017)
[38] Prim, R. C., Shortest connection networks and some generalizations, Bell Syst. Tech. J., 36, 1389-1401, (1957)
[39] Ryan, J.A., Ulrich, J.M., 2017. quantmod: Quantitative Financial Modelling Framework. R package version 0.4-10.
[40] Schepsmeier, U., Stöber, J., Brechmann, E.C., Graeler, B., Nagler, T., Erhardt, T., 2017. VineCopula: Statistical Inference of Vine Copulas. R package version 2.1.2.
[41] Schwarz, G., Estimating the dimension of a model, Ann. Statist., 6, 2, 461-464, (1978) · Zbl 0379.62005
[42] Sklar, A., Fonctions de répartition à \(n\) dimensions et leurs marges, Publ. Inst. Stat. Univ. Paris, 8, 229-231, (1959) · Zbl 0100.14202
[43] Stöber, J.; Joe, H.; Czado, C., Simplified pair copula constructions - limitations and extensions, J. Multivariate Anal., 119, 101-118, (2013) · Zbl 1277.62139
[44] Tibshirani, R., Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B Stat. Methodol., 58, 267-288, (1996) · Zbl 0850.62538
[45] Toh, H.; Horimoto, K., Inference of a genetic network by a combined approach of cluster analysis and graphical Gaussian modeling, Bioinformatics, 18, 2, 287-297, (2002)
[46] Witten, D. M.; Friedman, J. H.; Simon, N., New insights and faster computations for the graphical lasso, J. Comput. Graph. Statist., 20, 4, 892-900, (2011)
[47] Yu, H.; Uy, W. I.T.; Dauwels, J., Modeling spatial extremes via ensemble-of-trees of pairwise copulas, IEEE Trans. Signal Process., 65, 3, 571-586, (2017) · Zbl 1414.94719
[48] Zhao, T., Li, X., Liu, H., Roeder, K., Lafferty, J., Wasserman, L., 2015. huge: High-Dimensional Undirected Graph Estimation. R package version 1.2.7. · Zbl 1283.68311
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.