zbMATH — the first resource for mathematics

Copula index for detecting dependence and monotonicity between stochastic signals. (English) Zbl 1453.62492
Summary: This paper introduces a nonparametric copula-based index for detecting the strength and monotonicity structure of linear and nonlinear statistical dependence between pairs of random variables or stochastic signals. Our index, termed Copula Index for Detecting Dependence and Monotonicity (CIM), satisfies several desirable properties of measures of association, including most of Rényi’s properties, the data processing inequality (DPI), and consequently self-equitability. Synthetic data simulations reveal that the statistical power of CIM compares favorably to other state-of-the-art measures of association that are proven to satisfy the DPI. Simulation results with real-world data reveal CIM’s unique ability to detect the monotonicity structure among stochastic signals to find interesting dependencies in large datasets. Additionally, simulations show that CIM compares favorably to estimators of mutual information when discovering Markov network structure.
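The idea of detecting monotonicity structure can be illustrated with a small sketch. This is not the authors' CIM estimator: it is a toy illustration assuming Kendall's tau as the underlying rank statistic, and the helper names (`kendall_tau`, `piecewise_tau`) and the hand-supplied split point are hypothetical. It shows that a single rank correlation is close to zero for a non-monotonic (parabolic) dependence, while averaging the absolute rank correlation over monotone regions recovers the full dependence strength.

```python
import random


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    return s / (n * (n - 1) / 2)


def piecewise_tau(x, y, split):
    """Size-weighted average of |tau| over the regions x < split and x >= split.

    Assumption: the split point is given by hand; a real index would have
    to find the monotone regions from the data.
    """
    left = [(a, b) for a, b in zip(x, y) if a < split]
    right = [(a, b) for a, b in zip(x, y) if a >= split]
    total = len(left) + len(right)
    score = 0.0
    for part in (left, right):
        if len(part) < 2:
            continue  # tau is undefined for fewer than two points
        px, py = zip(*part)
        score += len(part) / total * abs(kendall_tau(px, py))
    return score


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(200)]
y = [a * a for a in x]  # noiseless, non-monotonic dependence

tau_global = kendall_tau(x, y)                  # near zero: the branches cancel
tau_piecewise = piecewise_tau(x, y, split=0.0)  # each branch is monotone
print(f"global tau    = {tau_global:+.3f}")
print(f"piecewise tau = {tau_piecewise:.3f}")
```

On each side of the split the parabola is strictly monotone, so the per-region statistic reaches its maximum magnitude, whereas the global statistic stays small because the decreasing and increasing branches offset each other.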
MSC:
62H05 Characterization and structure theory for multivariate probability distributions; copulas
62G30 Order statistics; empirical distribution functions
Software:
ARACNE; ITE; NetBenchmark
References:
[1] Bedford, T.; Cooke, R., Vines-a new graphical model for dependent random variables, Ann. Stat., 30, 4, 1031-1068 (2002) · Zbl 1101.62339
[2] Bellot, P.; Olsen, C.; Salembier, P.; Oliveras-Vergés, A.; Meyer, P., NetBenchmark: a Bioconductor package for reproducible benchmarks of gene regulatory network inference, BMC Bioinform., 16, 1, 312 (2015)
[3] Ben Hassine, M.; Mili, L.; Karra, K., A copula statistic for measuring nonlinear multivariate dependence with application to feature selection in machine learning, Int. Adv. Comput. Sci. Appl., 8, 7 (2017)
[4] Bonett, D.; Wright, T., Sample size requirements for estimating Pearson, Kendall and Spearman correlations, Psychometrika, 65, 1, 23-28 (2000) · Zbl 1291.62195
[5] Chang, Y.; Li, Y.; Ding, A.; Dy, J., A robust-equitable copula dependence measure for feature selection, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (2016)
[6] Cover, T.; Thomas, J., Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing) (2006), Wiley-Interscience
[7] Darbellay, G.; Vajda, I., Estimation of the information by an adaptive partitioning of the observation space, IEEE Trans. Inf. Theory, 45, 4, 1315-1321 (1999) · Zbl 0957.94006
[8] Darsow, W.; Nguyen, B.; Olsen, E., Copulas and Markov processes, Ill. J. Math., 36, 4, 600-642 (1992) · Zbl 0770.60019
[9] Dengler, B., On the Asymptotic Behaviour of the Estimator of Kendall's Tau (2010), T.U. Munich, PhD dissertation
[10] Elidan, G., Copula Bayesian networks, Advances in Neural Information Processing Systems 23 (2010), Curran Associates, Inc.
[12] Genest, C.; Nešlehová, J., A primer on copulas for count data, ASTIN Bull. (2007) · Zbl 1274.62398
[13] Helsel, D.; Hirsch, R., Statistical Methods in Water Resources Techniques of Water Resources Investigations, Book 4, chapter A3 (2002), U.S. Geological Survey
[14] Kandasamy, K.; Krishnamurthy, A.; Poczos, B.; Wasserman, L.; Robins, J., Nonparametric Von Mises Estimators for Entropies, Divergences and Mutual Informations, (Cortes, C.; Lawrence, N. D.; Lee, D. D.; Sugiyama, M.; Garnett, R., Advances in Neural Information Processing Systems 28 (2015), Curran Associates, Inc.), 397-405
[15] Kendall, M., A new measure of rank correlation, Biometrika, 30, 1-2, 81-93 (1938) · Zbl 0019.13001
[16] Kendall, M., The treatment of ties in ranking problems, Biometrika, 33, 3, 239-251 (1945) · Zbl 0063.03216
[17] Kinney, J.; Atwal, G., Equitability, mutual information, and the maximal information coefficient, Proc. Natl. Acad. Sci., 111, 9, 3354-3359 (2014) · Zbl 1359.62213
[18] Kraskov, A.; Stögbauer, H.; Grassberger, P., Estimating mutual information, Phys. Rev. E, 69 (2004)
[19] Liebscher, E., Copula-based dependence measures for piecewise monotonicity, Depend. Model., 5, 1, 198-220 (2017) · Zbl 06839230
[20] Lopez-Paz, D.; Henning, P.; Schölkopf, B., The randomized dependence coefficient, Advances in Neural Information Processing Systems 26 (2013), Curran Associates, Inc.
[21] Madsen, L.; Birkes, D., Simulating dependent discrete data, J. Stat. Comput. Simul., 83, 4, 677-691 (2013) · Zbl 1431.62250
[22] Margolin, A.; Nemenman, I.; Basso, K.; Wiggins, C.; Stolovitzky, G.; Favera, R.; Califano, A., ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context, BMC Bioinform., 7, 1, S7 (2006)
[23] Meyer, P.; Kontos, K.; Lafitte, F.; Bontempi, G., Information-theoretic inference of large transcriptional regulatory networks, EURASIP J. Bioinform. Syst. Biol. (2007)
[24] Nelsen, R., An Introduction to Copulas (2006), Springer-Verlag: New York · Zbl 1152.62030
[25] Nešlehová, J., On rank correlation measures for non-continuous random variables, J. Multivar. Anal., 98, 3, 544-567 (2007) · Zbl 1107.62047
[26] Pearson, K., Note on regression and inheritance in the case of two parents, Proc. R. Soc. Lond., 58, 240-242 (1895)
[27] Peng, H.; Long, F.; Ding, C., Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell., 27, 8, 1226-1238 (2005)
[28] Rényi, A., On measures of dependence, Acta Math. Acad. Sci. Hung., 10, 3, 441-451 (1959) · Zbl 0091.14403
[29] Reshef, D.; Reshef, Y.; Finucane, H.; Grossman, S.; McVean, G.; Turnbaugh, P.; Lander, E.; Mitzenmacher, M.; Sabeti, P., Detecting novel associations in large data sets, Science, 334, 6062, 1518-1524 (2011) · Zbl 1359.62216
[31] Scarsini, M., On measures of concordance, Stochastica, 8, 3, 201-218 (1984) · Zbl 0582.62047
[33] Szabó, Z., Information theoretical estimators toolbox, J. Mach. Learn. Res., 15, 283-287 (2014) · Zbl 1317.68190
[34] Székely, G.; Rizzo, M.; Bakirov, N., Measuring and testing dependence by correlation of distances, Ann. Stat., 35, 6, 2769-2794 (2007) · Zbl 1129.62059
[35] Vandenhende, F.; Lambert, P., Improved rank-based dependence measures for categorical data, Stat. Probab. Lett., 63, 2, 157-163 (2003) · Zbl 1116.62362
[36] Zhu, X.-W.; Liu, S.-S.; Qin, L.-T.; Chen, F.; Liu, H.-L., Modeling non-monotonic dose-response relationships: model evaluation and hormetic quantities exploration, Ecotoxicol. Environ. Saf., 89, 130-136 (2013)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.