Kernel-estimated nonparametric overlap-based syncytial clustering. (English) Zbl 07255153

Summary: Commonly used clustering algorithms usually find ellipsoidal, spherical, or other regular-structured clusters but are challenged when the underlying groups lack formal structure or definition. Syncytial clustering is the name we introduce for methods that merge groups obtained from standard clustering algorithms in order to reveal complex group structure in the data. Here, we develop a distribution-free, fully automated syncytial clustering algorithm that can be used with \(k\)-means and other algorithms. Our approach estimates the cumulative distribution function of the normed residuals from an appropriately fit \(k\)-groups model and calculates the estimated nonparametric overlap between each pair of clusters. Groups with high pairwise overlap are merged as long as the estimated generalized overlap decreases. Our methodology is consistently a top performer at identifying groups with regular and irregular structures in several datasets and can be applied to datasets with scatter or incomplete records. We also use the approach to identify the distinct kinds of gamma-ray bursts in the Burst and Transient Source Experiment 4Br catalog and the distinct kinds of activation in a functional magnetic resonance imaging study.


68T05 Learning and adaptive systems in artificial intelligence
Full Text: arXiv Link
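The merging procedure described in the summary (over-cluster with \(k\)-means, estimate the distribution of normed residuals, compute pairwise cluster overlap, merge heavily overlapping groups) can be sketched compactly. The following is a minimal, hypothetical Python illustration, not the authors' implementation: it substitutes a crude empirical stand-in (an ECDF upper-tail average) for the paper's kernel-estimated nonparametric overlap, and it merges pairs above a fixed threshold rather than tracking the decrease of the generalized overlap. All function names and the threshold value are assumptions for illustration only.

```python
import numpy as np

def kmeans(X, k, iters=100, restarts=5, seed=0):
    """Plain Lloyd's algorithm with random restarts (a stand-in for any
    k-groups fit; the paper allows k-means and other algorithms)."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(restarts):
        centers = X[rng.choice(len(X), k, replace=False)].copy()
        for _ in range(iters):
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            new = np.array([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
            if np.allclose(new, centers):
                break
            centers = new
        wss = (d.min(axis=1) ** 2).sum()  # within-group sum of squares
        if best is None or wss < best[0]:
            best = (wss, labels, centers)
    return best[1], best[2]

def overlap(X, labels, centers, i, j):
    """Crude empirical stand-in for the kernel-estimated nonparametric
    overlap: how plausible cluster j's points look under the ECDF of
    cluster i's normed residuals, symmetrized over (i, j)."""
    def one_way(a, b):
        res_a = np.linalg.norm(X[labels == a] - centers[a], axis=1)
        d_b = np.linalg.norm(X[labels == b] - centers[a], axis=1)
        # mean upper-tail probability 1 - F_a(d) under the ECDF of res_a
        return np.mean([(res_a >= d).mean() for d in d_b])
    return 0.5 * (one_way(i, j) + one_way(j, i))

def syncytial_merge(X, k=4, threshold=0.05, seed=0):
    """Over-cluster, then merge cluster pairs whose estimated overlap
    exceeds `threshold` (a simplification: the paper instead merges
    while the estimated *generalized* overlap decreases)."""
    labels, centers = kmeans(X, k, seed=seed)
    parent = list(range(k))          # union-find parent map
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i in range(k):
        for j in range(i + 1, k):
            if overlap(X, labels, centers, i, j) > threshold:
                parent[find(i)] = find(j)
    merged = np.array([find(l) for l in labels])
    _, merged = np.unique(merged, return_inverse=True)  # relabel 0..G-1
    return merged
```

On two well-separated blobs deliberately over-clustered with \(k = 4\), the sub-clusters within each blob overlap heavily and merge, while the cross-blob overlap is essentially zero, so the sketch recovers the two true groups.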

