zbMATH — the first resource for mathematics

Macros to conduct tests of multimodality in SAS. (English) Zbl 07192717
Summary: The Dip Test of Unimodality and Silverman’s Critical Bandwidth Test are two popular tests to determine if an unknown density contains more than one mode. While the tests can be easily run in R, they are not included in SAS software. We provide implementations of the Dip Test and Silverman Test as macros in the SAS software, capitalizing on the capability of SAS to execute R code internally. Descriptions of the macro parameters, installation steps, and sample macro calls are provided, along with an appendix for troubleshooting. We illustrate the use of the macros on data simulated from one or more Gaussian distributions as well as on the famous iris dataset.
62 Statistics
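As an illustration of the idea behind Silverman's critical bandwidth mentioned in the summary, the following is a minimal Python sketch. It is not the SAS macro or the R code described in the paper. It assumes a Gaussian kernel, counts modes on a finite grid, and finds by bisection the smallest bandwidth at which the kernel density estimate has at most k modes. The function names (`kde_on_grid`, `count_modes`, `critical_bandwidth`) and the grid size are hypothetical choices made for this example, and the bootstrap step that turns the critical bandwidth into a p-value is omitted.

```python
import math
import random


def kde_on_grid(data, h, grid):
    """Gaussian kernel density estimate with bandwidth h at each grid point."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
        for x in grid
    ]


def count_modes(data, h, grid_size=256):
    """Count local maxima of the KDE on an evenly spaced grid.

    The grid extends 3*h beyond the data so the tails are covered.
    Modes closer together than the grid spacing are not resolved.
    """
    lo, hi = min(data) - 3.0 * h, max(data) + 3.0 * h
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    dens = kde_on_grid(data, h, grid)
    return sum(
        1
        for i in range(1, grid_size - 1)
        if dens[i - 1] < dens[i] >= dens[i + 1]
    )


def critical_bandwidth(data, k=1, tol=1e-2):
    """Approximate the smallest h for which the KDE has at most k modes.

    Bisection relies on the fact that, for the Gaussian kernel, the number
    of modes does not increase as h grows.
    """
    spread = max(data) - min(data)
    if spread == 0:
        raise ValueError("data must contain at least two distinct values")
    lo, hi = 0.0, spread
    # Safety net: enlarge the upper end until it has at most k modes.
    while count_modes(data, hi) > k:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_modes(data, mid) <= k:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulated data are reproducible
    one_group = [rng.gauss(0.0, 1.0) for _ in range(120)]
    two_groups = [rng.gauss(0.0, 1.0) for _ in range(60)] + [
        rng.gauss(6.0, 1.0) for _ in range(60)
    ]
    print("critical bandwidth, one Gaussian: ", critical_bandwidth(one_group))
    print("critical bandwidth, two Gaussians:", critical_bandwidth(two_groups))
```

A large critical bandwidth for k = 1 means heavy smoothing is needed before the estimate becomes unimodal, which is the evidence against unimodality that Silverman's test then calibrates by resampling.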