
zbMATH — the first resource for mathematics

Co-eye: a multi-resolution ensemble classifier for symbolically approximated time series. (English) Zbl 07289219
Summary: Time series classification (TSC) is a challenging task that has attracted many researchers in recent years. One main challenge in TSC is the diversity of domains from which time series data come; thus, there is no "one model that fits all" in TSC. Some algorithms are very accurate at classifying a specific type of time series when the whole series is considered, while others target only the existence or non-existence of specific patterns/shapelets, and yet other techniques focus on the frequency of occurrence of discriminating patterns/features. This paper presents a new classification technique that addresses this inherent diversity problem in TSC using a nature-inspired method. The technique is inspired by how flies look at the world through "compound eyes" made up of thousands of lenses, called ommatidia. Each ommatidium is an eye with its own lens, and thousands of them together create a broad field of vision. The proposed technique similarly uses different lenses and representations to look at the time series, and then combines them for broader visibility. These lenses are created through hyper-parameterisation of symbolic representations (Piecewise Aggregate and Fourier approximations). The algorithm builds a random forest for each lens, then performs soft dynamic voting to classify new instances using the most confident eyes, i.e., forests. We evaluate the new technique, coined Co-eye, on the recently released extended version of the UCR archive, containing more than 100 datasets across a wide range of domains. The results show the benefit of bringing together different perspectives, reflected in the accuracy and robustness of Co-eye in comparison to other state-of-the-art techniques.
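The pipeline the summary describes can be sketched in a few lines: symbolise each series with Piecewise Aggregate Approximation under several hyper-parameter settings (the "lenses"), train one random forest per lens, and at prediction time let the most confident forest decide each instance. This is a minimal illustration of the idea, not the authors' implementation; the class name `CoEyeSketch`, the lens parameterisation, and the quantile-based discretisation are simplifying assumptions (the paper also uses Fourier-based symbols, omitted here).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def paa(x, n_segments):
    """Piecewise Aggregate Approximation: mean of equal-width segments."""
    return np.array([seg.mean() for seg in np.array_split(x, n_segments)])

class CoEyeSketch:
    """Toy multi-lens ensemble in the spirit of Co-eye (illustrative only)."""

    def __init__(self, lenses):
        self.lenses = lenses  # list of (n_segments, alphabet_size) pairs

    def fit(self, X, y):
        self.forests, self.edges = [], []
        for n_segments, alphabet_size in self.lenses:
            reduced = np.array([paa(x, n_segments) for x in X])
            # quantile bin edges learned on training data -> symbolic words
            edges = np.quantile(reduced,
                                np.linspace(0, 1, alphabet_size + 1)[1:-1])
            rf = RandomForestClassifier(n_estimators=50, random_state=0)
            rf.fit(np.digitize(reduced, edges), y)
            self.forests.append(rf)
            self.edges.append(edges)
        self.classes_ = self.forests[0].classes_
        return self

    def predict(self, X):
        # probas: (n_lenses, n_samples, n_classes)
        probas = np.stack([
            rf.predict_proba(np.digitize(
                np.array([paa(x, w) for x in X]), e))
            for rf, (w, _), e in zip(self.forests, self.lenses, self.edges)
        ])
        # soft dynamic voting: per instance, pick the most confident forest
        best_lens = probas.max(axis=2).argmax(axis=0)
        chosen = probas[best_lens, np.arange(len(X))]
        return self.classes_[chosen.argmax(axis=1)]
```

A single lens already fixes one resolution; combining coarse lenses (few long segments) with fine ones (many short segments) is what gives the ensemble its "broad field of vision".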
MSC:
68T05 Learning and adaptive systems in artificial intelligence
Software:
SFA; DIRECT; SMOTE
References:
[1] Bagnall, A., Lines, J., Vickers, W. V., & Keogh, E. The UEA and UCR time series classification repository. www.timeseriesclassification.com.
[2] Bagnall, A.; Lines, J.; Bostrom, A.; Large, J.; Keogh, E., The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances, Data Mining and Knowledge Discovery, 31, 3, 606-660 (2017)
[3] Bagnall, A.; Lines, J.; Hills, J.; Bostrom, A., Time-series classification with COTE: The collective of transformation-based ensembles, IEEE Transactions on Knowledge and Data Engineering, 27, 9, 2522-2535 (2015)
[4] Baydogan, MG; Runger, G.; Tuv, E., A bag-of-features framework to classify time series, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35, 11, 2796-2802 (2013)
[5] Bergstra, J.; Bengio, Y., Random search for hyper-parameter optimization, Journal of Machine Learning Research, 13, Feb, 281-305 (2012) · Zbl 1283.68282
[6] Chawla, N. V. (2009). Data mining for imbalanced datasets: An overview. In Data mining and knowledge discovery handbook (pp. 875-886). Berlin: Springer.
[7] Chawla, NV; Bowyer, KW; Hall, LO; Kegelmeyer, WP, Smote: Synthetic minority over-sampling technique, Journal of Artificial Intelligence Research, 16, 321-357 (2002) · Zbl 0994.68128
[8] Dau, H. A., Keogh, E., Kamgar, K., Yeh, C. C. M., Zhu, Y., Gharghabi, S., et al. (2018). The UCR time series classification archive. https://www.cs.ucr.edu/~eamonn/time_series_data_2018/.
[9] Demšar, J., Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, Jan, 1-30 (2006) · Zbl 1222.68184
[10] Deng, H.; Runger, G.; Tuv, E.; Vladimir, M., A time series forest for classification and feature extraction, Information Sciences, 239, 142-153 (2013) · Zbl 1321.62068
[11] Fawaz, HI; Forestier, G.; Weber, J.; Idoumghar, L.; Muller, PA, Deep learning for time series classification: A review, Data Mining and Knowledge Discovery, 33, 4, 917-963 (2019)
[12] Finkel, DE, Direct optimization algorithm user guide, Center for Research in Scientific Computation, North Carolina State University, 2, 1-14 (2003)
[13] Grabocka, J., Schilling, N., Wistuba, M., & Schmidt-Thieme, L. (2014). Learning time-series shapelets. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, pp. 392-401. ACM, New York. 10.1145/2623330.2623613.
[14] Ho, T. K. (1995). Random decision forests. In Proceedings of 3rd international conference on document analysis and recognition (vol. 1, pp. 278-282). 10.1109/ICDAR.1995.598994.
[15] Holland, JK; Kemsley, EK; Wilson, RH, Use of Fourier transform infrared spectroscopy and partial least squares regression for the detection of adulteration of strawberry purées, Journal of the Science of Food and Agriculture, 76, 2, 263-269 (1998)
[16] Keogh, E.; Chakrabarti, K.; Pazzani, M.; Mehrotra, S., Dimensionality reduction for fast similarity search in large time series databases, Knowledge and Information Systems, 3, 3, 263-286 (2001) · Zbl 0989.68039
[17] Keogh, E. J., & Pazzani, M. J. (2000). A simple dimensionality reduction technique for fast similarity search in large time series databases. In Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications, PADKK ’00 (pp. 122-133). Springer-Verlag: London.
[18] Li, S., Li, Y., & Fu, Y. (2016). Multi-view time series classification: A discriminative bilinear projection approach. In Proceedings of the 25th ACM international on conference on information and knowledge management pp. (989-998). ACM.
[19] Lin, J.; Khade, R.; Li, Y., Rotation-invariant similarity in time series using bag-of-patterns representation, Journal of Intelligent Information Systems, 39, 2, 287-315 (2012)
[20] Lines, J.; Bagnall, A., Time series classification with ensembles of elastic distance measures, Data Mining and Knowledge Discovery, 29, 3, 565-592 (2015) · Zbl 1405.68295
[21] Lines, J., Taylor, S., & Bagnall, A. (2016). HIVE-COTE: The hierarchical vote collective of transformation-based ensembles for time series classification. In 2016 IEEE 16th international conference on data mining (ICDM) (pp. 1041-1046). IEEE. 10.1109/ICDM.2016.0133.
[22] Patel, P., Keogh, E., Lin, J., & Lonardi, S. (2002). Mining motifs in massive time series databases. In 2002 IEEE international conference on data mining, 2002. Proceedings (pp. 370-377). 10.1109/ICDM.2002.1183925.
[23] Rodriguez, JJ; Kuncheva, LI; Alonso, CJ, Rotation forest: A new classifier ensemble method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 10, 1619-1630 (2006)
[24] Schäfer, P., The BOSS is concerned with time series classification in the presence of noise, Data Mining and Knowledge Discovery, 29, 6, 1505-1530 (2015) · Zbl 1405.68305
[25] Schäfer, P., & Högqvist, M. (2012). SFA: A symbolic Fourier approximation and index for similarity search in high dimensional datasets. In Proceedings of the 15th international conference on extending database technology (pp. 516-527). ACM.
[26] Senin, P., & Malinchik, S. (2013). SAX-VSM: Interpretable time series classification using SAX and vector space model. In 2013 IEEE 13th international conference on data mining (pp. 1175-1180). 10.1109/ICDM.2013.52.
[27] Silva, I., Behar, J., Sameni, R., Zhu, T., Oster, J., Clifford, G.D., et al. (2013). Noninvasive fetal ECG: The physionet/computing in cardiology challenge 2013. In Computing in cardiology 2013 (pp. 149-152). IEEE.
[28] Woźniak, M.; Graña, M.; Corchado, E., A survey of multiple classifier systems as hybrid systems, Information Fusion, 16, 3-17 (2014)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.