
zbMATH — the first resource for mathematics

Benchmark and survey of automated machine learning frameworks. (English) Zbl 07328086
Summary: Machine learning (ML) has become a vital part of many aspects of our daily life. However, building well-performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to build machine learning applications automatically, without extensive knowledge of statistics and machine learning. This paper combines a survey of current AutoML methods with a benchmark of popular AutoML frameworks on real data sets. Driven by the frameworks selected for evaluation, we summarize and review important AutoML techniques and methods for every step of building an ML pipeline. The selected AutoML frameworks are evaluated on 137 data sets from established AutoML benchmark suites.
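For illustration, the core problem these frameworks automate, the combined selection of preprocessing steps, a learning algorithm, and its hyperparameters (often called the CASH problem), can be sketched in plain scikit-learn. The following minimal sketch is not taken from the paper; the data set, candidate algorithms, hyperparameter ranges, and search budget are illustrative assumptions:

# Minimal sketch of joint pipeline and hyperparameter search (illustrative
# assumptions only; not the benchmark setup used in the paper).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One fixed preprocessing step; the classifier slot is filled by the search.
pipe = Pipeline([("scale", StandardScaler()), ("clf", SVC())])

# Two candidate model families, each with its own hyperparameter ranges.
# Recent scikit-learn versions accept a list of parameter dictionaries here.
search_space = [
    {"clf": [SVC()], "clf__C": [0.1, 1, 10, 100], "clf__gamma": ["scale", "auto"]},
    {"clf": [RandomForestClassifier()], "clf__n_estimators": [50, 100, 200],
     "clf__max_depth": [None, 5, 10]},
]

# Plain random search over the combined space (Bergstra & Bengio, 2012).
search = RandomizedSearchCV(pipe, search_space, n_iter=10, cv=5, random_state=0)
search.fit(X_train, y_train)
print(search.best_params_)
print("held-out accuracy:", search.score(X_test, y_test))

The frameworks surveyed in the paper replace such naive random search with more sample-efficient strategies, for example Bayesian optimization, evolutionary algorithms, or bandit-based methods, and extend the search space to complete multi-step pipelines.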
MSC:
68T Artificial intelligence
References:
[1] Alaa, A. M., & Van Der Schaar, M. (2018). AutoPrognosis: Automated Clinical Prognostic Modeling via Bayesian Optimization with Structured Kernel Learning.International Conference on Machine Learning,1, 139-148.
[2] Ali, S., & Smith-Miles, K. A. (2006). A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing,70(1-3), 173-186.
[3] Anderson, R. L. (1953). Recent Advances in Finding Best Operating Conditions.Journal of the American Statistical Association,48(264), 789-798.
[4] Ayria, P. (2018). A complete Machine Learning PipeLine. Available at https://www.kaggle.com/pouryaayria/a-complete-ml-pipeline-tutorial-acu-86.
[5] Baidu (2018). EZDL. Available at http://ai.baidu.com/ezdl/.
[6] Balaji, A., & Allen, A. (2018). Benchmarking Automatic Machine Learning Frameworks. arXiv preprint arXiv:1808.06492.
[7] Banzhaf, W., Nordin, P., Keller, R. E., & Francone, F. D. (1997).Genetic Programming: An Introduction. Morgan Kaufmann. · Zbl 0893.68117
[8] Belotti, P., Kirches, C., Leyffer, S., Linderoth, J., Luedtke, J., & Mahajan, A. (2013). Mixed-integer nonlinear optimization.Acta Numerica,22, 1-131. · Zbl 1291.65172
[9] Bengio, Y., Courville, A., & Vincent, P. (2013). Representation Learning: A Review and New Perspectives.IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828.
[10] Bergstra, J., Bardenet, R., Bengio, Y., & Kégl, B. (2011). Algorithms for Hyper-Parameter Optimization. In International Conference on Neural Information Processing Systems, pp. 2546-2554.
[11] Bergstra, J., & Bengio, Y. (2012). Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research,13, 281-305. · Zbl 1283.68282
[12] Bergstra, J., Yamins, D., & Cox, D. D. (2013). Hyperopt: A python library for optimizing the hyperparameters of machine learning algorithms. InPython in Science Conference, pp. 13-20.
[13] Bilalli, B., Abelló, A., & Aluja-Banet, T. (2017). On the Predictive Power of Meta-Features in OpenML. International Journal of Applied Mathematics and Computer Science, 27(4), 697-712.
[14] Bischl, B., Casalicchio, G., Feurer, M., Hutter, F., Lang, M., Mantovani, R. G., van Rijn, J. N., & Vanschoren, J. (2017). OpenML Benchmarking Suites and the OpenML100. arXiv preprint arXiv:1708.03731v1.
[15] Bischl, B., Casalicchio, G., Feurer, M., Hutter, F., Lang, M., Mantovani, R. G., van Rijn, J. N., & Vanschoren, J. (2019).OpenML Benchmarking Suites.arXiv preprint arXiv:1708.03731v2. arXiv:1708.03731.
[16] Bottou, L. (2012). Stochastic Gradient Descent Tricks. InNeural Networks, Tricks of the Trade, Reloaded, pp. 430-445. Springer.
[17] Breiman, L. (2001). Random Forests.Machine Learning,45(1), 5-32. · Zbl 1007.68152
[18] Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and Regression Trees. Chapman and Hall. · Zbl 0541.62042
[19] Brochu, E., Cora, V. M., & de Freitas, N. (2010). A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning.arXiv preprint arXiv:1012.2599.
[20] Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P. I., Rohlfshagen, P., Tavener, S., Perez, D., Samothrakis, S., & Colton, S. (2012). A Survey of Monte Carlo Tree Search Methods. IEEE Transactions on Computational Intelligence and AI in Games,4(1), 1-49.
[21] Buyya, R. (1999).High Performance Cluster Computing: Architectures and Systems, Vol. 1. Prentice Hall.
[22] Chan, T. (2017). Advisor. Available at https://github.com/tobegit3hub/advisor.
[23] Chapelle, O., Vapnik, V., Bousquet, O., & Mukherjee, S. (2002). Choosing Multiple Parameters for Support Vector Machines.Machine Learning,46, 131-159. · Zbl 0998.68101
[24] Chen, B., Wu, H., Mo, W., Chattopadhyay, I., & Lipson, H. (2018). Autostacker: A Compositional Evolutionary Learning System. InGenetic and Evolutionary Computation Conference, pp. 402-409.
[25] Chen, P.-W., Wang, J.-Y., & Lee, H.-M. (2004). Model selection of SVMs using GA approach. InIEEE International Joint Conference on Neural Networks.
[26] Chu, X., Ilyas, I. F., Krishnan, S., & Wang, J. (2016).Data Cleaning: Overview and Emerging Challenges. InInternational Conference on Management of Data, pp. 2201- 2206.
[27] Chu, X., Morcos, J., Ilyas, I. F., Ouzzani, M., Papotti, P., Tang, N., & Ye, Y. (2015). KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing. InACM International Conference on Management of Data, pp. 1247-1261.
[28] Claesen, M., Simm, J., Popovic, D., Moreau, Y., & De Moor, B. (2014). Easy Hyperparameter Search Using Optunity.arXiv preprint arXiv: 1412.1114.
[29] Clouder, A. (2018). Shortening Machine Learning Development Cycle with AutoML. Available at https://www.alibabacloud.com/blog/shortening-machine-learning-development-cycle-with-automl_594232.
[30] Coello, C. A. C., Lamont, G. B., & Van Veldhuizen, D. A. (2007).Evolutionary Algorithms for Solving Multi-Objective Problems, Vol. 5. Springer. · Zbl 1142.90029
[31] Das, P., Ivkin, N., Bansal, T., Rouesnel, L., Gautier, P., Karnin, Z., Dirac, L., Ramakrishnan, L., Perunicic, A., Shcherbatyi, I., Wu, W., Zolic, A., Shen, H., Ahmed, A., Winkelmolen, F., Miladinovic, M., Archambeau, C., Tang, A., Dutt, B., Grao, P., & Venkateswar, K. (2020). Amazon SageMaker Autopilot: a white box AutoML solution at scale. In Data Management for End-to-End Machine Learning, pp. 1-7.
[32] das Dôres, S. C. N., Soares, C., & Ruiz, D. (2018). Bandit-Based Automated Machine Learning. In Brazilian Conference on Intelligent Systems.
[33] Dash, M., & Liu, H. (1997). Feature Selection for Classification.Intelligent Data Analysis, 1, 131-156.
[34] De Miranda, P. B., Prudêncio, R. B., De Carvalho, A. C. P., & Soares, C. (2012). An Experimental Study of the Combination of Meta-Learning with Particle Swarm Algorithms for SVM Parameter Selection. International Conference on Computational Science and Its Applications, pp. 562-575.
[35] de Sá, A. G. C., Pinto, W. J. G. S., Oliveira, L. O. V. B., & Pappa, G. L. (2017). RECIPE: A Grammar-Based Framework for Automatically Evolving Classification Pipelines. In European Conference on Genetic Programming, Vol. 10196, pp. 246-261.
[36] Dean, J., & Ghemawat, S. (2008). MapReduce: Simplified Data Processing on Large Clusters.Communications of the ACM,51(1), 107-113.
[37] Desautels, T., Krause, A., & Burdick, J. W. (2014). Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization.Journal of Machine Learning Research,15, 4053-4103. · Zbl 1312.62036
[38] Dinsmore, T. (2016). Automated Machine Learning: A Short History. Available at https://blog.datarobot.com/automated-machine-learning-short-history.
[39] Domhan, T., Springenberg, J. T., & Hutter, F. (2015). Speeding up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves. International Joint Conference on Artificial Intelligence, pp. 3460-3468.
[40] Dor, O., & Reich, Y. (2012). Strengthening learning algorithms by feature discovery.Information Sciences,189, 176-190.
[41] Doshi-Velez, F., & Kim, B. (2017). Towards A Rigorous Science of Interpretable Machine Learning.arXiv preprint arXiv:1702.08608.
[42] Drori, I., Krishnamurthy, Y., de Paula Lourenco, R., Rampin, R., Kyunghyun, C., Silva, C., & Freire, J. (2019). Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar. InInternational Conference on Machine Learning AutoML Workshop.
[43] Drori, I., Krishnamurthy, Y., Rampin, R., Lourenco, R. d. P., Ono, J. P., Cho, K., Silva, C., & Freire, J. (2018). AlphaD3M : Machine Learning Pipeline Synthesis. InInternational Conference on Machine Learning AutoML Workshop.
[44] Eduardo, S., & Sutton, C. (2016). Data Cleaning using Probabilistic Models of Integrity Constraints. InNeural Information Processing Systems.
[45] Efimova, V., Filchenkov, A., & Shalamov, V. (2017). Fast Automated Selection of Learning Algorithm And its Hyperparameters by Reinforcement Learning. InInternational Conference on Machine Learning AutoML Workshop.
[46] Eggensperger, K., Feurer, M., Hutter, F., Bergstra, J., Snoek, J., Hoos, H., & Leyton-Brown, K. (2013). Towards an Empirical Foundation for Assessing Bayesian Optimization of Hyperparameters. InNIPS Workshop on Bayesian Optimization in Theory and Practice.
[47] Eggensperger, K., Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2015). Efficient Benchmarking of Hyperparameter Optimizers via Surrogates. InAAAI Conference on Artificial Intelligence, pp. 1114-1120.
[48] Eggensperger, K., Lindauer, M. T., Hoos, H. H., Hutter, F., & Leyton-Brown, K. (2018). Efficient Benchmarking of Algorithm Configuration Procedures via Model-Based Surrogates.Machine Learning,107, 15-41. · Zbl 1457.68341
[49] Elshawi, R., Maher, M., & Sakr, S. (2019). Automated Machine Learning: State-of-The-Art and Open Challenges.arXiv preprint arXiv:1906.02287.
[50] Escalante, H. J., Montes, M., & Luis, V. (2009). Particle Swarm Model Selection for Authorship Verification. Iberoamerican Congress on Pattern Recognition, pp. 563-570.
[51] Fabris, F., & Freitas, A. A. (2019). Analysing the Overfit of the auto-sklearn Automated Machine Learning Tool. InMachine Learning, Optimization, and Data Science, Vol. 11943, pp. 508-520. Springer International Publishing.
[52] Falkner, S., Klein, A., & Hutter, F. (2018). BOHB: Robust and Efficient Hyperparameter Optimization at Scale. InInternational Conference on Machine Learning, pp. 1437- 1446.
[53] Fernández-Godino, M. G., Park, C., Kim, N.-H., & Haftka, R. T. (2016). Review of multifidelity models. arXiv preprint arXiv:1609.07196.
[54] Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M., & Hutter, F. (2018). Practical Automated Machine Learning for the AutoML Challenge 2018.International Conference on Machine Learning AutoML Workshop.
[55] Feurer, M., & Hutter, F. (2018). Towards Further Automation in AutoML. InInternational Conference on Machine Learning AutoML Workshop.
[56] Feurer, M., Klein, A., Eggensperger, K., Springenberg, J. T., Blum, M., & Hutter, F. (2015a). Efficient and Robust Automated Machine Learning. In International Conference on Neural Information Processing Systems, pp. 2755-2763.
[57] Feurer, M., Springenberg, J. T., & Hutter, F. (2015b). Initializing Bayesian Hyperparameter Optimization via Meta-Learning.National Conference on Artificial Intelligence, pp. 1128-1135.
[58] Frazier, P. I. (2018).A Tutorial on Bayesian Optimization.arXiv preprint arXiv: 1807.02811, pp. 1-22.
[59] Friedman, L., & Markovitch, S. (2015). Recursive Feature Generation for Knowledge-based Learning.Journal of Artificial Intelligence Research,1, 3-17.
[60] Fukunaga, K., & Hostetler, L. D. (1975).The estimation of the gradient of a density function, with applications in pattern recognition.IEEE Transactions on Information Theory,21(1), 32-40. · Zbl 0297.62025
[61] Galhardas, H., Florescu, D., Shasha, D., & Simon, E. (2000). AJAX:An Extensible Data Cleaning Tool. InInternational Conference on Management of Data, pp. 590-596.
[62] Gama, J., & Brazdil, P. (2000). Characterization of Classification Algorithms. InPortuguese Conference on Artificial Intelligence.
[63] Garrido-Merchán, E. C., & Hernández-Lobato, D. (2018). Dealing with Integer-valued Variables in Bayesian Optimization with Gaussian Processes. In International Conference on Machine Learning AutoML Workshop, pp. 1-18.
[64] Gaudel, R., & Sebag, M. (2010). Feature Selection as a One-Player Game. InInternational Conference on Machine Learning, pp. 359-366.
[65] Geiger, A., Lenz, P., & Urtasun, R. (2012). Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. InConference on Computer Vision and Pattern Recognition.
[66] Ghallab, M., Nau, D., & Traverso, P. (2004). Automated Planning: Theory & Practice. Morgan Kaufmann Publishers, Inc. · Zbl 1074.68613
[67] Gijsbers, P., LeDell, E., Thomas, J., Poirier, S., Bischl, B., & Vanschoren, J. (2019). An Open Source AutoML Benchmark. InInternational Conference on Machine Learning AutoML Workshop.
[68] Gil, Y., Honaker, J., Gupta, S., Ma, Y., Orazio, V. D., Garijo, D., Gadewar, S., Yang, Q., & Jahanshad, N. (2019). Towards Human-Guided Machine Learning. InInternational Conference on Intelligent User Interfaces.
[69] Gil, Y., Yao, K.-T., Ratnakar, V., Garijo, D., Steeg, G. V., Szekely, P., Brekelmans, R., Kejriwal, M., Luo, F., & Huang, I.-H. (2018). P4ML: A Phased Performance-Based Pipeline Planner for Automated Machine Learning. InInternational Conference on Machine Learning AutoML Workshop, pp. 1-8.
[70] Ginsbourger, D., Janusevskis, J., & Le Riche, R. (2010a). Dealing with asynchronicity in parallel Gaussian process based global optimization. Available at https://hal.archives-ouvertes.fr/hal-00507632.
[71] Ginsbourger, D., Le Riche, R., & Carraro, L. (2010b). Kriging Is Well-Suited to Parallelize Optimization. InComputational Intelligence in Expensive Optimization Problems, pp. 131-162. Springer Berlin Heidelberg.
[72] Golovin, D., Solnik, B., Moitra, S., Kochanski, G., Karro, J., & Sculley, D. (2017). Google Vizier: A Service for Black-Box Optimization. InACM International Conference on Knowledge Discovery and Data Mining, pp. 1487-1495.
[73] Gomes, T. A., Prudêncio, R. B., Soares, C., Rossi, A. L., & Carvalho, A. (2012). Combining Meta-Learning and Search Techniques to Select Parameters for Support Vector Machines. Neurocomputing,75(1), 3-13.
[74] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Representation Learning. InDeep Learning, chap. 15. MIT Press. · Zbl 1373.68009
[75] Google LLC (2019). AI Explanations Whitepaper. Tech. rep., Google LLC.
[76] Gower, J. C. (1971). A General Coefficient of Similarity and Some of Its Properties.Biometrics,27(4), 857-871.
[77] Gustafson, L. (2018). Bayesian Tuning and Bandits: An Extensible, Open Source Library for AutoML. Ph.D. thesis, Massachusetts Institute of Technology.
[78] Guyon, I., Bennett, K., Cawley, G., Escalante, H. J., Escalera, S., Ho, T. K., Macià, N., Ray, B., Saeed, M., Statnikov, A., & Viegas, E. (2015). Design of the 2015 ChaLearn AutoML Challenge. International Joint Conference on Neural Networks, pp. 1-8.
[79] Guyon, I., Chaabane, I., Escalante, H. J., Escalera, S., Jajetic, D., Lloyd, J. R., Macià, N., Ray, B., Romaszko, L., Sebag, M., Statnikov, A., Treguer, S., & Viegas, E. (2016). A brief Review of the ChaLearn AutoML Challenge: Any-time Any-dataset Learning without Human Intervention. In International Conference on Machine Learning AutoML Workshop, pp. 21-30.
[80] Guyon, I., & Elisseeff, A. (2003). An Introduction to Variable and Feature Selection.Journal of Machine Learning Research,3, 1157-1182. · Zbl 1102.68556
[81] Guyon, I., Saffari, A., Dror, G., & Cawley, G. (2008). Analysis of the IJCNN 2007 Agnostic Learning vs. Prior Knowledge Challenge.Neural Networks,21(2-3), 544-550. · Zbl 1254.68207
[82] Guyon, I., Sun-Hosoya, L., Boullé, M., Escalante, H. J., Escalera, S., Liu, Z., Jajetic, D., Ray, B., Saeed, M., Sebag, M., Statnikov, A., Tu, W.-W., & Viegas, E. (2018). Analysis of the AutoML Challenge series 2015-2018. In Automatic Machine Learning: Methods, Systems, Challenges. Springer Verlag.
[83] Guyon, I., Weston, J., & Barnhill, S. (2002). Gene Selection for Cancer Classification using Support Vector Machines.Machine Learning,46, 389-422. · Zbl 0998.68111
[84] H2O.ai (2018). H2O Driverless AI. Available at https://www.h2o.ai/products/h2o-driverless-ai/.
[85] H2O.ai (2019). H2O AutoML. Available at http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html.
[86] He, X., Zhao, K., & Chu, X. (2019). AutoML: A Survey of the State-of-the-Art.arXiv preprint arXiv:1908.00709.
[87] Hellerstein, J. M. (2008). Quantitative Data Cleaning for Large Databases.United Nations Economic Commission for Europe.
[88] Hennig, P., & Schuler, C. J. (2012). Entropy Search for Information-Efficient Global Optimization.Journal of Machine Learning Research,13, 1809-1837. · Zbl 1432.65073
[89] Hesterman, J. Y., Caucci, L., Kupinski, M. A., Barrett, H. H., & Furenlid, L. R. (2010). Maximum-Likelihood Estimation With a Contracting-Grid Search Algorithm.IEEE Transactions on Nuclear Science,57(3), 1077-1084.
[90] Hoffman, M. W., Shahriari, B., & de Freitas, N. (2014). On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning. InArtificial Intelligence and Statistics, pp. 365-374.
[91] Hsu, C.-W., Chang, C.-C., & Lin, C.-J. (2003). A Practical Guide to Support Vector Classification.
[92] Huberman, B. A., Lukose, R. M., & Hogg, T. (1997). An Economics Approach to Hard Computational Problems.Science,275(5296), 51-54.
[93] Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2011). Sequential Model-Based Optimization for General Algorithm Configuration. InInternational Conference on Learning and Intelligent Optimization, pp. 507-523.
[94] Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2012). Parallel algorithm configuration. In International Conference on Learning and Intelligent Optimization, Vol. 7219. · Zbl 1192.68831
[95] Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2014). An Efficient Approach for Assessing Hyperparameter Importance. InInternational Conference on Machine Learning, pp. 754-762.
[96] Hutter, F., Hoos, H. H., Leyton-Brown, K., & Stützle, T. (2009). ParamILS: An Automatic Algorithm Configuration Framework. Journal of Artificial Intelligence Research,36, 267-306. · Zbl 1192.68831
[97] Hutter, F., Kotthoff, L., & Vanschoren, J. (2018a).Automated Machine Learning: Methods, Systems, Challenges. Springer.
[98] Hutter, F., Kotthoff, L., & Vanschoren, J. (2018b). Hyperparameter Optimization. In Automatic Machine Learning: Methods, Systems, Challenges, pp. 3-38. Springer.
[99] Jamieson, K., & Talwalkar, A. (2015). Non-stochastic Best Arm Identification and Hyperparameter Optimization. InArtificial Intelligence and Statistics, pp. 240-248.
[100] Jeffery, S. R., Alonso, G., Franklin, M. J., Hong, W., & Widom, J. (2006). Declarative Support for Sensor Data Cleaning. InInternational Conference on Pervasive Computing, pp. 83-100.
[101] Kandasamy, K., Krishnamurthy, A., Schneider, J., & Póczos, B. (2018). Parallelised Bayesian Optimisation via Thompson Sampling. In International Conference on Artificial Intelligence and Statistics, pp. 133-142.
[102] Kanter, J. M., & Veeramachaneni, K. (2015). Deep Feature Synthesis: Towards Automating Data Science Endeavors. InIEEE International Conference on Data Science and Advanced Analytics, pp. 1-10.
[103] Katz, G., Shin, E. C. R., & Song, D. (2017). ExploreKit: Automatic feature generation and selection. InIEEE International Conference on Data Mining, pp. 979-984.
[104] Kaul, A., Maheshwary, S., & Pudi, V. (2017). AutoLearn - Automated Feature Generation and Selection. InIEEE International Conference on Data Mining.
[105] Kégl, B. (2017). How to Build a Data Science Pipeline. Available at https://www.kdnuggets.com/2017/07/build-data-science-pipeline.html.
[106] Kennedy, J., & Eberhart, R. (1995). Particle Swarm Optimization. InInternational Conference on Neural Networks, pp. 1942-1948.
[107] Khayyat, Z., Ilyas, I. F., Jindal, A., Madden, S., Ouzzani, M., Papotti, P., Quiané-Ruiz, J. A., Tang, N., & Yin, S. (2015). BigDansing: A System for Big Data Cleansing. In ACM International Conference on Management of Data, pp. 1215-1230.
[108] Khurana, U., Samulowitz, H., & Turaga, D. (2018a). Ensembles with Automated Feature Engineering. InInternational Conference on Machine Learning AutoML Workshop.
[109] Khurana, U., Samulowitz, H., & Turaga, D. (2018b). Feature Engineering for Predictive Modeling Using Reinforcement Learning. InAAAI Conference on Artificial Intelligence, pp. 3407-3414.
[110] Khurana, U., Turaga, D., Samulowitz, H., & Parthasrathy, S. (2016). Cognito: Automated Feature Engineering for Supervised Learning. InIEEE International Conference on Data Mining, pp. 1304-1307.
[111] Klein, A., Falkner, S., Bartels, S., Hennig, P., & Hutter, F. (2016). Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets. InArtificial Intelligence and Statistics, pp. 528-536. · Zbl 1421.62027
[112] Klein, A., Falkner, S., Mansur, N., & Hutter, F. (2017a). RoBO: A Flexible and Robust Bayesian Optimization Framework in Python. InNIPS Bayesian Optimization Workshop.
[113] Klein, A., Falkner, S., Springenberg, J. T., & Hutter, F. (2017b). Learning Curve Prediction With Bayesian Neural Networks.International Conference on Learning Representations, pp. 1-16.
[114] Koch, P., Golovidov, O., Gardner, S., Wujek, B., Griffin, J., & Xu, Y. (2018). Autotune: A Derivative-free Optimization Framework for Hyperparameter Tuning.InACM International Conference on Knowledge Discovery and Data Mining, pp. 443-452.
[115] Kocsis, L., & Szepesvári, C. (2006). Bandit based Monte-Carlo Planning. In European Conference on Machine Learning, pp. 282-293.
[116] Kohavi, R., & John, G. H. (1995). Automatic Parameter Selection by Minimizing Estimated Error. InInternational Conference on Machine Learning, pp. 304-312.
[117] Komer, B., Bergstra, J., & Eliasmith, C. (2014). Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn. InInternational Conference on Machine Learning AutoML Workshop, pp. 2825-2830.
[118] Kononenko, I. (1994). Estimating attributes: Analysis and extensions of RELIEF.European Conference on Machine Learning.
[119] Kotthoff, L., Thornton, C., Hoos, H. H., Hutter, F., & Leyton-Brown, K. (2016). AutoWEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA. Journal of Machine Learning Research,17, 1-5.
[120] Koza, J. R. (1992).Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press. · Zbl 0850.68161
[121] Krishnan, S., Wang, J., Franklin, M. J., Goldberg, K., Kraska, T., Milo, T., & Wu, E. (2015). SampleClean: Fast and Reliable Analytics on Dirty Data.IEEE Data Engineering Bulletin,38(3), 59-75.
[122] Krishnan, S., Wang, J., Wu, E., Franklin, M. J., & Goldberg, K. (2016). ActiveClean: Interactive Data Cleaning For Statistical Modeling.InProceedings of the VLDB Endowment, Vol. 12, pp. 948-959.
[123] Krishnan, S., & Wu, E. (2019). AlphaClean: Automatic Generation of Data Cleaning Pipelines. arXiv preprint arXiv:1904.11827.
[124] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. InInternational Conference on Neural Information Processing Systems, Vol. 1, pp. 1097-1105.
[125] Lacoste, A., Larochelle, H., Marchand, M., & Laviolette, F. (2014). Sequential Model-Based Ensemble Optimization. InUncertainty In Artificial Intelligence, pp. 440-448.
[126] Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2017). Building machines that learn and think like people.Behavioral and Brain Sciences,40, 1-58.
[127] Lam, H. T., Thiebaut, J.-M., Sinn, M., Chen, B., Mai, T., & Alkan, O. (2017). One button machine for automating feature engineering in relational databases.arXiv preprint arXiv:1706.00327.
[128] Langevin, S., Jonker, D., Bethune, C., Coppersmith, G., Hilland, C., Morgan, J., Azunre, P., & Gawrilow, J. (2018). Distil: A Mixed-Initiative Model Discovery System for Subject Matter Experts. InInternational Conference on Machine Learning AutoML Workshop.
[129] LaValle, S. M., Branicky, M. S., & Lindemann, S. R. (2004). On the Relationship Between Classical Grid Search and Probabilistic Roadmaps.The International Journal of Robotics Research,23, 673-692.
[130] Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals.Soviet physics doklady,10(8), 707-710.
[131] Levesque, J. C., Durand, A., Gagne, C., & Sabourin, R. (2017). Bayesian Optimization for Conditional Hyperparameter Spaces. InInternational Joint Conference on Neural Networks, pp. 286-293.
[132] Li, L., Jamieson, K., Rostamizadeh, A., Gonina, E., Hardt, M., Recht, B., & Talwalkar, A. (2020). A System for Massively Parallel Hyperparameter Tuning. InMachine Learning and Systems.
[133] Li, L., Jamieson, K. G., DeSalvo, G., Rostamizadeh, A., & Talwalkar, A. (2016). Efficient Hyperparameter Optimization and Infinitely Many Armed Bandits.arXiv preprint arXiv:1603.06560. · Zbl 06982941
[134] Li, L., Jamieson, K. G., DeSalvo, G., Rostamizadeh, A., & Talwalkar, A. (2018). Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization.Journal of Machine Learning Research,18, 1-52. · Zbl 06982941
[135] Lindauer, M., & Hutter, F. (2018). Warmstarting of Model-based Algorithm Configuration. InAAAI Conference on Artificial Intelligence, pp. 1355-1362.
[136] Luo, G. (2016). A Review of Automatic Selection Methods for Machine Learning Algorithms and Hyper-parameter Values. Network Modeling Analysis in Health Informatics and Bioinformatics,5(1), 1-15.
[137] Maclaurin, D., Duvenaud, D., & Adams, R. P. (2015). Gradient-based Hyperparameter Optimization through Reversible Learning. InInternational Conference on Machine Learning, pp. 2113-2122.
[138] Margaritis, D. (2009). Toward Provably Correct Feature Selection in Arbitrary Domains. InNeural Information Processing Systems, pp. 1240-1248.
[139] Markovitch, S., & Rosenstein, D. (2002). Feature generation using general constructor functions.Machine Learning,49(1), 59-98. · Zbl 1014.68068
[140] Maron, O., & Moore, A. (1993). Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation.Advances in Neural Information Processing Systems, pp. 59-66.
[141] McGushion, H. (2019). HyperparameterHunter. Available at https://github.com/HunterMcGushion/hyperparameter_hunter.
[142] Meinshausen, N., & Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society,72(4), 417-473.
[143] Mejía-Lavalle, M., Sucar, E., & Arroyo, G. (2006). Feature Selection With A Perceptron Neural Net. In International Workshop on Feature Selection for Data Mining, pp. 131-135.
[144] Messaoud, I. B., El Abed, H., Märgner, V., & Amiri, H. (2011). A design of a preprocessing framework for large database of historical documents. In Workshop on Historical Document Imaging and Processing, pp. 177-183.
[145] Mohr, F., Wever, M., & Hüllermeier, E. (2018). ML-Plan: Automated machine learning via hierarchical planning. Machine Learning,107, 1495-1515. · Zbl 06990191
[146] Momma, M., & Bennett, K. P. (2002). A Pattern Search Method for Model Selection of Support Vector Regression. InSIAM International Conference on Data Mining, pp. 261-274.
[147] Motoda, H., & Liu, H. (2002). Feature Selection, Extraction and Construction.Communication of Institute of Information and Computing Machinery,5, 67-72.
[148] Munos, R. (2014). From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning. Tech. rep., hal-00747575. · Zbl 1296.91086
[149] Nargesian, F., Samulowitz, H., Khurana, U., Khalil, E. B., & Turaga, D. (2017). Learning Feature Engineering for Classification. InInternational Joint Conference on Artificial Intelligence, pp. 2529-2535.
[150] Nguyen, T.-D., Maszczyk, T., Musial, K., Zöller, M.-A., & Gabrys, B. (2020). AVATAR - Machine Learning Pipeline Evaluation Using Surrogate Model. In International Symposium on Intelligent Data Analysis, pp. 352-365.
[151] Nickson, T., Osborne, M. A., Reece, S., & Roberts, S. (2014). Automated Machine Learning on Big Data using Stochastic Algorithm Tuning.arXiv preprint arXiv: 1407.7969.
[152] Olson, R. S., Bartley, N., Urbanowicz, R. J., & Moore, J. H. (2016).Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science. InGenetic and Evolutionary Computation Conference, pp. 485-492.
[153] Olson, R. S., & Moore, J. H. (2016). TPOT : A Tree-based Pipeline Optimization Tool for Automating Machine Learning. InInternational Conference on Machine Learning AutoML Workshop, pp. 66-74.
[154] Olson, R. S., Urbanowicz, R. J., Andrews, P. C., Lavender, N. A., Kidd, L. C., & Moore, J. H. (2016). Automating biomedical data science through tree-based pipeline optimization. InApplications of Evolutionary Computation, pp. 123-137. Springer International Publishing.
[155] Opitz, D., & Maclin, R. (1999). Popular Ensemble Methods: An Empirical Study.Journal of Artificial Intelligence Research,11, 169-198. · Zbl 0924.68159
[156] Parry, P. (2019). auto_ml. Available at https://github.com/ClimbsRocks/auto_ml.
[157] Parzen, E. (1961). On Estimation of a Probability Density Function and Mode.The Annals of Mathematical Statistics,33(3), 1065-1076. · Zbl 0116.11302
[158] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine Learning in Python.Journal of Machine Learning Research,12, 2825-2830. · Zbl 1280.68189
[159] Pedregosa, F. (2016). Hyperparameter optimization with approximate gradient. InInternational Conference on Machine Learning, pp. 737-746.
[160] Perrone, V., Shen, H., Seeger, M., Archambeau, C., & Jenatton, R. (2019). Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning. In Advances in Neural Information Processing Systems 32, pp. 12771-12781. Curran Associates, Inc.
[161] Petri, C. A. (1962). Kommunikation mit Automaten. Ph.D. thesis, Universität Hamburg.
[162] Poli, R., Langdon, W. B., McPhee, N. F., & Koza, J. R. (2008). A Field Guide to Genetic Programming. Lulu.com.
[163] Polikar, R. (2006). Ensemble Based Systems in Decision Making.IEEE Circuits and Systems Magazine,6(3), 21-45.
[164] Post, M. J., van der Putten, P., & van Rijn, J. N. (2016). Does Feature Selection Improve Classification? A Large Scale Experiment in OpenML. InAdvances in Intelligent Data Analysis XV, pp. 158-170.
[165] Press, G. (2016). Data Scientists Spend Most of Their Time Cleaning Data. Available at https://whatsthebigdata.com/2016/05/01/data-scientists-spend-most-of-their-time-cleaning-data/.
[166] Probst, P., Boulesteix, A.-L., & Bischl, B. (2019). Tunability: Importance of Hyperparameters of Machine Learning Algorithms.Journal of Machine Learning Research,20(53), 1-32. · Zbl 07049772
[167] Pudil, P., Novovičová, J., & Kittler, J. (1994). Floating search methods in feature selection. Pattern Recognition Letters,15(11), 1119-1125.
[168] Pyle, D. (1999).Data Preparation for Data Mining. Morgan Kaufmann Publishers, Inc.
[169] Quanming, Y., Mengshuo, W., Hugo, J. E., Isabelle, G., Yi-Qi, H., Yu-Feng, L., Wei-Wei, T., Qiang, Y., & Yang, Y. (2018). Taking Human out of Learning Applications: A Survey on Automated Machine Learning.arXiv preprint arXiv:1810.13306.
[170] Rahm, E., & Do, H. H. (2000). Data cleaning: Problems and Current Approaches. InIEEE Data Engineering Bulletin.
[171] Rakotoarison, H., Schoenauer, M., & Sebag, M. (2019). Automated Machine Learning with Monte-Carlo Tree Search. InInternational Joint Conference on Artificial Intelligence, pp. 3296-3303.
[172] Rakotomamonjy, A. (2003). Variable selection using SVM-based criteria.Journal of Machine Learning Research,3, 1357-1370. · Zbl 1102.68583
[173] Raman, V., & Hellerstein, J. M. (2001). Potter’s Wheel: An Interactive Data Cleaning System. InInternational Conference on Very Large Data Bases, Vol. 1, pp. 381-390.
[174] RapidMiner (2018). Introducing RapidMiner Auto Model. Available at https://rapidminer.com/resource/automated-machine-learning/.
[175] Rasmussen, C. E., & Williams, C. K. I. (2006).Gaussian Processes for Machine Learning. The MIT Press. · Zbl 1177.68165
[176] Ratcliff, J. W., & Metzener, D. E. (1988). Pattern Matching: The Gestalt Approach.Dr Dobbs Journal,13(7), 46-72.
[177] Reif, M., Shafait, F., & Dengel, A. (2012). Meta-learning for evolutionary parameter optimization of classifiers. Machine Learning,87, 357-380.
[178] Rekatsinas, T., Chu, X., Ilyas, I. F., & Ré, C. (2017). HoloClean: Holistic Data Repairs with Probabilistic Inference. In VLDB Endowment, pp. 1190-1201.
[179] Reynolds, C. W. (1987). Flocks, Herds, and Schools: A Distributed Behavioral Model. Computer Graphics,21(4), 25-34.
[180] Robbins, H. (1952). Some Aspects of the Sequential Design of Experiments.Bulletin of the American Mathematical Society,58(5), 527-535. · Zbl 0049.37009
[181] Rokach, L. (2010). Ensemble-based classifiers.Artificial Intelligence Review,33(1-2), 1-39.
[182] Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis.Journal of Computational and Applied Mathematics,20, 53-65. · Zbl 0636.62059
[183] Saeys, Y., Inza, I., & Larrañaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics,23(19), 2507-2517.
[184] Salvador, M. M., Budka, M., & Gabrys, B. (2016). Towards automatic composition of multicomponent predictive systems. InInternational Conference on Hybrid Artificial Intelligence Systems, pp. 27-39.
[185] Salvador, M. M., Budka, M., & Gabrys, B. (2017). Modelling multi-component predictive systems as petri nets. InIndustrial Simulation Conference, pp. 17-23.
[186] Samanta, B. (2004). Gear fault detection using artificial neural networks and support vector machines with genetic algorithms.Mechanical Systems and Signal Processing,18(3), 625-644.
[187] Schoenfeld, B., Giraud-Carrier, C., Poggemann, M., Christensen, J., & Seppi, K. (2018). Preprocessor Selection for Machine Learning Pipelines. InInternational Conference on Machine Learning AutoML Workshop.
[188] Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., & de Freitas, N. (2016). Taking the Human Out of the Loop: A Review of Bayesian Optimization.Proceedings of the IEEE,104(1), 148 - 175.
[189] Shearer, C. (2000). The CRISP-DM model: the new blueprint for data mining.Journal of Data Warehousing,5(4), 13-22.
[190] Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T., Lillicrap, T., Simonyan, K., & Hassabis, D. (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm.arXiv preprint arXiv:1712.01815. · Zbl 1433.68320
[191] Smith, M. G., & Bull, L. (2005). Genetic Programming with a Genetic Algorithm for Feature Construction and Selection.Genetic Programming and Evolvable Machines, 6(3), 265-281.
[192] Smith, M. J., Wedge, R., & Veeramachaneni, K. (2017). FeatureHub: Towards collaborative data science. InIEEE International Conference on Data Science and Advanced Analytics, pp. 590-600.
[193] Snoek, J., Larochelle, H., & Adams, R. P. (2012).Practical Bayesian Optimization of Machine Learning Algorithms. InAdvances in Neural Information Processing Systems, pp. 2951-2959.
[194] Snyman, J. A. (2005).Practical Mathematical Optimization: An introduction to basic optimization theory and classical and new gradient-based algorithms. Springer. · Zbl 1104.90003
[195] Sohn, S. Y. (1999). Meta analysis of classification algorithms for pattern recognition.IEEE Transactions on Pattern Analysis and Machine Intelligence,21(11), 1137-1144.
[196] Solis, F. J., & Wets, R. J.-B. (1981). Minimization By Random Search Techniques.Mathematics of Operations Research,6(1), 19-30. · Zbl 0502.90070
[197] Sondhi, P. (2009). Feature Construction Methods: A Survey.Sifaka. Cs. Uiuc. Edu,69, 70-71.
[198] Sparks, E. R., Talwalkar, A., Haas, D., Franklin, M. J., Jordan, M. I., & Kraska, T. (2015). Automating model search for large scale machine learning. InACM Symposium on Cloud Computing, pp. 368-380.
[199] Swearingen, T., Drevo, W., Cyphers, B., Cuesta-Infante, A., Ross, A., & Veeramachaneni, K. (2017). ATM: A distributed, collaborative, scalable system for automated machine learning. InIEEE International Conference on Big Data, pp. 151-162.
[200] Swersky, K., Snoek, J., & Adams, R. P. (2014). Freeze-Thaw Bayesian Optimization.arXiv preprint arXiv:1406.3896.
[201] Thornton, C., Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2013). Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms. InACM International Conference on Knowledge Discovery and Data Mining, pp. 847-855.
[202] Tran, B., Xue, B., & Zhang, M. (2016). Genetic programming for feature construction and selection in classification on high-dimensional data.Memetic Computing,8, 3-15.
[203] Tuggener, L., Amirian, M., Rombach, K., Lörwald, S., Varlet, A., Westermann, C., & Stadelmann, T. (2019). Automated Machine Learning in Practice: State of the Art and Recent Results. In Swiss Conference on Data Science, pp. 31-36.
[204] Tuv, E., Borisov, A., Runger, G., & Torkkola, K. (2009). Feature Selection with Ensembles, Artificial Variables, and Redundancy Elimination.Journal of Machine Learning Research,10, 1341-1366. · Zbl 1235.62003
[205] USU Software AG (2018). Katana. Available at https://katana.usu.de/.
[206] Vafaie, H., & De Jong, K. (1992). Genetic Algorithms as a Tool for Feature Selection in Machine Learning. InInternational Conference on Tools with Artificial Intelligence, pp. 200-203.
[207] van Rijn, J. N., Abdulrahman, S. M., Brazdil, P., & Vanschoren, J. (2015). Fast Algorithm Selection Using Learning Curves. InInternational Symposium on Intelligent Data Analysis.
[208] van Rijn, J. N., & Hutter, F. (2018). Hyperparameter Importance Across Datasets. In International Conference on Knowledge Discovery and Data Mining, pp. 2367-2376.
[209] Vanschoren, J. (2019). Meta-Learning. InAutomatic Machine Learning: Methods, Systems, Challenges, pp. 35-61. Springer.
[210] Vanschoren, J., van Rijn, J. N., Bischl, B., & Torgo, L. (2014). OpenML: networked science in machine learning.ACM International Conference on Knowledge Discovery and Data Mining,15(2), 49-60.
[211] Weisz, G., Gyorgy, A., & Szepesvari, C. (2018). LeapsAndBounds: A Method for Approximately Optimal Algorithm Configuration. InInternational Conference on Machine Learning AutoML Workshop, pp. 5257-5265.
[212] Wever, M., Mohr, F., & Hüllermeier, E. (2018). ML-Plan for Unlimited-Length Machine Learning Pipelines. In International Conference on Machine Learning AutoML Workshop.
[213] Wistuba, M., Schilling, N., & Schmidt-Thieme, L. (2015a). Hyperparameter Search Space Pruning - A New Component for Sequential Model-Based Hyperparameter Optimization. InJoint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 104-119.
[214] Wistuba, M., Schilling, N., & Schmidt-Thieme, L. (2015b). Learning Hyperparameter Optimization Initializations. InIEEE International Conference on Data Science and Advanced Analytics. · Zbl 1457.68242
[215] Wistuba, M., Schilling, N., & Schmidt-Thieme, L. (2017).Automatic Frankensteining: Creating Complex Ensembles Autonomously. InSIAM International Conference on Data Mining, pp. 741-749.
[216] Wolpert, D. H. (1992). Stacked Generalization.Neural Networks,5(2), 241-259.
[217] Yang, Y., & Pedersen, J. O. (1997). A Comparative Study on Feature Selection in Text Categorization.International Conference on Machine Learning,97, 412-420.
[218] Zhang, Y., Bahadori, M. T., Su, H., & Sun, J. (2016). FLASH: Fast Bayesian Optimization for Data Analytic Pipelines. InACM International Conference on Knowledge Discovery and Data Mining, pp. 2065-2074.
[219] Zhou, L. (2018). How to Build a Better Machine Learning Pipeline. Available at https://www.datanami.com/2018/09/05/how-to-build-a-better-machine-learning-pipeline/.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.