by the AutoML Hannover Team
The year 2022 was an exciting year for us. So much happened: At the Leibniz University Hannover (LUH), we founded our new Institute of Artificial Intelligence, LUH|AI for short; Marius got tenure and was promoted to full professor; the group grew further with our new team members Alexander Tornede, Sarah Krebs, and Helena Graf; we kicked off the ERC Starting Grant project on human-centered AutoML and the new AI service center; and we co-organized the first AutoML Conference and the second edition of the AutoML Fall School.
Of course, we also worked hard to gain new insights into how to construct better AutoML approaches. The following gives a brief overview of our most important papers of 2022.
Core-AutoML
Auto-PyTorch Forecasting: Efficient Automated Deep Learning for Time Series Forecasting
Time series (TS) forecasting plays a crucial role in many business and industrial problems. However, non-expert users often struggle to choose a suitable deep neural network (DNN) architecture and its hyperparameters for a given task. With the introduction of our new Auto-PyTorch-TS branch, this is no longer an issue. Auto-PyTorch-TS modularizes a time series forecasting network so that all the components can be freely assembled to form new architectures. Additionally, Auto-PyTorch-TS inherits all the strengths of Auto-PyTorch, such as multi-fidelity optimization and ensemble construction, which makes it an efficient and robust automated deep learning framework for time series forecasting.
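For the curious, usage looks roughly like the sketch below. It is modeled on the example in the Auto-PyTorch repository; the exact signature of `search` (e.g., `n_prediction_steps` and the budget arguments) may differ between versions, and the data here is a toy placeholder.

```python
# Minimal sketch of Auto-PyTorch-TS, modeled on the repository example;
# argument names may differ between versions.
import numpy as np
from autoPyTorch.api.time_series_forecasting import TimeSeriesForecastingTask

# One toy univariate series; the API expects a list of series.
y_train = [np.sin(np.linspace(0, 8 * np.pi, 200))]
horizon = 12  # number of future steps to forecast

api = TimeSeriesForecastingTask()
api.search(
    y_train=y_train,
    n_prediction_steps=horizon,
    optimize_metric="mean_MASE_forecasting",
    total_walltime_limit=600,      # overall search budget in seconds
    func_eval_time_limit_secs=60,  # budget per pipeline evaluation
)

# Forecast the next `horizon` steps for each training series.
y_pred = api.predict(api.dataset.generate_test_seqs())
```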
MASIF: Meta-learned Algorithm Selection using Implicit Multi-Fidelity Optimization
For every new task, a data scientist must figure out which of the many available algorithms is best suited to their application. Since the same algorithms have been applied many times to different applications, we can leverage meta-knowledge about these algorithms and their learning behavior to do this much more efficiently. Building on this experience, we bridge the classical algorithm selection paradigm, which predominantly uses dataset meta-features, and multi-fidelity optimization. While the former is often insufficient to characterize the dataset, the latter is myopic. We combine the two to mitigate their weaknesses and unite their strengths. With our transformer-based model MASIF, we can interpret partial learning curves of varying lengths from all candidate algorithms as evidence of the dataset's topology and consider them jointly. We thereby support data scientists by guiding their decision process with meta-knowledge such that they can actively decide how much compute they want to invest in evaluating an algorithm. A toy sketch of this mechanism follows the links below.
[Workshop Paper | Follow-up Paper under review]
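To make the core idea concrete, here is a toy sketch (not the paper's implementation): a shared layer embeds each algorithm's zero-padded partial learning curve, a transformer lets the candidates attend to each other, and a head produces one score per algorithm. Model sizes and the padding scheme are illustrative choices.

```python
# Toy illustration of the MASIF idea (not the paper's implementation):
# encode each candidate's partial learning curve, let candidates attend
# to each other in a transformer, and score them jointly.
import torch
import torch.nn as nn

class ToyMASIF(nn.Module):
    def __init__(self, max_len=50, d_model=32):
        super().__init__()
        self.max_len = max_len
        self.embed = nn.Linear(max_len, d_model)   # curve -> embedding
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.score = nn.Linear(d_model, 1)         # one score per algorithm

    def forward(self, curves):
        # curves: list of 1D tensors of *varying* lengths, one per algorithm.
        padded = torch.zeros(len(curves), self.max_len)
        for i, c in enumerate(curves):             # zero-pad unseen fidelities
            padded[i, : len(c)] = c
        z = self.embed(padded).unsqueeze(0)        # (1, n_algos, d_model)
        z = self.encoder(z)                        # joint view of all candidates
        return self.score(z).squeeze(-1)           # (1, n_algos) ranking scores

model = ToyMASIF()
partial_curves = [torch.rand(5), torch.rand(20), torch.rand(50)]  # cheap to full
print(model(partial_curves))  # higher score = predicted better algorithm
```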
Dynamic Algorithm Configuration and Bayesian Optimization
Bayesian Optimization (BO) is a sample-efficient optimization algorithm, often used in HPO and AutoML. However, BO itself has many (meta-)hyperparameters, including the initial design, the surrogate model (and its hyperparameters), and the acquisition function. Offline tuning of these meta-hyperparameters is very expensive and often not feasible. Therefore, we started to look into how we can dynamically adjust the meta-hyperparameters of BO on the fly while BO is solving a given problem. As a first step, we considered the acquisition function. We showed that switching between different acquisition functions can be beneficial in terms of anytime performance, and that simple acquisition schedules can even be meta-learned. This opens up new research questions on designing efficient and robust Bayesian optimizers.
[Paper on Switching of Acquisition Functions | Paper on Predicting Schedules]
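As a toy illustration of such a schedule (hand-designed here, whereas the papers study when switching pays off and how to learn schedules), the following BO loop uses expected improvement (EI) in the first half of the run and then switches to the more exploitative probability of improvement (PI). The objective and schedule are illustrative.

```python
# Toy sketch of switching acquisition functions mid-run in a BO loop.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def f(x):  # toy 1D objective to minimize
    return np.sin(3 * x) + 0.5 * x

def ei(mu, sigma, best):  # expected improvement (minimization)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def pi(mu, sigma, best):  # probability of improvement
    return norm.cdf((best - mu) / sigma)

rng = np.random.default_rng(0)
X = list(rng.uniform(0, 5, 3))          # initial design
y = [f(x) for x in X]
grid = np.linspace(0, 5, 500)

n_iters = 20
for t in range(n_iters):
    gp = GaussianProcessRegressor(alpha=1e-6, normalize_y=True)
    gp.fit(np.array(X)[:, None], y)
    mu, sigma = gp.predict(grid[:, None], return_std=True)
    acq = ei if t < n_iters // 2 else pi  # the hand-designed schedule
    x_next = grid[np.argmax(acq(mu, np.maximum(sigma, 1e-9), min(y)))]
    X.append(x_next)
    y.append(f(x_next))

print(f"best found: f(x={X[int(np.argmin(y))]:.3f}) = {min(y):.3f}")
```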
Interactive and Explainable AutoML
Human-Expert Priors
Developers of ML models and applications often have some intuition about good hyperparameter settings, e.g., the default settings of learning rates. These are not always optimal, or even suitable, but in most cases they are reasonable. To exploit this prior knowledge of developers, we designed an easy-to-implement approach for Bayesian Optimization that is enhanced by user priors on the optimal hyperparameter settings. The nice property of our new approach is that it comes not only with strong and robust performance improvements but also with theoretical guarantees on the convergence of Bayesian Optimization.
[Paper | GitHub as part of SMAC3]
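The underlying trick can be sketched in a few lines: weight the acquisition value by the user prior, and let the prior's influence decay with the number of evaluations so that the data eventually overrides a misleading belief. This is a conceptual sketch, not the SMAC3 implementation; the prior density, `beta`, and the decay schedule are illustrative choices.

```python
# Conceptual sketch of prior-weighted acquisition (not the SMAC3 code).
import numpy as np
from scipy.stats import norm

def prior(x):
    # User belief: the optimum is probably around x = 0.3
    # (think: a known-good learning-rate region).
    return norm.pdf(x, loc=0.3, scale=0.2)

def weighted_acquisition(acq_values, x, n_evals, beta=10.0):
    # The prior's exponent beta / n shrinks as evaluations accumulate,
    # recovering vanilla BO in the limit.
    return acq_values * prior(x) ** (beta / max(n_evals, 1))

x = np.linspace(0, 1, 100)
acq = np.ones_like(x)  # stand-in for EI values from a surrogate
early = weighted_acquisition(acq, x, n_evals=2)    # prior dominates
late = weighted_acquisition(acq, x, n_evals=200)   # data dominates
print(x[np.argmax(early)], x[np.argmax(late)])
```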
Explaining HPO
Last year, we showed that applying simple approaches from interpretable ML (iML), such as partial dependence plots (PDPs), to HPO yields biased explanations: over-confident close to the densely sampled optimum region but far off in rarely sampled regions. Therefore, we wondered whether we could propose a new Bayesian Optimization algorithm that is sample-efficient and also samples in regions that help us return good explanations (e.g., PDPs). It turns out that Bayesian Algorithm Execution (BAX) can be combined with traditional BO such that we achieve both desiderata.
[Paper]
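For readers unfamiliar with PDPs in this context: the PDP of one hyperparameter is estimated from a surrogate by averaging its predictions over the other dimensions, as in the toy sketch below (hypothetical random-forest surrogate on synthetic data). The sampling bias arises because the surrogate is only trustworthy where the optimizer actually sampled.

```python
# Minimal sketch of a PDP for one hyperparameter, estimated from a surrogate:
# fix the dimension of interest and average predictions over the others.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))       # observed configurations
y = (X[:, 0] - 0.3) ** 2 + 0.1 * X[:, 1]   # toy cost surface
surrogate = RandomForestRegressor(random_state=0).fit(X, y)

grid = np.linspace(0, 1, 50)
pdp = []
for v in grid:                             # sweep hyperparameter 0
    X_mod = X.copy()
    X_mod[:, 0] = v                        # fix dim 0, marginalize dim 1
    pdp.append(surrogate.predict(X_mod).mean())

print(grid[int(np.argmin(pdp))])           # estimated best value of dim 0
```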
DeepCAVE
Depending on the complexity of the AutoML task and the required runtime to train a model, AutoML packages run for minutes or days. Similar to TensorBoard or WandB for deep learning, a package for monitoring AutoML would be very helpful for gaining a better understanding of and insights into the AutoML process. With DeepCAVE, we have a new package tailored towards monitoring and explaining AutoML processes (such as SMAC, Auto-Sklearn, or Auto-PyTorch). It provides an interactive dashboard including the current performance of the final model, Pareto fronts in the case of multi-objective optimization, the importance of hyperparameters and other design decisions, hyperparameter effects on performance via parallel coordinate plots and PDPs, and footprints of the AutoML sampler.
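Runs of the supported tools can be loaded directly; custom runs can be recorded with a small API, roughly as in the example from the DeepCAVE repository below. The hyperparameter, objective, budgets, and reported costs are illustrative placeholders.

```python
# Sketch of recording a custom run for DeepCAVE, following the repository
# example; runs of SMAC, Auto-Sklearn, or Auto-PyTorch load directly instead.
import random
import ConfigSpace as CS
from deepcave import Recorder, Objective

configspace = CS.ConfigurationSpace(seed=0)
configspace.add_hyperparameter(
    CS.UniformFloatHyperparameter("alpha", lower=0, upper=1)
)

accuracy = Objective("accuracy", lower=0, upper=1, optimize="upper")

with Recorder(configspace, objectives=[accuracy]) as r:
    for config in configspace.sample_configuration(100):
        for budget in [20, 40, 60]:          # multi-fidelity budgets
            r.start(config, budget)
            # ... train and evaluate your model here ...
            r.end(costs=[random.random()])   # report the objective value(s)
```

Launching `deepcave --open` afterwards serves the interactive dashboard in the browser.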
AutoML Packages
SMAC3
We have been developing our main HPO package SMAC3 since September 2015. Roughly seven years and 2000 commits later, we have our first paper dedicated to the software itself. In our JMLR MLOSS paper, we briefly describe the main components of SMAC3 and how they interact. Of course, the development of SMAC3 is not finished, and we recently released an alpha version of the next major release, which redesigns the external and internal APIs, improves usability, speeds up parallel HPO, and improves the quality of multi-objective HPO. Try it out! Feedback is very much appreciated.
[Paper | GitHub | Alpha Release]
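Against the redesigned API of the alpha release, a minimal run looks roughly as follows; since it is an alpha, names may still change before the final release, and the objective here is a toy placeholder.

```python
# Minimal sketch of an HPO run with the redesigned SMAC3 API (alpha);
# interfaces may still change before the final release.
from ConfigSpace import ConfigurationSpace
from ConfigSpace.hyperparameters import UniformFloatHyperparameter
from smac import HyperparameterOptimizationFacade, Scenario

def train(config, seed: int = 0) -> float:
    # Toy objective: SMAC minimizes the returned cost.
    return (config["x"] - 2) ** 2

cs = ConfigurationSpace()
cs.add_hyperparameter(UniformFloatHyperparameter("x", lower=-5, upper=5))

scenario = Scenario(cs, n_trials=100)            # evaluation budget
smac = HyperparameterOptimizationFacade(scenario, train)
incumbent = smac.optimize()                      # best configuration found
print(incumbent)
```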
Auto-Sklearn 2.0
Our oldest still-maintained package (dating back to May 2014) and our most popular one is Auto-Sklearn. Although it automates the design process of ML pipelines, it has settings of its own, such as validation and data-splitting strategies, that impact its performance. Which setting of Auto-Sklearn is most efficient depends on the dataset at hand and its properties. To free users from the burden of making these meta-design decisions, we proposed to use meta-learning in combination with portfolio construction. Our recent Auto-Sklearn 2.0 paper showed that this leads to better and more robust performance. So, we are again one step closer to our dream of automated machine learning.
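From the user's perspective, nothing changes except the estimator class; the meta-learned design decisions happen internally. A minimal sketch (dataset and time budget are illustrative):

```python
# Minimal sketch of Auto-Sklearn 2.0 on a toy classification task.
from autosklearn.experimental.askl2 import AutoSklearn2Classifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

automl = AutoSklearn2Classifier(time_left_for_this_task=300)  # seconds
automl.fit(X_train, y_train)          # meta-learned setup + pipeline search
print(automl.score(X_test, y_test))   # accuracy of the final ensemble
```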
See you with new exciting ideas again in 2023!