The lack of widespread adoption of AutoML tools in the broader ML community has been a recurring topic of discussion within the field. Is this due to a lack of trust in these systems? Do our benchmarks fail to reflect real-world use cases? Or is it simply too difficult to find and implement state-of-the-art methods?
A significant factor is undoubtedly the engineering gap between current AutoML tools and user demands, particularly when it comes to training expensive deep learning models. Hyperparameter optimization (HPO) is crucial for all ML practitioners, and several open-source HPO packages, such as our own SMAC and SyneTune, have proven to work well across many benchmarks. Despite this, the most widely adopted HPO packages are Ray Tune and Optuna. Their popularity stems not from superior algorithms or better benchmark results, but from ease of use. So, does this mean we must rely on third parties to re-implement effective solutions for broader usability in the ML community?
Figure 1: Hypersweeper as a shim: ask-tell communication with the HPO tool and tuning parallelization for the ML training pipeline.
We believe there is a middle ground to make state-of-the-art AutoML code ready for everyday use: Hypersweeper. We provide a unified interface combined with cluster parallelization for ask-tell solvers. A lightweight shim aligns the “ask” and “tell” functions of different HPO tools into a single interface for end users (see Figure 1). On the user side, training code needs almost no modification: it only has to be runnable with Hydra and return a performance metric. Once the training code is compatible with Hypersweeper, a single line in the configuration file is enough to make a new HPO method available (see Figure 2 for an example of how to tune with Hypersweeper).
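To make the shim concrete, the following is a minimal sketch of the ask-tell pattern Hypersweeper builds on. The names here (AskTellOptimizer, Trial, sweep) are illustrative, not Hypersweeper’s actual internal API, and the loop is shown sequentially although Hypersweeper dispatches the evaluations in parallel via Hydra.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


@dataclass
class Trial:
    """Illustrative container: a configuration plus optimizer bookkeeping."""
    config: dict
    info: Optional[object] = None


class AskTellOptimizer(Protocol):
    """Any HPO tool exposing ask/tell can be wrapped behind one interface."""
    def ask(self) -> Trial: ...
    def tell(self, trial: Trial, performance: float) -> None: ...


def sweep(optimizer: AskTellOptimizer,
          train: Callable[[dict], float],
          n_trials: int) -> Optional[Trial]:
    """Sequential stand-in for the sweep loop; in Hypersweeper the train()
    calls would be launched in parallel through Hydra's launchers."""
    best, best_perf = None, float("inf")
    for _ in range(n_trials):
        trial = optimizer.ask()          # next configuration to evaluate
        perf = train(trial.config)       # user's Hydra-runnable training code
        optimizer.tell(trial, perf)      # report the result to the optimizer
        if perf < best_perf:
            best, best_perf = trial, perf
    return best
```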
Figure 2: Full code example for SMAC3 with the Branin function: on the left, the branin function is configured via Hydra by adding the Hydra decorator and the “cfg” argument; its resulting function value is returned to the sweeper. On the right, the corresponding YAML configuration file defines a search space of two hyperparameters, “x0” and “x1”, and configures SMAC to run for 10 trials. The same SMAC configuration can easily be re-used with a different search space.
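As an illustration of the training side described in the caption, a minimal sketch could look as follows; it assumes a standard Hydra entry point and a config file named config.yaml next to the script, so file names and layout may differ from the actual example in the Hypersweeper repository.

```python
import numpy as np
import hydra
from omegaconf import DictConfig


@hydra.main(config_path=".", config_name="config", version_base=None)
def branin(cfg: DictConfig) -> float:
    """Standard Branin function; the returned value is what the sweeper optimizes."""
    x0, x1 = cfg.x0, cfg.x1
    value = (
        (x1 - 5.1 / (4 * np.pi**2) * x0**2 + 5 / np.pi * x0 - 6) ** 2
        + 10 * (1 - 1 / (8 * np.pi)) * np.cos(x0)
        + 10
    )
    return value


if __name__ == "__main__":
    branin()
```

The only changes relative to a plain Python function are the Hydra decorator, the “cfg” argument, and the returned metric.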
Both Hypersweeper’s interface and its parallelization are built on top of Hydra, so the HPO tools are configured in YAML files. These files are concise, easily shareable, and keep the configuration of the HPO method, the target algorithm, and the benchmark separate, so that HPO configurations can be re-used across experiments. Hydra’s built-in parallelization, which covers local execution as well as SLURM, Joblib, RQ, and Ray clusters, is well documented. This provides significant flexibility for users with minimal implementation overhead for AutoML researchers: they can take advantage of the ease of use, parallelization, and logging Hypersweeper provides without diverting focus from their research.
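As a rough illustration of this separation, a sweep configuration might look like the following YAML. The sweeper name “HyperSMAC”, the search-space keys, and the launcher settings are assumptions based on common Hydra plugin conventions (here the submitit SLURM launcher plugin), not verified Hypersweeper syntax.

```yaml
# Illustrative only: sweeper name and search-space keys are assumed,
# not taken from the Hypersweeper documentation.
defaults:
  - _self_
  - override hydra/sweeper: HyperSMAC        # which HPO method to use (assumed name)
  - override hydra/launcher: submitit_slurm  # run trials on a SLURM cluster

hydra:
  sweeper:
    n_trials: 10
    search_space:
      hyperparameters:
        x0:
          type: uniform_float
          lower: -5.0
          upper: 10.0
        x1:
          type: uniform_float
          lower: 0.0
          upper: 15.0
  launcher:
    partition: gpu        # SLURM partition to submit to
    timeout_min: 60       # per-trial walltime limit

# default hyperparameter values for a single (non-sweep) run
x0: 0.0
x1: 0.0
```

Switching to a different launcher or HPO method would only change the corresponding override line, leaving the training code and search space untouched.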
In summary, with Hypersweeper, cutting-edge AutoML research code can be heavily parallelized without significant changes to the training code. Integrating state-of-the-art HPO tools into any ML training loop is now as simple as using the provided configuration files and returning the relevant metric for tuning. We believe Hypersweeper will become a crucial connection point between AutoML research and the wider ML community, offering mutual benefits through a broader user base for HPO tools.
Hypersweeper is open-sourced on GitHub and also installable via PyPI.