Nevergrad: A Python toolbox for performing gradient-free optimization
To tune parameters and/or hyperparameters in their models, most machine learning tasks rely on derivative-free optimization. To make this faster and easier, Facebook AI researchers created and are now open-sourcing a Python 3 library called Nevergrad.
What is Nevergrad?
Nevergrad is an open-source Python 3 library that offers an extensive collection of algorithms that do not require gradient computation, presented in a standard ask-and-tell Python framework.
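In the ask-and-tell pattern, the optimizer proposes candidate points and the caller evaluates them and reports the losses back. Below is a minimal sketch of that loop on a toy quadratic loss; the exact module paths and constructor arguments may differ between Nevergrad versions (the snippet assumes the `ng.optimizers` namespace and the `OnePlusOne` optimizer found in recent releases).

```python
import nevergrad as ng


def loss(x):
    """Toy quadratic loss, minimized at x = (0.5, 0.5)."""
    return float(sum((xi - 0.5) ** 2 for xi in x))


# 2-dimensional continuous problem, 100 function evaluations allowed.
optimizer = ng.optimizers.OnePlusOne(parametrization=2, budget=100)

for _ in range(optimizer.budget):
    candidate = optimizer.ask()       # the optimizer proposes a point
    value = loss(candidate.value)     # we evaluate it however we like
    optimizer.tell(candidate, value)  # ...and report the result back

print(optimizer.provide_recommendation().value)
```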
The library, which also includes testing and evaluation tools, is now available for immediate use as a toolbox for AI researchers and many others whose work involves derivative-free optimization.
The platform enables them to implement state-of-the-art algorithms and methods in different settings and compare their performance. It will also help ML scientists find the best optimizer for specific use cases.
What are the goals of Nevergrad?
The goal of this package is to provide:
- Gradient/derivative-free optimization algorithms, including algorithms that can handle noise.
- Tools to instrument any code, making optimization of parameters/hyperparameters painless (see the sketch after this list).
- Functions on which to test the optimization algorithms.
- Benchmark routines to compare algorithms easily.
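As a sketch of the instrumentation idea from the second bullet, the example below wraps a stand-in training function; the parameter names `lr`, `decay`, and `arch` are invented for illustration, and the `ng.p` parametrization helpers come from more recent Nevergrad releases, so names may differ in older versions.

```python
import nevergrad as ng


def train(lr: float, decay: float, arch: str) -> float:
    """Stand-in for a real training run; returns a loss to minimize."""
    return abs(lr - 0.01) + decay + (0.0 if arch == "conv" else 0.1)


# Describe each parameter's search space once; the optimizer handles the rest.
parametrization = ng.p.Instrumentation(
    lr=ng.p.Log(lower=1e-4, upper=1e-1),      # continuous, log-scaled
    decay=ng.p.Scalar(lower=0.0, upper=0.1),  # continuous, bounded
    arch=ng.p.Choice(["conv", "fc"]),         # discrete choice
)

optimizer = ng.optimizers.OnePlusOne(parametrization=parametrization, budget=50)
recommendation = optimizer.minimize(train)  # calls train(lr=..., decay=..., arch=...)
print(recommendation.kwargs)
```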
The library also includes a wide range of optimizers (accessible by name, as sketched after this list), such as:
- Differential evolution.
- Sequential quadratic programming.
- FastGA.
- Covariance matrix adaptation.
- Population control methods for noise management.
- Particle swarm optimization.
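These optimizers are registered under short names, so they can be looked up and instantiated generically. A small sketch, assuming registry keys such as "DE", "SQP", "CMA", "TBPSA", and "PSO" (common Nevergrad names, though the exact keys can vary by version):

```python
import nevergrad as ng

# Each family above is registered under one or more short names;
# the keys below are assumptions based on common Nevergrad releases.
for name in ["DE", "SQP", "CMA", "TBPSA", "PSO"]:
    print(name, ng.optimizers.registry[name])
```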
Nevergrad and ML problems
Previously, using these algorithms often involved custom-built implementations, which made it difficult to compare results from a wide range of state-of-the-art methods. With Nevergrad, AI developers can now easily test many different methods on a particular ML problem and compare the results, or use well-known benchmarks to evaluate how a new derivative-free optimization method compares with the current state of the art.
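For instance, several registered optimizers can be run on the same test function with the same evaluation budget and their results compared side by side. A minimal sketch, using a simple sphere function as a stand-in for a real ML objective and registry keys that are common in Nevergrad but may vary by version:

```python
import numpy as np
import nevergrad as ng


def sphere(x: np.ndarray) -> float:
    """Simple test function with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))


# Run several optimizers on the same 5-dimensional problem with the same
# budget, then compare the loss of each optimizer's recommendation.
for name in ["OnePlusOne", "TwoPointsDE", "PSO", "CMA"]:
    optimizer = ng.optimizers.registry[name](parametrization=5, budget=300)
    recommendation = optimizer.minimize(sphere)
    print(f"{name:>12}: loss = {sphere(recommendation.value):.4f}")
```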
The gradient-free optimization methods included in Nevergrad can be used for a wide variety of ML problems, such as:
- Multimodal problems, such as those that have several minima.
- Ill-conditioned problems, which typically arise when trying to optimize several variables with very different dynamics.
- Separable or rotated problems, including partially rotated problems.
- Partially separable problems, which can be addressed by considering several blocks of variables.
- Discrete, continuous, or mixed problems. These can include power systems or tasks with neural networks that require simultaneously choosing the learning rate per layer, the weight decay per layer, and the type of non-linearity per layer.
- Noisy problems, i.e., problems for which the function can return different results when called with the exact same parameters, such as independent episodes in reinforcement learning (see the sketch after this list).
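For the noisy case, a population-control optimizer can be asked to minimize a function whose return value fluctuates from call to call. A minimal sketch, assuming the "TBPSA" registry key (a population-control method available in recent Nevergrad versions):

```python
import numpy as np
import nevergrad as ng


def noisy_loss(x: np.ndarray) -> float:
    """Returns a different value on every call with the same x (simulated noise)."""
    return float(np.sum(x ** 2) + np.random.normal(scale=0.1))


# TBPSA adapts its population size to average out evaluation noise.
optimizer = ng.optimizers.registry["TBPSA"](parametrization=3, budget=500)
recommendation = optimizer.minimize(noisy_loss)
print(recommendation.value)  # should end up close to the origin despite the noise
```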
Summing up
This initial release of Nevergrad comprises basic artificial test functions, and the plan is to add more, including functions that represent physical models. Functionality will continue to be added to Nevergrad to help researchers create and benchmark new algorithms.
On the application side, the aim is to continue making Nevergrad easier to use, and to explore using it to optimize parameters in PyTorch reinforcement learning models where the gradient is not well defined. Nevergrad also has the potential to help with parameter sweeps for other tasks, such as A/B testing and job scheduling, making its uses more diverse.
For further information on Nevergrad, its structure, benchmarks, and more, refer to the link below:
Source and information: GitHub