mlpack 3.0 Released: A Fast, Flexible Machine Learning Library

April 4, 2018, 7:27 p.m. By: Kirti Bakshi

Mlpack 3.0

mlpack is a fast, intuitive, and flexible machine learning library written in C++, with bindings to other languages. It is meant to be a machine learning analog to LAPACK, and it aims to give machine learning researchers a wide array of machine learning methods and functions. In addition to its powerful C++ interface, mlpack provides command-line programs and Python bindings.

Moving on to mlpack 3.0.0:

Over the years, the project has grown into a community-led effort for fast machine learning implementations in C++. This release is the culmination of more than a decade of development and the support of more than 100 contributors from around the world, including one AI contributor. Among other things, this release includes:

  • A generic optimization infrastructure,

  • Python bindings,

  • Support for deep learning,

  • Improved implementations of machine learning algorithms.

Back in 2007, mlpack was a small project at a single lab at Georgia Tech, focused only on nearest neighbour search and related techniques.

Today, the library is developed and used all around the world, and in space as well! It is a regular part of Google Summer of Code and implements all manner of general and specialized machine learning techniques.

Interfaces to Python and Other Languages:

For the mlpack 3.0 release, the developers created a system that provides Python bindings with the same interface as the command-line bindings. They also plan to generate bindings for other languages, such as Scala, MATLAB, C#, and Java, among many others.

When it comes to building, mlpack uses CMake as its build system, which allows several flexible build configuration options. For further documentation, consult any of the numerous CMake tutorials.

New And Improved Functionalities in mlpack 3.0.0:

Since the last release of mlpack (mlpack 2.2.5), a lot has been added and changed, much of it through Google Summer of Code projects. Below is a short list of the new and improved functionality:

  • Optimization infrastructure

  • Deep learning infrastructure with support for FNNs, CNNs, and RNNs, along with many existing layer types and support for custom layers.

  • New optimizers such as SPALeRA, Katyusha, LineSearch, AdaGrad, ParallelSGD, FrankWolfe, SGDR, SMORMS3, and SVRG.

  • A fast random forest implementation, added to the set of classifiers that mlpack implements.

  • A hyperparameter tuning and cross-validation infrastructure.
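
To give a feel for what a generic optimization infrastructure can look like, here is a minimal, self-contained sketch in plain C++. It is not mlpack's actual API: the names QuadraticFunction, AdaGradSketch, Evaluate, Gradient, and Optimize are assumptions made purely for illustration. The idea shown is that the optimizer is templated on the objective, so any function type that exposes the two assumed methods can be minimized by the same code. The update rule is the standard AdaGrad rule, which scales each coordinate's step by the inverse square root of its accumulated squared gradients.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical objective (not part of mlpack): f(x) = sum_i (x_i - target_i)^2.
class QuadraticFunction
{
 public:
  explicit QuadraticFunction(std::vector<double> targetIn)
      : target(std::move(targetIn)) {}

  // Value of the objective at x.
  double Evaluate(const std::vector<double>& x) const
  {
    assert(x.size() == target.size());
    double value = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
      value += (x[i] - target[i]) * (x[i] - target[i]);
    return value;
  }

  // Gradient of the objective at x, written into g.
  void Gradient(const std::vector<double>& x, std::vector<double>& g) const
  {
    assert(x.size() == target.size() && g.size() == x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
      g[i] = 2.0 * (x[i] - target[i]);
  }

 private:
  std::vector<double> target;
};

// Illustrative AdaGrad optimizer (assumed interface, not mlpack's class).
class AdaGradSketch
{
 public:
  AdaGradSketch(double step, std::size_t iterations, double eps = 1e-8)
      : stepSize(step), maxIterations(iterations), epsilon(eps)
  {
    assert(stepSize > 0.0);
  }

  // Works with any FunctionType that provides Evaluate() and Gradient()
  // with the signatures used above. Updates x in place and returns the
  // final objective value.
  template<typename FunctionType>
  double Optimize(const FunctionType& f, std::vector<double>& x) const
  {
    std::vector<double> gradient(x.size(), 0.0);
    std::vector<double> squaredSum(x.size(), 0.0);

    for (std::size_t iter = 0; iter < maxIterations; ++iter)
    {
      f.Gradient(x, gradient);
      for (std::size_t i = 0; i < x.size(); ++i)
      {
        // Accumulate squared gradients per coordinate, then scale the step.
        squaredSum[i] += gradient[i] * gradient[i];
        x[i] -= stepSize * gradient[i] / (std::sqrt(squaredSum[i]) + epsilon);
      }
    }
    return f.Evaluate(x);
  }

 private:
  double stepSize;
  std::size_t maxIterations;
  double epsilon;
};
```

Calling Optimize() on a QuadraticFunction with target (1, -2), starting from (0, 0), moves x towards the target. Swapping in a different objective only requires another class with the same two methods; the optimizer itself does not change.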

Finally, How Can We Forget Its Ability To Be Modular By Design?

Because mlpack is designed in a modular way, you can drop in custom functionality for a specific task. For example, if you want nearest neighbour search to use a custom metric, or want to use a custom criterion for splitting decision trees, you simply write the code and it plugs in with no runtime overhead.

In addition, because mlpack is built on Armadillo, you can plug in any BLAS implementation that suits your needs. OpenBLAS is a good, fast choice with built-in parallelization. You could also use NVBLAS, which offloads heavy-duty matrix computations to the GPU if you have GPUs available.
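
One common way to get this kind of plug-in behaviour without runtime overhead in C++ is to pass the custom piece as a template parameter, so the compiler resolves the call at compile time instead of going through a virtual function. The sketch below illustrates that idea with a brute-force nearest neighbour search. It is self-contained and does not use mlpack's real classes: EuclideanMetric, ManhattanMetric, NearestNeighbor, and the static Evaluate() convention are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

// Hypothetical metric policy: any type with a static Evaluate(a, b) will do.
struct EuclideanMetric
{
  static double Evaluate(const Point& a, const Point& b)
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
      sum += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(sum);
  }
};

// A custom metric written by the user: sum of absolute differences.
struct ManhattanMetric
{
  static double Evaluate(const Point& a, const Point& b)
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
      sum += std::abs(a[i] - b[i]);
    return sum;
  }
};

// Brute-force search for the reference point closest to the query.
// The metric is a template parameter, so MetricType::Evaluate() is bound
// at compile time (no virtual dispatch). Returns the index of the nearest
// reference point.
template<typename MetricType = EuclideanMetric>
std::size_t NearestNeighbor(const std::vector<Point>& references,
                            const Point& query)
{
  assert(!references.empty());

  std::size_t best = 0;
  double bestDistance = std::numeric_limits<double>::max();
  for (std::size_t i = 0; i < references.size(); ++i)
  {
    const double distance = MetricType::Evaluate(references[i], query);
    if (distance < bestDistance)
    {
      bestDistance = distance;
      best = i;
    }
  }
  return best;
}
```

With reference points (3, 0) and (2, 2) and a query at the origin, NearestNeighbor<EuclideanMetric> returns index 1, while NearestNeighbor<ManhattanMetric> returns index 0. The search code is identical in both cases; only the template argument changes.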

For more information, go through the links mentioned below:

Official Link: mlpack

For More Information: GitHub