Using Evolutionary AutoML to Discover Neural Network Architectures
The brain has evolved over a long time, from very simple worm brains hundreds of millions of years ago to a diversity of modern structures today. The human brain, for example, can accomplish a wide variety of activities, many of them effortlessly.
To perform activities like these, artificial neural networks require careful design by experts over years of difficult research, and they typically address one specific task. One approach to generating these architectures is through the use of evolutionary algorithms.
But the questions that arise are: in addition to learning-based approaches, could we use our computational resources at unprecedented scale to programmatically evolve image classifiers? Can we achieve solutions with minimal expert participation? How good can today's artificially evolved neural networks be? We address these questions through two papers.
- In "Large-Scale Evolution of Image Classifiers," presented at ICML 2017, we set up an evolutionary process with simple building blocks and trivial initial conditions. Starting from very simple networks, the process found classifiers comparable to hand-designed models at the time. This was encouraging because many applications may allow for little user participation.
- In the more recent paper, "Regularized Evolution for Image Classifier Architecture Search" (2018), we participated in the process by providing sophisticated building blocks and good initial conditions. Moreover, we scaled up computation using Google's new TPUv2 chips. This combination of modern hardware, expert knowledge, and evolution worked together to produce state-of-the-art models on two popular benchmarks for image classification: CIFAR-10 and ImageNet.
The mutations in the first paper were purposefully simple: remove a convolution at random, add a skip connection between arbitrary layers, or change the learning rate. This way, the results show the potential of the evolutionary algorithm itself, as opposed to the quality of the search space.
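To make this concrete, here is a minimal runnable sketch of such mutations in Python. The architecture encoding (a list of layer dictionaries, a list of skip connections, and a learning rate) and the function `mutate` are hypothetical illustrations for this post, not the paper's actual implementation.

```python
import random

def mutate(arch):
    """Apply one randomly chosen simple mutation, returning a new architecture."""
    arch = {"layers": [dict(l) for l in arch["layers"]],
            "skips": list(arch["skips"]),
            "lr": arch["lr"]}
    choice = random.choice(["remove_conv", "add_skip", "change_lr"])
    if choice == "remove_conv" and len(arch["layers"]) > 1:
        # Remove a convolution at a random position.
        del arch["layers"][random.randrange(len(arch["layers"]))]
    elif choice == "add_skip" and len(arch["layers"]) > 1:
        # Add a skip connection between two arbitrary layers.
        src, dst = sorted(random.sample(range(len(arch["layers"])), 2))
        arch["skips"].append((src, dst))
    else:
        # Change the learning rate by a random factor.
        arch["lr"] *= 2.0 ** random.uniform(-1.0, 1.0)
    return arch

parent = {"layers": [{"type": "conv", "filters": 32},
                     {"type": "conv", "filters": 64}],
          "skips": [],
          "lr": 0.1}
print(mutate(parent))
```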
In that first paper, in addition to evolving the architecture, the population trains its networks while exploring the search space of initial conditions and learning-rate schedules. As a result, the process yields fully trained models with optimized hyperparameters. After the experiment starts, no expert input is needed.
In all the above, even though we were minimizing the researcher's participation by having simple initial architectures and intuitive mutations, a good amount of expert knowledge went into the building blocks those architectures were made of. These included important inventions such as convolutions, batch-normalization layers, and ReLUs.
We were evolving an architecture made up of these components. The use of the term "architecture" is not accidental: this is analogous to constructing a house with high-quality bricks.
Combining Evolution with Hand Design
After the first paper, the aim was to reduce the search space to something more manageable by giving the algorithm fewer choices to explore. In terms of the architectural analogy, this meant removing from the search space all the possible ways of making large-scale errors, such as putting the roof below the walls. Similarly, with neural network architecture searches, the algorithm can be helped by fixing the large-scale structure of the network. So how do we do this? The inception-like modules introduced for the purpose of architecture search proved very powerful. Their idea is to have a deep stack of repeated modules called cells. The stack is fixed, but the architecture of the individual modules can change.
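As an illustration, here is a toy sketch of that fixed outer skeleton, assuming a simplified cell encoding. The names (`build_stack`, `cell_spec`, `NUM_STACKED_CELLS`) and the encoding of a cell as (input index, operation) pairs are assumptions made for this example.

```python
# Toy sketch of a fixed outer stack of repeated cells. Only the cell's
# internal wiring is searchable; the stack itself never changes.

NUM_STACKED_CELLS = 6  # fixed by hand, not part of the search space

def build_stack(cell_spec):
    """Repeat one (evolved) cell spec throughout the fixed skeleton."""
    return [{"cell": cell_spec, "position": i} for i in range(NUM_STACKED_CELLS)]

# One toy cell: pairs of (input index, operation). Evolution edits only this.
cell_spec = [(0, "sep_conv_3x3"), (1, "max_pool_3x3"), (0, "identity")]
network = build_stack(cell_spec)
print(f"{len(network)} identical cells built from spec {cell_spec}")
```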
The second paper, "Regularized Evolution for Image Classifier Architecture Search" (2018), presents the results of applying evolutionary algorithms to this search space. The mutations modify the cell by randomly reconnecting its inputs or randomly replacing its operations.
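Here is a hedged sketch of those two cell-level mutations, reusing the toy (input index, operation) encoding from above; the operation vocabulary and the fifty-fifty choice between mutation types are illustrative assumptions.

```python
import random

OPS = ["sep_conv_3x3", "sep_conv_5x5", "max_pool_3x3", "avg_pool_3x3", "identity"]

def mutate_cell(cell_spec):
    """Reconnect one input or replace one operation in the cell, at random."""
    cell_spec = list(cell_spec)
    i = random.randrange(len(cell_spec))
    input_idx, op = cell_spec[i]
    if random.random() < 0.5:
        # Input mutation: rewire this pair to a (possibly different) earlier node.
        cell_spec[i] = (random.randrange(i + 1), op)
    else:
        # Operation mutation: swap in a different op from the fixed vocabulary.
        cell_spec[i] = (input_idx, random.choice([o for o in OPS if o != op]))
    return cell_spec

print(mutate_cell([(0, "sep_conv_3x3"), (1, "max_pool_3x3"), (0, "identity")]))
```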
These mutations are still relatively simple, but the initial conditions are not: the population is now initialized with models that must conform to the outer stack of cells. Even though the cells in these seed models are random, we no longer start from trivially simple models, which in the end makes it easier to get to high-quality ones. If the evolutionary algorithm is contributing meaningfully, the final networks should be significantly better than the networks already known to be constructible within this search space. The paper shows that evolution can indeed find state-of-the-art models that either match or outperform hand designs.
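Seeding the population might look like the following sketch; `random_cell` and its sizes are hypothetical, but the point is that every seed has a random cell while still conforming to the fixed outer stack.

```python
import random

OPS = ("sep_conv_3x3", "sep_conv_5x5", "max_pool_3x3", "identity")

def random_cell(num_pairs=3):
    # Each pair picks an earlier node as input and an operation from a fixed
    # set, so every seed is random yet well-formed for the outer stack.
    return [(random.randrange(i + 1), random.choice(OPS)) for i in range(num_pairs)]

seed_cells = [random_cell() for _ in range(20)]  # random, but conforming, seeds
print(seed_cells[0])
```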
One important feature of the evolutionary algorithm used in the second paper is a form of regularization: instead of letting the worst neural networks simply die, we remove the oldest ones, regardless of how good they are. This improves robustness to changes in the task being optimized and tends to produce more accurate networks in the end. In other words, this form of regularization selects for networks that remain good when they are re-trained.
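The loop below is a minimal, runnable sketch of this aging (regularized) evolution, with a random placeholder standing in for training and evaluation; the population size, sample size, and toy integer "architectures" are assumptions for the demo, not the paper's settings.

```python
import collections
import random

def train_and_eval(arch):
    """Placeholder fitness: the real experiments train actual networks."""
    return random.random()

def aging_evolution(population_size=20, sample_size=5, cycles=200):
    population = collections.deque()  # oldest model sits on the left
    history = []
    # Seed the population with random toy "architectures" (integers here).
    while len(population) < population_size:
        model = {"arch": random.randint(0, 1_000_000)}
        model["accuracy"] = train_and_eval(model["arch"])
        population.append(model)
        history.append(model)
    for _ in range(cycles):
        # Tournament selection: best of a random sample becomes the parent.
        sample = random.sample(list(population), sample_size)
        parent = max(sample, key=lambda m: m["accuracy"])
        # Mutate the parent into a child, then train and evaluate it.
        child = {"arch": parent["arch"] ^ random.randint(1, 255)}
        child["accuracy"] = train_and_eval(child["arch"])
        population.append(child)
        history.append(child)
        # Regularization: discard the OLDEST model, not the worst one.
        population.popleft()
    return max(history, key=lambda m: m["accuracy"])

print(aging_evolution())
```

Because every model eventually dies, a lineage survives only by repeatedly producing children that train well from scratch, which is exactly the selection pressure for re-trainability described above.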
The state-of-the-art models evolved this way are nicknamed AmoebaNets, and they are among the latest results of our AutoML efforts. All these experiments took a lot of computation, and it is hoped that in the future such experiments will become commonplace. Here the aim was to provide a glimpse into that future.
Link: Regularized Evolution for Image Classifier Architecture Search