Neural Processes.
A Neural Network (NN) is a parameterized function that can be tuned via gradient descent to approximate a labelled collection of data with high precision. A Gaussian Process (GP), on the other hand, is a probabilistic model that is data-efficient and flexible: it defines a distribution over possible functions and is updated in light of data via the rules of probabilistic inference. GPs are, however, computationally intensive, which limits their applicability.
This paper introduces a class of neural latent variable models called Neural Processes (NPs).
Similar to GPs, NPs can:
- define distributions over functions,
- rapidly adapt to new observations,
- estimate the uncertainty in their predictions.
Like NNs, NPs:
- are computationally efficient during training and evaluation,
- learn to adapt their priors to data.
The paper demonstrates the performance of Neural Processes on a range of learning tasks, including regression and optimization.
Introduction:
Function approximation lies at the core of numerous problems in machine learning, and over the past decade deep neural networks have been an exceptionally popular approach for this purpose. At a high level, neural networks are black-box function approximators that learn to parameterize a single function from a large number of training data points.
As an alternative to using neural networks to carry out function regression, one can also perform inference on a stochastic process. The most common instantiation of this approach is the Gaussian Process (GP), a model with properties complementary to those of neural networks: GPs do not require a costly training phase and can carry out inference about the underlying ground-truth function conditioned on some observations. In addition, GPs represent infinitely many different functions at locations that have not been observed, thereby capturing the uncertainty over their predictions given some observations.
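For a rough sense of what this kind of inference looks like, the following is a minimal NumPy sketch of exact GP posterior prediction with a squared-exponential kernel. The kernel choice, length scale, noise level, and test function are assumptions made for illustration, not details from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel k(x, x') = exp(-|x - x'|^2 / (2 l^2)).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_context, y_context, x_target, noise=1e-3):
    """Exact GP posterior mean and variance at x_target, given context points."""
    K = rbf_kernel(x_context, x_context) + noise * np.eye(len(x_context))
    K_s = rbf_kernel(x_context, x_target)
    K_ss = rbf_kernel(x_target, x_target)
    K_inv = np.linalg.inv(K)  # cubic in the number of observations: the cost that limits GPs
    mean = K_s.T @ K_inv @ y_context
    cov = K_ss - K_s.T @ K_inv @ K_s
    return mean, np.diag(cov)

# Condition on a handful of observations of an unknown function (here: sin).
x_c = np.array([-1.0, 0.0, 1.5])
y_c = np.sin(x_c)
mu, var = gp_posterior(x_c, y_c, np.linspace(-2, 2, 50))
```

The predictive variance grows away from the context points, which is exactly the uncertainty behaviour NPs aim to reproduce without paying the cubic cost of the matrix inverse.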
However, GPs are computationally expensive. Furthermore, the available kernels are usually restricted in their functional form, and an additional optimization procedure is required to identify the most suitable kernel, as well as its hyperparameters, for any given task. As a result, there is growing interest in combining aspects of neural networks and inference on stochastic processes as a potential solution to some of the downsides of both. This paper introduces a neural network-based formulation that learns an approximation of a stochastic process, termed Neural Processes (NPs). NPs display some of the fundamental properties of GPs: they learn to model distributions over functions, are able to estimate the uncertainty over their predictions conditioned on context observations, and shift some of the workload from training to test time, which allows for model flexibility. Crucially, NPs generate predictions in a computationally efficient way. Furthermore, the model overcomes many functional design restrictions by learning an implicit kernel directly from the data.
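To make this structure concrete, here is a minimal PyTorch sketch of a generic Neural Process of the kind described above: an encoder maps each context pair (x, y) to a representation, a permutation-invariant mean aggregation produces a latent distribution over z, and a decoder maps a target input together with a sample of z to a predictive distribution over the output. The layer sizes, activations, and distribution choices are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class NeuralProcess(nn.Module):
    """Minimal Neural Process: encoder -> mean aggregation -> latent z -> decoder."""

    def __init__(self, x_dim=1, y_dim=1, r_dim=64, z_dim=32):
        super().__init__()
        # Encoder: one (x, y) context pair -> representation r_i.
        self.encoder = nn.Sequential(
            nn.Linear(x_dim + y_dim, r_dim), nn.ReLU(), nn.Linear(r_dim, r_dim))
        # Map the aggregated representation to the parameters of q(z | context).
        self.to_mu = nn.Linear(r_dim, z_dim)
        self.to_logvar = nn.Linear(r_dim, z_dim)
        # Decoder: (x_target, z) -> predictive mean and log-variance of y.
        self.decoder = nn.Sequential(
            nn.Linear(x_dim + z_dim, r_dim), nn.ReLU(), nn.Linear(r_dim, 2 * y_dim))

    def latent(self, x, y):
        # Encode every (x, y) pair, then average over points (order-invariant).
        r = self.encoder(torch.cat([x, y], dim=-1)).mean(dim=1)
        return torch.distributions.Normal(self.to_mu(r), torch.exp(0.5 * self.to_logvar(r)))

    def forward(self, x_context, y_context, x_target):
        # Test-time path: condition the latent on the context, then decode targets.
        q_context = self.latent(x_context, y_context)
        z = q_context.rsample()
        z = z.unsqueeze(1).expand(-1, x_target.size(1), -1)
        out = self.decoder(torch.cat([x_target, z], dim=-1))
        mu, logvar = out.chunk(2, dim=-1)
        return torch.distributions.Normal(mu, torch.exp(0.5 * logvar)), q_context
```

Because the latent is conditioned on whatever context points are available, the same trained model adapts its predictions (and its uncertainty) to new observations at test time without retraining.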
Their main contributions are:
- Introduction of Neural Processes, a class of models that combines the benefits of neural networks and stochastic processes.
- Comparison of NPs to related work in meta-learning, deep latent variable models and Gaussian processes. Since NPs are linked to many of these areas, they form a bridge for comparison between many related topics.
- Showcasing the benefits and abilities of NPs by applying them to a range of tasks including 1-D regression, real-world image completion, Bayesian optimization, and contextual bandits (a minimal training sketch for the 1-D regression case follows this list).
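As a hint of how the 1-D regression setup might look in practice, the sketch below trains the NeuralProcess module from the earlier snippet on randomly drawn sine curves, using the usual NP-style variational objective: a reconstruction term under the latent inferred from context and targets, plus a KL term keeping it close to the latent inferred from the context alone. The data distribution, batch sizes, and optimizer settings are assumptions made for the example.

```python
import torch
from torch.distributions import Normal, kl_divergence

model = NeuralProcess()                     # class defined in the earlier sketch
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(10_000):
    # Draw a batch of random sine functions and pick a random context subset.
    x = torch.rand(16, 20, 1) * 4 - 2
    y = torch.sin(3 * x + torch.rand(16, 1, 1) * 6)
    n_ctx = torch.randint(3, 10, ()).item()
    x_c, y_c, x_t, y_t = x[:, :n_ctx], y[:, :n_ctx], x, y

    q_ctx = model.latent(x_c, y_c)          # q(z | context)
    q_full = model.latent(x_t, y_t)         # q(z | context and targets)
    z = q_full.rsample().unsqueeze(1).expand(-1, x_t.size(1), -1)
    mu, logvar = model.decoder(torch.cat([x_t, z], dim=-1)).chunk(2, dim=-1)

    nll = -Normal(mu, torch.exp(0.5 * logvar)).log_prob(y_t).mean()
    kl = kl_divergence(q_full, q_ctx).mean()
    loss = nll + kl
    opt.zero_grad(); loss.backward(); opt.step()
```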
Related work:
- Conditional neural processes
- Gaussian processes
- Meta-learning
- Bayesian methods
- Conditional latent variable models
Conclusion:
In this paper, the authors introduce Neural Processes, a family of models that combine the benefits of stochastic processes and neural networks. NPs learn to represent distributions over functions and make flexible predictions at test time, conditioned on some context input. Instead of requiring a handcrafted kernel, NPs learn an implicit measure directly from the data. The authors then apply NPs to a range of regression tasks to showcase their flexibility. The goal of the paper is to introduce NPs and compare them to ongoing related research; as such, the tasks presented are diverse but relatively low-dimensional.