Merge branch 'main' of github.com:tonywu71/conditional-neural-processes
William-Baker committed Mar 27, 2023
2 parents a8db14e + 62a4ee7 commit 2d10e14
Showing 10 changed files with 483 additions and 341 deletions.
24 changes: 20 additions & 4 deletions README.md
@@ -1,5 +1,7 @@
# Neural Processes
-Replication of the *Conditional Neural Processes* paper by Marta Garnelo et al. Conditional Neural Processes, 2018 [[arXiv](https://arxiv.org/abs/1807.01613)] and the *Neural Processes* paper by Marta Garnelo et al. Neural Processes, 2018 [[arXiv](https://arxiv.org/abs/1807.01622)]. The model and data pipeline were implemented using Tensorflow 2.10.
+Replication of the *Conditional Neural Processes* paper by Marta Garnelo et al., 2018 [[arXiv](https://arxiv.org/abs/1807.01613)] and the *Neural Processes* paper by Marta Garnelo et al., 2018 [[arXiv](https://arxiv.org/abs/1807.01622)]. The model and data pipeline were implemented using TensorFlow 2.10.

Code released to complement the [report](https://github.com/tonywu71/conditional-neural-processes/blob/b90eb9ecf18cebaa7bcae8dcbf0bc573f987b112/report/MLMI4_CNP_LNP_report.pdf) and the [poster](https://github.com/tonywu71/conditional-neural-processes/blob/b90eb9ecf18cebaa7bcae8dcbf0bc573f987b112/report/MLMI4_CNP_LNP_poster.pdf).



@@ -11,9 +13,9 @@ William Baker, Alexandra Shaw, Tony Wu

## 1. Introduction

-While neural networks excel at function approximation, Gaussian Processes (GPs) addresses different challenges such as continuous learning, uncertainty prediction and the ability to deal with data scarcity. Therefore, each model is only suited for a restricted spectrum of tasks that strongly depends on the nature of available data.
+While neural networks excel at function approximation, Gaussian Processes (GPs) address different challenges such as uncertainty prediction, continuous learning, and the ability to deal with data scarcity. Therefore, each model is only suited for a restricted spectrum of tasks that strongly depends on the nature of available data.

-We will investigate Conditional Neural Processes (CNPs) and Latent Neural Processes (LNPs). These 2 types of model are part of the Neural Processes family (NPs). NPs are based on the idea of treating functions as random variables and using a neural network to encode the distribution over functions. This allows for efficient inference and scalability to large datasets. The performance on these models will be evaluated on 1D-regression and image completion to demonstrate how they learn distributions over complex functions.
+Neural Processes use neural networks to encode distributions over functions, approximating the distributions over functions given by stochastic processes such as GPs. This allows for efficient inference and scalability to large datasets. The performance of these models will be evaluated on 1D regression and image completion to demonstrate visually how they learn distributions over complex functions.
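
To make the CNP idea concrete, here is a minimal sketch (with placeholder layer sizes, not the exact architecture used in this repository): each `(x, y)` context pair is encoded, the encodings are mean-pooled into a permutation-invariant representation, and each target input is decoded into a predictive mean and standard deviation.

```python
import tensorflow as tf

batch_size, num_context, num_target = 8, 5, 20

# Toy stand-ins for a 1D-regression batch:
context_x = tf.random.normal((batch_size, num_context, 1))
context_y = tf.random.normal((batch_size, num_context, 1))
target_x = tf.random.normal((batch_size, num_target, 1))

encoder = tf.keras.Sequential([tf.keras.layers.Dense(128, activation="relu"),
                               tf.keras.layers.Dense(128)])
decoder = tf.keras.Sequential([tf.keras.layers.Dense(128, activation="relu"),
                               tf.keras.layers.Dense(2)])  # -> mean and raw std

# Encode each (x, y) context pair, then aggregate over the context set:
r_i = encoder(tf.concat([context_x, context_y], axis=-1))  # (batch, n_ctx, 128)
r = tf.reduce_mean(r_i, axis=1, keepdims=True)             # (batch, 1, 128)

# Decode every target input conditioned on the shared representation r:
r_tiled = tf.tile(r, [1, num_target, 1])                   # (batch, n_tgt, 128)
mu, raw_sigma = tf.split(decoder(tf.concat([target_x, r_tiled], axis=-1)),
                         num_or_size_splits=2, axis=-1)
sigma = 0.1 + 0.9 * tf.nn.softplus(raw_sigma)              # positive predictive std
```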



@@ -104,10 +106,24 @@ python train.py --task celeb

![celebA-image_completion](figs/4-experiments/celebA-image_completion.jpg)

-<p align = "center"> <b>Figure 6: : CNP pixel mean and variance predictions on images from CelebA</b></p>
+<p align = "center"> <b>Figure 6: CNP pixel mean and variance predictions on images from CelebA</b></p>



### 4.4. Extension: HNP and HNPC

**Objective:** Combine the deterministic link between the context representations (used by CNP) with the stochastic link through the latent representation space (used by LNP) to produce a model with a richer embedding space.
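
As a rough illustration (a minimal sketch under our own naming, not the repository's actual HNP/HNPC code), the decoder of such a hybrid model conditions on the concatenation of a CNP-style deterministic representation and a sample from an LNP-style latent distribution:

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

batch_size, num_context, d = 8, 5, 32

# Stand-in for per-point context encodings produced by a shared encoder MLP:
per_point_encodings = tf.random.normal((batch_size, num_context, d))

# Permutation-invariant aggregation over the context set:
encoding = tf.reduce_mean(per_point_encodings, axis=1)            # (batch, d)

# Deterministic link (as in CNP):
r = tf.keras.layers.Dense(64)(encoding)                           # (batch, 64)

# Stochastic link (as in LNP): parameterise q(z | context) and sample z:
mu = tf.keras.layers.Dense(32)(encoding)
sigma = 0.1 + 0.9 * tf.sigmoid(tf.keras.layers.Dense(32)(encoding))
z = tfd.Normal(loc=mu, scale=sigma).sample()                      # (batch, 32)

# Hybrid representation: the decoder conditions on both links at once.
representation = tf.concat([r, z], axis=-1)                       # (batch, 96)
```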

![extension-hnp_hnpc](figs/4-experiments/extension-hnp_hnpc.png)

<p align = "center"> <b>Figure 7: Latent Variable Distribution - Mean and Standard Deviation Statistics during training.</b></p>



## 5. Appendix

To go further, read the poster and the report that can be found in the `report` folder of this repository.

<img src="report/poster-thumbnail.jpg" alt="poster-thumbnail" style="zoom: 33%;" />

<p align = "center"> <b>Figure 8: Miniature of the CNP/LNP poster</b></p>
@@ -0,0 +1,191 @@
from functools import partial
from typing import Optional, Tuple, Callable, Iterator

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

from dataloader.dataloader_for_plotting.regression_data_generator_base import RegressionDataGeneratorBase


def gen_from_arbitrary_gp(
batch_size: int,
iterations: int,
min_kernel_length_scale: float,
max_kernel_length_scale: float,
min_num_context: int,
max_num_context: int,
min_num_target: int,
max_num_target: int,
min_x_val_uniform: float,
max_x_val_uniform: float,
testing: bool):
"""Generates a batch of data for regression based on the original Conditional Neural Processes paper.
Note that the data is generated batch-wise.
During training and for each batch:
- Both num_context and num_target are drawn from uniform distributions
- The (num_context + num_target) x_values are drawn from a uniform distribution
- A Gaussian Process with predefined kernel and a null mean function is used to generate the y_values from the x_values
"""

for _ in range(iterations):
        # NB: The distribution of y_values is the same for each iteration (i.e. the one defined by
        # the arbitrary GP) but the sampled x_values do differ (in terms of size and values).
num_context = tf.random.uniform(shape=[],
minval=min_num_context,
maxval=max_num_context,
dtype=tf.int32)

if not testing:
num_target = tf.random.uniform(shape=[],
minval=min_num_target,
maxval=max_num_target,
dtype=tf.int32)
else:
# If testing, we want to use a fixed number of points for the target
num_target = max_num_target - 1 # -1 because max_num_target is exclusive

num_total_points = num_context + num_target

x_values = tf.random.uniform(shape=(batch_size, num_total_points, 1),
minval=min_x_val_uniform, # type: ignore
maxval=max_x_val_uniform)


# Set kernel length scale:
l1 = tf.random.uniform(shape=[],
minval=min_kernel_length_scale, # type: ignore
maxval=max_kernel_length_scale,
dtype=tf.dtypes.float32)

l2 = tf.random.uniform(shape=[],
minval=min_kernel_length_scale, # type: ignore
maxval=max_kernel_length_scale,
dtype=tf.dtypes.float32)


# Varying kernel:
kernel_1 = tfp.math.psd_kernels.ExponentiatedQuadratic(length_scale=l1)
kernel_2 = tfp.math.psd_kernels.ExponentiatedQuadratic(length_scale=l2)

n_samples_1 = tf.random.uniform(shape=[], minval=2, maxval=num_total_points-1, dtype=tf.int32) # both splits will have at least one sample

# Sort x_values:
x_values = tf.sort(x_values, axis=1)

# Split x_values into two parts such that the first part has n_samples_1 points:
x_values_1 = x_values[:, :n_samples_1, :]
x_values_2 = x_values[:, n_samples_1:, :]


        gp_1 = tfd.GaussianProcess(kernel_1, index_points=x_values_1, jitter=1.0e-4)
        y_values_1 = tf.expand_dims(gp_1.sample(), axis=-1)

        # Condition the second GP on the last point of the first split so that the
        # two curve segments join continuously at the split boundary:
        gp_2 = tfd.GaussianProcessRegressionModel(
            kernel=kernel_2,
            index_points=x_values_2,
            observation_index_points=x_values_1[:, -1:, :],
            observations=y_values_1[:, -1:, 0],
            observation_noise_variance=1.0e-4)

        y_values_2 = tf.expand_dims(gp_2.sample(), axis=-1)

y_values = tf.concat([y_values_1, y_values_2], axis=1)

idx = tf.random.shuffle(tf.range(num_total_points))

# Select the targets which will consist of the context points
# as well as some new target points
target_x = x_values[:, :, :]
target_y = y_values[:, :, :] # type: ignore

# Select the observations
context_x = tf.gather(x_values, indices=idx[:num_context], axis=1)
context_y = tf.gather(y_values, indices=idx[:num_context], axis=1)

        # Skip degenerate batches whose context/target shapes are inconsistent:
        if tf.reduce_any(tf.shape(context_x) != tf.shape(context_y)):
            continue
        if tf.reduce_any(tf.shape(target_x) != tf.shape(target_y)):
            continue
        if tf.shape(context_x)[-1] != tf.shape(target_x)[-1]:
            continue

yield (context_x, context_y, target_x), target_y, l1, l2


class RegressionDataGeneratorArbitraryGPWithVaryingKernel(RegressionDataGeneratorBase):
"""Class that generates a batch of data for regression based on
the original Conditional Neural Processes paper."""
def __init__(self,
iterations: int=250,
batch_size: int=32,
min_num_context: int=3,
max_num_context: int=10,
min_num_target: int=2,
max_num_target: int=10,
min_x_val_uniform: int=-2,
max_x_val_uniform: int=2,
n_iterations_test: Optional[int]=None,
min_kernel_length_scale: float=0.1,
max_kernel_length_scale: float=1.):
super().__init__(iterations=iterations,
batch_size=batch_size,
min_num_context=min_num_context,
max_num_context=max_num_context,
min_num_target=min_num_target,
max_num_target=max_num_target,
min_x_val_uniform=min_x_val_uniform,
max_x_val_uniform=max_x_val_uniform,
n_iterations_test=n_iterations_test)

self.min_kernel_length_scale = min_kernel_length_scale
self.max_kernel_length_scale = max_kernel_length_scale

self.train_ds, self.test_ds = self.load_regression_data()


def get_gp_curve_generator(self, testing: bool=False) -> Callable:
"""Returns a function that generates a batch of data for regression based on
the original Conditional Neural Processes paper."""
return partial(gen_from_arbitrary_gp,
batch_size=self.batch_size,
iterations=self.iterations,
min_kernel_length_scale=self.min_kernel_length_scale,
max_kernel_length_scale=self.max_kernel_length_scale,
min_num_context=self.min_num_context,
max_num_context=self.max_num_context,
min_num_target=self.min_num_target,
max_num_target=self.max_num_target,
min_x_val_uniform=self.min_x_val_uniform,
max_x_val_uniform=self.max_x_val_uniform,
testing=testing)


def draw_single_example_from_arbitrary_gp(min_kernel_length_scale, max_kernel_length_scale, num_context, num_target):
data_generator = RegressionDataGeneratorArbitraryGPWithVaryingKernel(
iterations=1,
n_iterations_test=1,
batch_size=1,
min_num_context=num_context-1,
max_num_context=num_context,
min_num_target=num_target-1,
max_num_target=num_target,
min_x_val_uniform=-2,
max_x_val_uniform=2,
min_kernel_length_scale=min_kernel_length_scale,
max_kernel_length_scale=max_kernel_length_scale
)

train_ds, test_ds = data_generator.load_regression_data()
(context_x, context_y, target_x), target_y, l1, l2 = next(iter(test_ds))

context_x = tf.squeeze(context_x, axis=0)
context_y = tf.squeeze(context_y, axis=0)
target_x = tf.squeeze(target_x, axis=0)
target_y = tf.squeeze(target_y, axis=0)

return (context_x, context_y, target_x), target_y, l1, l2
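

# Hedged usage sketch (illustrative, not part of the original module); the
# argument values below are arbitrary:
if __name__ == "__main__":
    (context_x, context_y, target_x), target_y, l1, l2 = draw_single_example_from_arbitrary_gp(
        min_kernel_length_scale=0.1,
        max_kernel_length_scale=1.0,
        num_context=5,
        num_target=10)
    print(context_x.shape, target_x.shape, float(l1), float(l2))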
@@ -0,0 +1,98 @@
from typing import Callable, Optional, Tuple
from abc import ABC, abstractmethod

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

import matplotlib.pyplot as plt


class RegressionDataGeneratorBase(ABC):
"""Abstract base class for regression data generators."""
def __init__(self,
iterations: int,
batch_size: int,
min_num_context: int,
max_num_context: int,
min_num_target: int,
max_num_target: int,
min_x_val_uniform: int,
max_x_val_uniform: int,
n_iterations_test: Optional[int]=None):
self.iterations = iterations
self.batch_size = batch_size

assert min_num_context < max_num_context, "min_num_context must be smaller than max_num_context"
self.min_num_context = min_num_context
self.max_num_context = max_num_context

assert min_num_target < max_num_target, "min_num_target must be smaller than max_num_target"
self.min_num_target = min_num_target
self.max_num_target = max_num_target

        assert min_x_val_uniform < max_x_val_uniform, "min_x_val_uniform must be smaller than max_x_val_uniform"
self.min_x_val_uniform = min_x_val_uniform
self.max_x_val_uniform = max_x_val_uniform

if n_iterations_test is None:
self.n_iterations_test = self.iterations // 10
else:
self.n_iterations_test = n_iterations_test

# The following attributes will be set when calling load_regression_data() from
# the child class:
        self.train_ds: Optional[tf.data.Dataset] = None
        self.test_ds: Optional[tf.data.Dataset] = None


@abstractmethod
def get_gp_curve_generator(self, testing: bool=False) -> Callable:
"""Returns a generator function that generates regression data from a Gaussian Process."""
pass


def load_regression_data(self) -> Tuple[tf.data.Dataset, tf.data.Dataset]:
"""Returns a tuple of training and test datasets."""
train_ds = tf.data.Dataset.from_generator(
self.get_gp_curve_generator(testing=False),
output_types=((tf.float32, tf.float32, tf.float32), tf.float32, tf.float32, tf.float32)
)
test_ds = tf.data.Dataset.from_generator(
self.get_gp_curve_generator(testing=True),
output_types=((tf.float32, tf.float32, tf.float32), tf.float32, tf.float32, tf.float32)
)

train_ds = train_ds.prefetch(tf.data.experimental.AUTOTUNE) # No need to shuffle as the data is already generated randomly
test_ds = test_ds.prefetch(tf.data.experimental.AUTOTUNE)

return train_ds, test_ds


@staticmethod
def plot_first_elt_of_batch(context_x, context_y, target_x, target_y,
ax: Optional[plt.Axes]=None,
figsize=(8, 5)):
"""Plot the first element of a batch."""

if ax is None:
fig, ax = plt.subplots(figsize=figsize)

context_x = context_x.numpy()
context_y = context_y.numpy()
target_x = target_x.numpy()
target_y = target_y.numpy()

ax.scatter(target_x[0, :, 0], target_y[0, :, 0], c="blue", label='Target')
ax.scatter(context_x[0, :, 0], context_y[0, :, 0], marker="x", c="red", label='Observations')
ax.legend()

return ax


def plot_first_elt_of_random_batch(self, figsize=(8, 5)):
"""Plot a random batch from the training set."""
(context_x, context_y, target_x), target_y, l1, l2 = next(iter(self.train_ds.take(1)))
ax = RegressionDataGeneratorBase.plot_first_elt_of_batch(context_x, context_y, target_x, target_y, figsize=figsize)
return ax
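

# Hedged usage sketch (illustrative, not part of the original module): a concrete
# subclass such as RegressionDataGeneratorArbitraryGPWithVaryingKernel can
# visualise one training batch like this:
#
#   gen = RegressionDataGeneratorArbitraryGPWithVaryingKernel(iterations=10)
#   ax = gen.plot_first_elt_of_random_batch()
#   plt.show()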
Binary file added figs/4-experiments/extension-hnp_hnpc.png
