Example with one dataset

Here I will show you a quick example of what gpyrn can do. Let us imagine you are in IPython.

To start, let's import everything we will need, namely numpy, matplotlib, and of course our gpyrn:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
from gpyrn import meanfield, covfunc, meanfunc

Now we need some data to analyze. I will not do anything fancy; a simple sine wave and some random errors will do:

time = np.linspace(0, 100, 25)     # 25 observations over 100 days
y1 = 20*np.sin(2*np.pi*time / 31)  # sine wave with a 31-day period and amplitude 20
y1err = np.random.rand(25)         # random error bars
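
Note that y1err only sets the size of the error bars; the points themselves lie exactly on the sine wave. If you also want the measurements to be scattered (an optional tweak, not part of the original example), you could perturb them by their error bars:

y1 += y1err * np.random.randn(25)  # optional: scatter the points by their error bars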

If you plot it

plt.figure()
plt.errorbar(time, y1, y1err, fmt='ob', markersize=7, label='y1')
plt.xlabel('Time (days)')
plt.ylabel('Measurements')
plt.grid(which='major', alpha=0.5)
plt.show()

It should look like this:

[Figure: error-bar plot of the simulated measurements]

Now let us create a Gaussian process regression network (gprn). A gprn does not work like a normal Gaussian process (gp): in a gprn the posterior is not analytically tractable and needs to be approximated. To achieve this we use what is known as mean-field variational inference, which is explained in more detail in the paper.
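
For reference, the generative model behind a GPRN (following Wilson et al. 2012; the notation here is mine, not gpyrn's) can be sketched as

y_i(t) = Σ_j W_ij(t) f_j(t) + ε_i(t)

where the f_j are the node gps, the W_ij are the weight gps, and ε_i is the noise. To keep it simple, you create the gprn with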

gprn = meanfield.inference(1, time, y1, y1err)

The 1 stands for the number of nodes we will use, in this case just one. Speaking of nodes, we should now create our nodes, weights, means and jitters:

nodes = [covfunc.Periodic(15, 31, 0.5)]      # node: gp with a periodic kernel
weight = [covfunc.SquaredExponential(1, 1)]  # weight: gp with a squared exponential kernel
means = [meanfunc.Constant(0)]               # constant mean function
jitter = [0.5]                               # white noise (jitter) term

This simply means that our node is going to be a gp with a periodic kernel, and our weight will be a gp with a squared exponential kernel. Visually, you can think of the gprn in this example as connecting everything like this:

[Figure: diagram of the gprn, with the node connected to the dataset through the weight]

We also need a mean function, which I defined as a constant, and a jitter (also known as white noise), to which I gave the value of 0.5. In the documentation of each kernel you can find the meaning of the parameters I just gave; notice that the 31 in the periodic kernel matches the period of the sine wave we generated.

Now that we have everything defined, how can we know whether these nodes and weights are any good for our model? Again, a gprn does not work like a gp, so we don't have a log-likelihood. Instead we have an evidence lower bound (ELBO).
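
For reference (this is standard variational inference, not specific to gpyrn), for an approximation q to the posterior over the parameters θ the ELBO can be written as

ELBO(q) = E_q[log p(y, θ)] - E_q[log q(θ)] = log p(y) - KL(q || p(θ|y))

so maximising it tightens a lower bound on the evidence while pulling q towards the true posterior. We compute it with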

elbo, m, v = gprn.ELBOcalc(nodes, weight, means, jitter, iterations=5000, mu='init', var='init')
print('ELBO =', elbo)

For me, it gave a value of

ELBO = -139.88937647084526

That is nice, but now what? Is it any good? Can we do better? To answer this, let us compare it with the ELBO of a new set of nodes, weights, etc., and recalculate:

nodes = [covfunc.Periodic(15, 31, 0.5)]
weight = [covfunc.SquaredExponential(1, 100)]
means = [meanfunc.Constant(0)]
jitter = [0.5]
elbo, m, v = gprn.ELBOcalc(nodes, weight, means, jitter, iterations=5000, mu='init', var='init')
print('ELBO =', elbo)

The result was

ELBO = -75.57901066842231

Simply put, the higher the ELBO the better, and with this new configuration we clearly have a higher ELBO. If you are used to working with gp regression you probably know why; I will not go into more detail in this example.
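
If you want to try more configurations systematically, here is a minimal sketch of such a comparison, assuming the same ELBOcalc signature used above and keeping whichever weight scores the highest ELBO:

best = None
for scale in (1, 10, 100):
    weight = [covfunc.SquaredExponential(1, scale)]
    elbo, m, v = gprn.ELBOcalc(nodes, weight, means, jitter,
                               iterations=5000, mu='init', var='init')
    print('SquaredExponential(1, {}): ELBO = {}'.format(scale, elbo))
    if best is None or elbo > best[0]:
        best = (elbo, m, v, weight)  # keep the solution with the highest ELBO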


Now, to end this first example, let us see how this second model fits our data. We predict it using

tstar = np.linspace(time.min(), time.max(), 1000)  # fine grid of times for the prediction
mean, _, _ = gprn.Prediction(nodes, weight, means, jitter, tstar, m, v)  # predictive mean (other outputs discarded)
    
plt.figure()
plt.errorbar(time, y1, y1err, fmt='ob', markersize=7, label='data')
plt.plot(tstar, mean[0], '--k', linewidth=2, label='predictive')
plt.xlabel('Time (days)')
plt.ylabel('Measurements')
plt.legend(loc='upper right', facecolor='white', framealpha=1, edgecolor='black')
plt.grid(which='major', alpha=0.5)
plt.show()

The end result should be something like this:

[Figure: the data with the dashed predictive mean overplotted]

It is a very nice fit, may I add. I think that with this simple example you now know the basics of using gpyrn.

I will keep adding more examples in the future, but for now that is all.