Example with one dataset
Now I will show you a quick example of what gpyrn can do. Let us imagine you are in IPython. To start, let's import everything we will need: numpy, matplotlib, and of course our gpyrn:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
from gpyrn import meanfield, covfunc, meanfunc
Now we need some data to analyze. I will not do anything fancy; a simple sine wave and some random errors will do:
time = np.linspace(0, 100, 25)        # 25 observations over 100 days
y1 = 20*np.sin(2*np.pi*time / 31)     # sine wave with amplitude 20 and period 31 days
y1err = np.random.rand(25)            # random error bars
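Note that the error bars are drawn at random, so the exact ELBO values you get further down will differ slightly from mine. If you want reproducible numbers, you can seed NumPy before generating the data, for example:

np.random.seed(42)  # optional: fixes the random error bars (run this before np.random.rand)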
If you plot it:
plt.figure()
plt.errorbar(time, y1, y1err, fmt='ob', markersize=7, label='y1')
plt.xlabel('Time (days)')
plt.ylabel('Measurements')
plt.grid(which='major', alpha=0.5)
plt.show()
It should look like this:
Now let us create a Gaussian process regression network (GPRN). A GPRN does not work like a normal Gaussian process (GP): in a GPRN the posterior is not analytically tractable and needs to be approximated. To achieve this we use what is known as mean-field variational inference to approximate the posterior. It is explained in more detail in the paper; to keep it simple here, you create the GPRN with
gprn = meanfield.inference(1, time, y1, y1err)
The 1 stands for the number of nodes we will use, in this case just one. Speaking of nodes, we should now create our nodes, weights, means, and jitters:
nodes = [covfunc.Periodic(15, 31, 0.5)]       # periodic kernel: amplitude, period (31 days, like our sine), lengthscale
weight = [covfunc.SquaredExponential(1, 1)]   # squared exponential kernel: amplitude, lengthscale
means = [meanfunc.Constant(0)]                # constant mean function fixed at 0
jitter = [0.5]                                # white noise term
This simply means that our node is going to be a GP with a periodic kernel, and our weight will be a GP with a squared exponential kernel. Visually, you can think of the GPRN in this example as connecting everything like this:
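In case the diagram does not render here, in equations this single-node, single-weight network roughly amounts to (my summary of the GPRN model, not gpyrn's exact notation):

$$ y(t) = w(t)\, f(t) + \text{mean} + \text{noise}, $$

where $f$ is the node GP (periodic kernel) and $w$ is the weight GP (squared exponential kernel) that modulates it.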
We also need a mean function, which I defined as a constant, and a jitter term (also known as white noise), to which I gave the value of 0.5. In the documentation of each kernel you can find the meaning of the parameters I just gave.
Now that we have everything defined, how can we know whether these nodes and weights are good for our model? Again, a GPRN does not work like a GP and we don't have a log-likelihood. Instead we have an evidence lower bound (ELBO):
elbo, m, v = gprn.ELBOcalc(nodes, weight, means, jitter, iterations=5000, mu='init', var='init')
print('ELBO =', elbo)
For me it gave the value of
ELBO = -139.88937647084526
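In case you wonder what this number is: the ELBO is the standard variational lower bound on the log marginal likelihood, and "mean-field" means the approximate posterior factorizes over the nodes and weights. Schematically (my summary, not gpyrn's exact notation):

$$ \log p(y) \geq \mathrm{ELBO} = \mathbb{E}_{q}\left[\log p(y \mid f, W)\right] - \mathrm{KL}\left(q(f)\,q(W) \,\|\, p(f)\,p(W)\right) $$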
That is nice, but now what? Is it any good? Can we do better? To answer this, let us compare it with the ELBO of a new set of nodes, weights, etc., and recalculate the ELBO:
nodes = [covfunc.Periodic(15, 31, 0.5)]
weight = [covfunc.SquaredExponential(1, 100)]
means = [meanfunc.Constant(0)]
jitter = [0.5]
elbo, m, v = gprn.ELBOcalc(nodes, weight, means, jitter, iterations=5000, mu='init', var='init')
print('ELBO =', elbo)
The result was
ELBO = -75.57901066842231
Simply put, the higher the ELBO the better, and with this new configuration we clearly have a higher ELBO. If you are used to working with GP regression you probably know why; unfortunately, in this example I will not explain it any further.
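If you want to compare several configurations more systematically, a simple loop does the trick. This is just a sketch, assuming (as in the example above) that the second argument of SquaredExponential is the parameter we are varying:

for ell in [1, 10, 100]:
    weight = [covfunc.SquaredExponential(1, ell)]
    elbo, m, v = gprn.ELBOcalc(nodes, weight, means, jitter, iterations=5000, mu='init', var='init')
    print('SquaredExponential(1, {}): ELBO = {}'.format(ell, elbo))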
Now, to end this first example, let us see how this second model fits our data. We obtain the predictions using
tstar = np.linspace(time.min(), time.max(), 1000)
mean, _, _ = gprn.Prediction(nodes, weight, means, jitter, tstar, m, v)
plt.figure()
plt.errorbar(time, y1, y1err, fmt='ob', markersize=7, label='data')
plt.plot(tstar, mean[0], '--k', linewidth=2, label='predictive')
plt.xlabel('Time (days)')
plt.ylabel('Measurements')
plt.legend(loc='upper right', facecolor='white', framealpha=1, edgecolor='black')
plt.grid(which='major', alpha=0.5)
plt.show()
The end result should be something like this:
It is a very nice fit, may I add. I think that with this simple example you now know the basics of using gpyrn. I will keep adding more examples in the future, but for now that is all.