
Consider interpolation HDI calculations #2168

Open
tomicapretto opened this issue Nov 25, 2022 · 12 comments

Comments

@tomicapretto
Contributor

Tell us about it

In Bambi we have a function called plot_cap that is used to obtain visualizations of the fitted curve. We overlay a credible interval so users can visualize the uncertainty around the mean estimate. Internally, we're using az.hdi() to obtain the bounds. Today, I was implementing some improvements and found that the plots look quite noisy. See the following examples.

import arviz as az
import bambi as bmb
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
print(az.__version__)
# 0.14.0

The following is Bambi-specific code; it's not that important for what I want to show:

data = pd.read_csv("https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv")
model = bmb.Model("mpg ~ 1 + hp", data)
idata = model.fit(random_seed=1234)

# Obtain predictions
new_data = pd.DataFrame({"hp": np.linspace(50, 320, 200)})
idata = model.predict(idata, data=new_data, inplace=False)
y_hat = idata.posterior["mpg_mean"]

Get the bands using az.hdi()

y_hat_bounds = az.hdi(y_hat, 0.94)["mpg_mean"].T.to_numpy()
fig, ax = plt.subplots(figsize=(7, 5), dpi=120)
ax.fill_between(new_data["hp"], y_hat_bounds[0], y_hat_bounds[1], alpha=0.5);

[image: band computed with az.hdi(), showing a noisy, jagged outline]

Get the bands using .quantile() on the DataArray, which calls np.quantile under the hood (if I understood correctly)

y_hat_bounds = y_hat.quantile(q=(0.03, 0.97), dim=("chain", "draw"))
fig, ax = plt.subplots(figsize=(7, 5), dpi=120)
ax.fill_between(new_data["hp"], y_hat_bounds[0], y_hat_bounds[1], alpha=0.5);

[image: band computed with .quantile(), showing a smooth outline]

Thoughts on implementation

I'm not aware of the historical details that led to the current implementation of az.hdi(). But I think it's worth considering other alternatives since the current behavior returns very noisy results. I have other examples where it looks even worse, for example here

[image: another example where the az.hdi() band looks even noisier]

Tagging @aloctavodia because we talked about this via chat

@aloctavodia
Contributor

If I understood correctly, the computation of the quantile function by xarray involves an interpolation before returning the values.

The az.hdi function, instead, doesn't interpolate. It just returns the raw values, which generally look very noisy when plotted. In ArviZ we solve this problem with az.plot_hdi, which has a smooth argument that is True by default.

We could use az.plot_hdi inside Bambi, but that would require changing the logic of the plot_cap function.

One alternative would be to implement an hdi function that behaves similarly to the one from xarray, that is, one that returns interpolated values.
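For illustration, the kind of post-hoc smoothing a plotting function can apply to a jagged bound can be sketched with a Savitzky-Golay filter over made-up data (a rough sketch of the idea only, not ArviZ's exact implementation):

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
# a raw HDI bound: an underlying smooth curve plus sample-to-sample jumps
raw_upper = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.size)

# smooth the jagged bound before plotting it
smooth_upper = savgol_filter(raw_upper, window_length=55, polyorder=2)
```

Note this operates on the already-computed bounds rather than on the draws themselves, which is why it fixes the look of the band without changing how the interval is estimated.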

@tomicapretto
Contributor Author

tomicapretto commented Nov 25, 2022

I think trying to do something similar to xarray is a good idea. I remember one example where I wanted to plot the mean and the HDI using az.plot_hdi(), and the mean line would cross, or fall too close to, the boundary of the HDI (which didn't make sense in that example). I don't remember where or when that happened, but the problem was the smoothing that az.plot_hdi() applies.

EDIT I found the example I had in mind

[image: comparison of the two approaches]

Left: Using .quantile(). Right: Using az.plot_hdi().

The code is here https://github.com/bambinos/unfinished-ideas/blob/main/splines.ipynb

@OriolAbril
Member

OriolAbril commented Dec 18, 2022

Adding some of these interpolation methods to hdi sounds like a good idea. As far as I understand, it is quite different from the current smoothing in plot_hdi, at least conceptually. The interpolation in numpy's quantile works along the dimension that is reduced, so that if ary[30] and ary[31] are the closest to the desired quantile, the returned value is not necessarily one of those two but something in between. The hdi in ArviZ returns exactly one of the values in the provided samples, so a small change can mean a jump between ary[30] and ary[31], which results in this strange look. I am not sure how well these interpolation methods would behave for hdi though; do we know of any references on this?
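To make the contrast concrete, here is a toy comparison. The hdi below is a minimal, hypothetical implementation of the shortest-interval search written for illustration, not ArviZ's actual code:

```python
import numpy as np

rng = np.random.default_rng(1)
draws = rng.normal(size=200)

# np.quantile interpolates between neighboring order statistics,
# so the returned value is generally not one of the draws
q = np.quantile(draws, 0.97)

def hdi(samples, prob=0.94):
    """Minimal shortest-interval HDI: endpoints are always raw samples."""
    samples = np.sort(samples)
    n = len(samples)
    k = int(np.floor(prob * n))
    widths = samples[k:] - samples[: n - k]
    i = int(np.argmin(widths))
    return samples[i], samples[i + k]

lo, hi = hdi(draws)
print(lo in draws, hi in draws)  # True True: endpoints are exact samples
```

Because the endpoints can only ever be existing samples, a tiny change in the draws can make each endpoint jump to a neighboring order statistic, which is the step-like look in the plots above.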

Note, I think using quantile for hdi would be very misleading, so I propose changing the name of the issue to "Consider interpolation HDI calculations" so that we add something similar to this method argument in numpy (that can also be used through xarray).
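For reference, numpy already exposes both behaviours through the method argument (named interpolation before NumPy 1.22):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])

# "linear" (the default) interpolates between order statistics
print(np.quantile(a, 0.4, method="linear"))   # 2.2

# "nearest" returns an actual element of the sample
print(np.quantile(a, 0.4, method="nearest"))  # 2.0
```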

@tomicapretto
Contributor Author

@OriolAbril I think I understand why using quantiles would be misleading for HDIs ([P2.5%, P97.5%] doesn't necessarily give a 95% HDI), but I don't understand the name suggestion.

@OriolAbril
Member

I don't see the current issue title ("Consider using np.quantile for HDI") as viable, and instead see two underlying issues that need addressing.

The first is being able to use any function to generate intervals or bands. This is a known issue and we are working on it, but IMO it requires refactoring the plots module. I have started some experiments at https://xrtist.readthedocs.io/en/latest/ for example.

The second is stabilising our current hdi approach in a manner similar to what np.quantile does. I hadn't really realized that the instability that comes from returning existing samples can actually be fixed with these interpolation methods. I don't see much value in focusing this issue on the first point, but I think it would be very helpful to focus on the second.

@tomicapretto tomicapretto changed the title Consider np.quantile for HDI calculations Consider interpolation HDI calculations Dec 21, 2022
@OriolAbril
Member

@tomicapretto #2207 (comment) might help clarify a bit the different interpolation options

@hadjipantelis

hadjipantelis commented Apr 3, 2023

Should the quantiles be used directly, though, or should we have something like:

from scipy import stats

# parametric bounds: posterior mean plus t-quantiles times the posterior sd
t_bounds = np.array(stats.t.interval(0.95, int(y_hat.draw.max())))
y_hat_bounds = (
    y_hat.mean(dim=("chain", "draw")).to_numpy()[np.newaxis, :]
    + t_bounds[:, np.newaxis] * y_hat.std(dim=("chain", "draw")).to_numpy()[np.newaxis, :]
)
ax.fill_between(new_data["hp"], y_hat_bounds[0], y_hat_bounds[1], alpha=0.125, color="red")

@OriolAbril
Member

I wouldn't recommend doing this. With MCMC you are getting samples from the whole posterior distribution (or the posterior predictive, with the extra forward sampling step). Both the quantile computation in numpy and the hdi computation in ArviZ are non-parametric, because we never know the actual distribution of these posterior samples.

Computing the percentiles from the student-t distribution assumes the posterior follows a student-t distribution, which will not be the case in the vast majority of situations (even if the likelihood were a student-t distribution). Therefore, I wouldn't recommend that approach, especially not combined with MCMC. It could be useful when working with an optimizer (so you only get the MAP estimates of the student-t parameters, not draws from the whole distribution) or when working with conjugate distributions.
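As a quick illustration of how the parametric shortcut can go wrong, consider a skewed, strictly positive "posterior" (hypothetical data, for illustration only):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
draws = rng.gamma(2.0, size=5_000)  # skewed, strictly positive samples

# non-parametric 95% interval taken from the samples themselves
lo_np, hi_np = np.quantile(draws, [0.025, 0.975])

# parametric interval assuming a symmetric student-t around the mean
t_lo, t_hi = stats.t.interval(0.95, df=draws.size - 1)
lo_t = draws.mean() + t_lo * draws.std()
hi_t = draws.mean() + t_hi * draws.std()

# the symmetric interval extends below zero even though the
# distribution has no mass there
print(lo_np > 0, lo_t < 0)  # True True
```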

@hadjipantelis

Cool, all reasonable points, thank you. (I am not an experienced Bayesian statistics user.)

So where do you want the changes to be made? Should I fork arviz and make changes to _hdi, to the docs, or both? Somewhere else?

@OriolAbril
Member

We first need to identify which algorithm to implement for the interpolation. HDI and quantiles (with which we get the ETI, equal-tail intervals) are very different concepts, so we can't just apply the same interpolation that is done in the quantile computation. I haven't been able to research this yet; either searching for papers on it or checking implementations of HDI in other libraries (for example in R or Julia) would be a good starting point. Is that something you'd be interested in?

Once we know what changes to make, then yes: fork, make the changes, and submit a PR for us to review. That would be very much appreciated.

@hadjipantelis

Hmm... easystats/bayestestR#39 seems to touch on exactly this, but since bayestestR::hdi() and sjstats::hdi() both still return raw values, I cannot see how this would circumvent the behaviour shown in @tomicapretto's original code example.
The HDInterval vignette, on the other hand, explicitly says: "For a 95% HDI, 10,000 independent draws are recommended; a smaller number will be adequate for a 80% HDI, many more for a 99% HDI." In fairness, using draws=10_000 on the above example ameliorates the phenomenon substantially. So is it a case of simply giving out a warning?
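A quick Monte Carlo check of that recommendation, using a minimal, hypothetical shortest-interval HDI (written for illustration, not ArviZ's implementation), shows the endpoint jitter shrinking as the number of draws grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def hdi(samples, prob=0.95):
    """Minimal shortest-interval HDI; endpoints are raw samples."""
    samples = np.sort(samples)
    n = len(samples)
    k = int(np.floor(prob * n))
    widths = samples[k:] - samples[: n - k]
    i = int(np.argmin(widths))
    return samples[i], samples[i + k]

# spread of the upper HDI bound across 200 repeated simulations
for n in (500, 10_000):
    uppers = [hdi(rng.normal(size=n))[1] for _ in range(200)]
    print(n, float(np.std(uppers)))
```

The standard deviation of the upper bound is noticeably smaller with 10,000 draws than with 500, consistent with the vignette's advice.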

@OriolAbril
Member

Thanks, that is great!

Printing a warning every time the function is executed seems a bit excessive, especially given that the recommendation is quite arbitrary and only valid for a 95% CI (if at all). We'd also need to compute the ESS within the HDI in order to get the number of independent draws. I think the best start would be to document all of this properly in the notes section of the HDI docstring (which currently has no notes on the algorithm used at all). There we can describe the algorithm, cite Kruschke's book, and add the advice on sample sizes above.

Regarding advice on the number of samples, in our case it would be best to advise using the MCSE of the HDI to decide whether there are enough samples. I'll need to finish #1974 so it can be used, but I think it is what makes the most sense conceptually.

Do you want to send a PR to update the docs on this? We have some guidance at https://python.arviz.org/en/stable/contributing/index.html, and don't hesitate to ask any questions you may have during the process.
