Plot projections and eval scores #115

Merged: Fuhan-Yang merged 12 commits into main from fy_plot on Feb 18, 2025
Conversation

@Fuhan-Yang Fuhan-Yang commented Feb 3, 2025

Add diagnostic plots of time-varying projections compared with data, and evaluation scores broken out by forecast date. Minor bugs showed up when using different forecast periods, so it is worth trying multiple forecast periods to ensure the robustness of the pipeline.

@Fuhan-Yang Fuhan-Yang linked an issue Feb 3, 2025 that may be closed by this pull request
@Fuhan-Yang Fuhan-Yang removed a link to an issue Feb 3, 2025
@Fuhan-Yang Fuhan-Yang linked an issue Feb 3, 2025 that may be closed by this pull request
@Fuhan-Yang Fuhan-Yang requested review from swo and eschrom February 3, 2025 16:51
@Fuhan-Yang Fuhan-Yang marked this pull request as ready for review February 3, 2025 16:52

@eschrom eschrom left a comment


Awesome work, @Fuhan-Yang! I apologize that I'm not well-versed in altair, so I cannot review the actual plotting with much detail. But I did find one spot that I think needs a change - the forecasts are saved in long, not wide, format now. Otherwise, there are just some questions peppered throughout.

data = pl.scan_parquet(args.obs).collect()

# Drop all samples and just use mean estimate, for now
pred = pred.drop([col for col in pred.columns if "estimate_" in col])
Collaborator

This line expects the forecasts in wide format, but they are now in long format. See Line ~592 in iup/models.py, which returns predictions from a model as SampleForecasts (with sample_id and estimate columns), and then see Line ~85 in scripts/forecast.py, which averages over samples to get the mean of the posterior predictive distribution at each time_end before saving the forecasts.
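For illustration, a minimal polars sketch of the long-format flow described here (the column names `sample_id`, `estimate`, and `time_end` come from this comment; the file path is hypothetical):

```python
import polars as pl

# Long format: one row per posterior sample per time_end.
forecasts = pl.scan_parquet("data/forecasts.parquet")  # hypothetical path

# Average over samples to get the mean of the posterior predictive
# distribution at each time_end before saving, as described above.
mean_forecast = (
    forecasts.group_by("time_end")
    .agg(pl.col("estimate").mean())
    .sort("time_end")
    .collect()
)
```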

Collaborator

I forget if this was being discussed elsewhere, but I have a strong preference for long format for these things

Collaborator

It was, and we fixed it!

Contributor Author

Thanks for catching that! This line was likely carried over from a previous version and didn't raise an error because the condition was never met. We did switch to long format, and we just average over samples for now. This line is deleted.


charts = []

forecast_starts = [date for date in pred["forecast_start"].unique()]
Collaborator

Should we check that the pred frame is already sorted by forecast_start to make sure that the plot panels come out in a sensible order? I am unfamiliar with altair plotting, so maybe panel ordering is done automatically - just wanted to check.

Collaborator

altair won't mind if the rows in the input data frame aren't sorted in the order that the points will be plotted left to right

obs: polars.Dataframe
The observed uptake data frame, indicating the cumulative vaccine uptake as of `time_end`.
pred: polars.Dataframe
The predicted daily uptake, differed by forecast date, must include columns `forecast_start` and `estimate`.
Collaborator

I would say, "The predicted cumulative uptake, ..." since the predictions taken here are cumulative, whereas daily average uptake is a separate quantity that actually appears in some of the model fitting.

Contributor Author

Fixed!

pred: polars.Dataframe
The predicted daily uptake, differed by forecast date, must include columns `forecast_start` and `estimate`.
config: dict
config.yaml to define the number of columns in the graph, the saving path and the saving name.
Collaborator

The config doesn't actually have the path or name for saving, does it?

Collaborator

See above; agree I would drop config

Contributor Author

No, this was left over from an earlier trial of mine that got deprecated. Thanks for catching that!

scores: polars.DataFrame
The evaluation scores data frame.
config: dict
config.yaml to define the saving path and the saving name.
Collaborator

Again, the config doesn't really have the path or name for saving, right? I thought those were in the makefile. The config has the full names of the evaluation metrics, though.

Collaborator

See above; agree that I would drop config

Contributor Author

Fixed. Thanks for catching that!


@swo swo left a comment


My big things are:

  • Making sure that config is not passed to the postprocessing plot scripts
  • Simplifying the plots using existing facet or grid systems

I also don't think it's crazy to have a single postprocessing.py that plots the forecasts and the scores, because at some point we'll also want some csv (or whatever) reports of some of these values. Rather than have a different script for each of them, we could just stick them all together. There aren't that many outputs being made, so this shouldn't be too slow? (Unless it is slow to make some of these plots, in which case we should keep them separate!)

Collaborator

I would put all of these options right in the scripts themselves

Like, the config is most important for articulating what analyses the pipeline is going to run, so the model names, model parameters, score funs, etc. are all important

But "plot: True" is something that should really be in the Makefile, and how many columns in the subplot is something we can just leave in the script and tweak there as needed

Collaborator

Ditto with the score function -> name dictionary

I'd rather just have this defined somewhere in the plotting script, because changing it shouldn't trigger a re-run of the upstream forecasts
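For instance, a hypothetical sketch of keeping that mapping inside the plotting script (the metric keys and labels here are made up for illustration):

```python
# Map score-function names to display labels in the plotting script itself,
# so relabeling never invalidates the upstream forecast targets.
SCORE_LABELS = {
    "mspe": "Mean squared prediction error",  # hypothetical keys/labels
    "abs_diff": "Absolute difference",
}

def score_label(fun_name: str) -> str:
    # Fall back to the raw function name for unmapped scores.
    return SCORE_LABELS.get(fun_name, fun_name)
```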

Collaborator

I'm comparing against the flowchart in #61

I think this PR introduces dependencies from config.yaml to proj_plot.py and score_plot.py, when we don't really need those dependencies. I'd like to be able to iterate on the postprocessing without triggering a re-run of the upstream stuff

Collaborator

Actually, looking at Makefile, it looks like the config is not passed to these scripts

Contributor Author

But "plot: True" is something that should really be in the Makefile, and how many columns in the subplot is something we can just leave in the script and tweak there as needed

Thanks for pointing out what are important for config file, which I found vague so far. How to put "plot: True" in Makefile to let users choose to plot or not? I will put the number of subplot columns in the script, and would like you to demo the plotting option in Makefile!


forecast_starts = [date for date in pred["forecast_start"].unique()]

for forecast_start in forecast_starts:
Collaborator

Why not just use a grid chart with .encode(row=etc, column=etc) or a facet chart with .facet()?

See https://altair-viz.github.io/user_guide/compound_charts.html#repeated-charts for explanations of how to avoid constructing these kinds of grids manually

(I'm not even sure you need layers here to show data and projection; you might be able to just unpivot and use .encode(color=).)
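Something like this minimal sketch (column names taken from the diff above; `columns=3` is an arbitrary choice):

```python
import altair as alt

# One panel per forecast_start, laid out automatically instead of looping
# over forecast dates and assembling the grid of charts by hand.
chart = (
    alt.Chart(pred.to_pandas())  # pred: forecast_start, time_end, estimate
    .mark_line()
    .encode(x="time_end:T", y="estimate:Q")
    .facet(facet="forecast_start:N", columns=3)
)
chart.save("projections.png")  # saving to png requires vl-convert-python
```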

scores: polars.DataFrame
The evaluation scores data frame.
config: dict
config.yaml to define the saving path and the saving name.
Collaborator

See above; agree that I would drop config

pred: polars.Dataframe
The predicted daily uptake, differed by forecast date, must include columns `forecast_start` and `estimate`.
config: dict
config.yaml to define the number of columns in the graph, the saving path and the saving name.
Collaborator

See above; agree I would drop config

Collaborator

This looks right!

@Fuhan-Yang Fuhan-Yang requested review from swo and eschrom February 7, 2025 20:03

@eschrom eschrom left a comment


I still have a few questions before this gets merged. It's getting close, though! I like that it's a single script for all evaluation plots now.

@@ -0,0 +1,161 @@
# Purpose
Collaborator

Why does this doc show up as a new file in this PR, when it has already been merged as part of #114? Just make sure to rebase before merging, in case this is an older version of this doc - we wouldn't want to overwrite the merged version with an older one.

Collaborator

Sounds like a rebase is needed?

pred = pl.scan_parquet(args.pred).collect()
data = pl.scan_parquet(args.obs).collect()

if config["projection_plot"]["plot"]:
Collaborator

I don't see a section of the config labeled "projection_plot" or "score_plot," but I also don't see any updates to the config in this PR. Is an updated config missing from this PR?

That said, @swo had indicated that config entries to control which plots are generated were not a desirable strategy. How did that conversation resolve?

Maybe a good strategy is just to have this script do everything if config["evaluation_timeframe"]["interval"] is not None (i.e. if evaluation is being performed) and do nothing otherwise, since this is all focused on retrospective forecasting anyway.
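A hypothetical sketch of that gating (the plotting helpers named here don't exist in the PR and are only for illustration):

```python
# Only produce postprocessing plots when evaluation is being performed,
# i.e. when retrospective forecasting is configured.
if config["evaluation_timeframe"]["interval"] is not None:
    plot_projections(pred, data)  # hypothetical helpers
    plot_scores(scores)
```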

Contributor Author

Good catch! I removed the plotting option from the config as suggested by @swo, but still kept the original in the plotting script. This could be a good option, but that conversation hasn't been resolved; I'm curious how to change the Makefile to provide the plotting option!

Collaborator

Sounds good to me. I suspect we'll eventually pull these arguments out of the config, but I don't think that has to gate us here. I am eager to run and debug the full pipeline. As long as this allows everything to run including plotting, I'm happy to address it later.

Collaborator

My point is to just always have the plots as part of the workflow `make all`, so you don't need a yes/no choice anywhere.

You could `make data/scores.parquet` if you wanted to run everything up through the scores but nothing after.

`make output/projections.png` should run a command that actually causes output/projections.png to get made. So it's weird to call a script and give it an option to do nothing.
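A rough sketch of that pattern, reusing the variables from the commented-out rules quoted later in this thread (the target name is illustrative):

```make
# Plots are ordinary targets that `make all` always builds; no yes/no option.
all: output/projections.png

output/projections.png: scripts/proj_plot.py $(FORECASTS)
	python $< --pred=$(FORECASTS) --obs=$(RAW_DATA) --output=$@
```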

pred = pred.with_columns(
date_str=("Forecast Date:" + pl.col("forecast_start").cast(pl.Utf8))
)
obs_chart = (
Collaborator

Looks like you are overwriting what was done in lines 35-39?

Contributor Author

Thanks! Just updated

)

pred_chart = (
alt.Chart()
Collaborator

Doesn't pred have to be given to alt.Chart()?

Contributor Author

No, it seems like the way to get faceted plots in altair is to pass the data for faceting to alt.layer() later
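For reference, a minimal sketch of that pattern (assuming a single combined frame `df` with the columns used above):

```python
import altair as alt

# Sub-charts are built without data; altair binds the data when layering.
obs_chart = alt.Chart().mark_point().encode(x="time_end:T", y="estimate:Q")
pred_chart = alt.Chart().mark_line().encode(x="time_end:T", y="estimate:Q")

# When faceting a layered chart, the data must be supplied at the layer
# (top) level rather than to the individual sub-charts.
chart = alt.layer(obs_chart, pred_chart, data=df).facet(
    facet="forecast_start:N", columns=3
)
```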

Collaborator

Okay - good to know!

@Fuhan-Yang
Contributor Author

Just put the plotting option back in config, ensuring the current version is runnable


@eschrom eschrom left a comment


As noted, I think there are still a few improvements to be made, but that goes for the whole pipeline and not just this PR. I am happy for you to merge. Then I'll start scouring for bugs!



@swo swo left a comment


Kindly rebase so we can double-check exactly what's in here

And then I made suggestions in #119 that I think should get incorporated here, or discussed as the next PR

Makefile Outdated
Comment on lines 18 to 22
# $(SCORE_PLOTS): scripts/score_plot.py $(SCORES)
# python $< --score=$(SCORES) --output=$@

# $(PROJ_PLOTS): scripts/proj_plot.py $(FORECASTS)
# python $< --pred=$(FORECASTS) --obs=$(RAW_DATA) --output=$@
Collaborator

We should just remove this old code!

@@ -0,0 +1,161 @@
# Purpose
Collaborator

Sounds like a rebase is needed?

Comment on lines 52 to 57
# Path to save the plots
projection_plot:
plot: True

score_plot:
plot: True
Collaborator

I don't think you need these options. The pipeline can just always produce these plots; that's part of the full pipeline.

Collaborator

Or, as I suggested before, we just do the postprocessing when config["evaluation_timeframe"]["interval"] is not None. This means we are doing retrospective forecasting with model evaluation, which is exactly when you want these postprocessing plots to be made.

@Fuhan-Yang
Contributor Author

It is rebased! Only commits related to this PR are included

@Fuhan-Yang Fuhan-Yang requested a review from swo February 14, 2025 19:07
- Makefile
  - Do not restart from preprocessing every time (replace `cache` with `$(NIS_CACHE)`)
  - Put in config and ensure it's a proper dependency
  - Remove commented lines
  - Break up long line
- Update altair version: I was getting errors about a mismatch between the versions of Python, `typing_extensions`, and altair that was fixed (for me) by updating altair
- Refactor the plotting script to use built-in faceting functionality, rather than building up charts manually
- Remove default config path from scripts; ensure they are put in via the Makefile
@swo swo commented Feb 18, 2025

After merging in #119 , I have most of the things I was hoping for!

@Fuhan-Yang Fuhan-Yang merged commit 70ab84a into main Feb 18, 2025
2 checks passed
@Fuhan-Yang Fuhan-Yang deleted the fy_plot branch February 18, 2025 16:37
Successfully merging this pull request may close these issues:

Plots for point estimates