Plot projections and eval scores #115

Merged: Fuhan-Yang merged 12 commits into main from fy_plot on Feb 18, 2025
Conversation

@Fuhan-Yang Fuhan-Yang commented Feb 3, 2025

Add diagnostic plots of time-varying projections compared with data, and evaluation scores broken out by forecast date. Minor bugs showed up when using different forecast periods, so it is worth trying multiple forecast periods to ensure the robustness of the pipeline.

@Fuhan-Yang Fuhan-Yang linked an issue Feb 3, 2025 that may be closed by this pull request
@Fuhan-Yang Fuhan-Yang removed a link to an issue Feb 3, 2025
@Fuhan-Yang Fuhan-Yang linked an issue Feb 3, 2025 that may be closed by this pull request
@Fuhan-Yang Fuhan-Yang requested review from swo and eschrom February 3, 2025 16:51
@Fuhan-Yang Fuhan-Yang marked this pull request as ready for review February 3, 2025 16:52

@eschrom eschrom left a comment


Awesome work, @Fuhan-Yang! I apologize that I'm not well-versed in altair, so I cannot review the actual plotting with much detail. But I did find one spot that I think needs a change - the forecasts are saved in long, not wide, format now. Otherwise, there are just some questions peppered throughout.

data = pl.scan_parquet(args.obs).collect()

# Drop all samples and just use mean estimate, for now
pred = pred.drop([col for col in pred.columns if "estimate_" in col])
Collaborator

This line expects the forecasts in wide format, but they are now in long format. See Line ~592 in iup/models.py, which returns predictions from a model as SampleForecasts (with sample_id and estimate columns), and then see Line ~85 in scripts/forecast.py, which averages over samples to get the mean of the posterior predictive distribution at each time_end before saving the forecasts.
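For illustration, a minimal polars sketch of the long-format flow described here (the column names `sample_id`, `estimate`, and `time_end` come from this comment; the file path is hypothetical):

```python
import polars as pl

# Long format: one row per posterior sample per time_end.
forecasts = pl.scan_parquet("data/forecasts.parquet")  # hypothetical path

# Average over samples to get the mean of the posterior predictive
# distribution at each time_end before saving, as described above.
mean_forecast = (
    forecasts.group_by("time_end")
    .agg(pl.col("estimate").mean())
    .sort("time_end")
    .collect()
)
```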

Collaborator

I forget if this was being discussed elsewhere, but I have a strong preference for long format for these things

Collaborator

It was, and we fixed it!

Contributor Author

Thanks for catching that! This line was likely carried over from a previous version and didn't raise an error because the condition was never met. We did switch to long format, and we just average over samples for now. This line is deleted.


charts = []

forecast_starts = [date for date in pred["forecast_start"].unique()]
Collaborator

Should we check that the pred frame is already sorted by forecast_start to make sure that the plot panels come out in a sensible order? I am unfamiliar with altair plotting, so maybe panel ordering is done automatically - just wanted to check.

Collaborator

altair won't mind if the rows in the input data frame aren't sorted in the order that the points will be plotted left to right

obs: polars.Dataframe
The observed uptake data frame, indicating the cumulative vaccine uptake as of `time_end`.
pred: polars.Dataframe
The predicted daily uptake, differed by forecast date, must include columns `forecast_start` and `estimate`.
Collaborator

I would say, "The predicted cumulative uptake, ..." since the predictions taken here are cumulative, whereas daily average uptake is a separate quantity that actually appears in some of the model fitting.

Contributor Author

Fixed!

pred: polars.Dataframe
The predicted daily uptake, differed by forecast date, must include columns `forecast_start` and `estimate`.
config: dict
config.yaml to define the number of columns in the graph, the saving path and the saving name.
Collaborator

The config doesn't actually have the path or name for saving, does it?

Collaborator

See above; agree I would drop config

Contributor Author

No, this was left over from an earlier trial of mine that got deprecated. Thanks for catching that!

scores: polars.DataFrame
The evaluation scores data frame.
config: dict
config.yaml to define the saving path and the saving name.
Collaborator

Again, the config doesn't really have the path or name for saving, right? I thought those were in the makefile. The config has the full names of the evaluation metrics, though.

Collaborator

See above; agree that I would drop config

Contributor Author

Fixed. Thanks for catching that!


@swo swo left a comment


My big things are:

  • Making sure that config is not passed to the postprocessing plot scripts
  • Simplifying the plots using existing facet or grid systems

I also don't think it's crazy to have a single postprocessing.py that plots the forecasts and the scores, because at some point we'll also want some csv (or whatever) reports of some of these values. Rather than have a different script for each of them, we could just stick them all together. There aren't that many outputs being made, so this shouldn't be too slow? (Unless it is slow to make some of these plots, in which case we should keep them separate!)

Collaborator

I would put all of these options right in the scripts themselves

Like, the config is most important for articulating what analyses the pipeline is going to run, so the model names, model parameters, score funs, etc. are all important

But "plot: True" is something that should really be in the Makefile, and how many columns in the subplot is something we can just leave in the script and tweak there as needed

Collaborator

Ditto with the score function -> name dictionary

I'd rather just have this defined somewhere in the plotting script, because changing it shouldn't trigger a re-run of the upstream forecasts
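For instance, a hypothetical sketch of keeping that mapping inside the plotting script (the metric keys and labels here are made up for illustration):

```python
# Map score-function names to display labels in the plotting script itself,
# so relabeling never invalidates the upstream forecast targets.
SCORE_LABELS = {
    "mspe": "Mean squared prediction error",  # hypothetical keys/labels
    "abs_diff": "Absolute difference",
}

def score_label(fun_name: str) -> str:
    # Fall back to the raw function name for unmapped scores.
    return SCORE_LABELS.get(fun_name, fun_name)
```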

Collaborator

I'm comparing against the flowchart in #61

I think this PR introduces dependencies from config.yaml to proj_plot.py and score_plot.py, when we don't really need those dependencies. I'd like to be able to iterate on the postprocessing without triggering a re-run of the upstream stuff

Collaborator

Actually, looking at Makefile, it looks like the config is not passed to these scripts

Contributor Author

But "plot: True" is something that should really be in the Makefile, and how many columns in the subplot is something we can just leave in the script and tweak there as needed

Thanks for pointing out what are important for config file, which I found vague so far. How to put "plot: True" in Makefile to let users choose to plot or not? I will put the number of subplot columns in the script, and would like you to demo the plotting option in Makefile!


forecast_starts = [date for date in pred["forecast_start"].unique()]

for forecast_start in forecast_starts:
Collaborator

Why not just use a grid chart with .encode(row=etc, column=etc) or a facet chart with .facet()?

See https://altair-viz.github.io/user_guide/compound_charts.html#repeated-charts for explanations of how to avoid constructing these kinds of grids manually

(I'm not even sure you need layers here to show data and projection; you might be able to just unpivot and use .encode(color=).)
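Something like this minimal sketch (column names taken from the diff above; `columns=3` is an arbitrary choice):

```python
import altair as alt

# One panel per forecast_start, laid out automatically instead of looping
# over forecast dates and assembling the grid of charts by hand.
chart = (
    alt.Chart(pred.to_pandas())  # pred: forecast_start, time_end, estimate
    .mark_line()
    .encode(x="time_end:T", y="estimate:Q")
    .facet(facet="forecast_start:N", columns=3)
)
chart.save("projections.png")  # saving to png requires vl-convert-python
```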

scores: polars.DataFrame
The evaluation scores data frame.
config: dict
config.yaml to define the saving path and the saving name.
Collaborator

See above; agree that I would drop config

pred: polars.Dataframe
The predicted daily uptake, differed by forecast date, must include columns `forecast_start` and `estimate`.
config: dict
config.yaml to define the number of columns in the graph, the saving path and the saving name.
Collaborator

See above; agree I would drop config

Collaborator

This looks right!

@Fuhan-Yang Fuhan-Yang requested review from swo and eschrom February 7, 2025 20:03

@eschrom eschrom left a comment


I still have a few questions before this gets merged. It's getting close, though! I like that it's a single script for all evaluation plots now.

@@ -0,0 +1,161 @@
# Purpose
Collaborator

Why does this doc show up as a new file in this PR, when it has already been merged as part of #114? Just make sure to rebase before merging, in case this is an older version of this doc - we wouldn't want to overwrite the merged version with an older one.

Collaborator

Sounds like a rebase is needed?

pred = pl.scan_parquet(args.pred).collect()
data = pl.scan_parquet(args.obs).collect()

if config["projection_plot"]["plot"]:
Collaborator

I don't see a section of the config labeled "projection_plot" or "score_plot," but I also don't see any updates to the config in this PR. Is an updated config missing from this PR?

That said, @swo had indicated that config entries to control which plots are generated were not a desirable strategy. How did that conversation resolve?

Maybe a good strategy is just to have this script do everything if config["evaluation_timeframe"]["interval"] is not None (i.e. if evaluation is being performed) and do nothing otherwise, since this is all focused on retrospective forecasting anyway.
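A hypothetical sketch of that gating (the plotting helpers named here don't exist in the PR and are only for illustration):

```python
# Only produce postprocessing plots when evaluation is being performed,
# i.e. when retrospective forecasting is configured.
if config["evaluation_timeframe"]["interval"] is not None:
    plot_projections(pred, data)  # hypothetical helpers
    plot_scores(scores)
```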

Contributor Author

Good catch! I removed the plotting option from the config as suggested by @swo, but still kept the original in the plotting script. This could be a good option, but that conversation hasn't been resolved; I'm curious how to change the Makefile to provide the plotting option!

Collaborator

Sounds good to me. I suspect we'll eventually pull these arguments out of the config, but I don't think that has to gate us here. I am eager to run and debug the full pipeline. As long as this allows everything to run including plotting, I'm happy to address it later.

Collaborator

My point is to just always have the plots as part of the workflow `make all`, so you don't need a yes/no choice anywhere.

You could `make data/scores.parquet` if you wanted to run everything up through the scores but nothing after.

`make output/projections.png` should run a command that actually causes output/projections.png to get made. So it's weird to call a script and give it an option to do nothing.
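A rough sketch of that pattern, reusing the variables from the commented-out rules quoted later in this thread (the target name is illustrative):

```make
# Plots are ordinary targets that `make all` always builds; no yes/no option.
all: output/projections.png

output/projections.png: scripts/proj_plot.py $(FORECASTS)
	python $< --pred=$(FORECASTS) --obs=$(RAW_DATA) --output=$@
```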

pred = pred.with_columns(
date_str=("Forecast Date:" + pl.col("forecast_start").cast(pl.Utf8))
)
obs_chart = (
Collaborator

Looks like you are overwriting what was done in lines 35-39?

Contributor Author

Thanks! Just updated

)

pred_chart = (
alt.Chart()
Collaborator

Doesn't pred have to be given to alt.Chart()?

Contributor Author

No, it seems like the way to get faceted plots in altair is to pass the data for faceting to alt.layer() later
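For reference, a minimal sketch of that pattern (assuming a single combined frame `df` with the columns used above):

```python
import altair as alt

# Sub-charts are built without data; altair binds the data when layering.
obs_chart = alt.Chart().mark_point().encode(x="time_end:T", y="estimate:Q")
pred_chart = alt.Chart().mark_line().encode(x="time_end:T", y="estimate:Q")

# When faceting a layered chart, the data must be supplied at the layer
# (top) level rather than to the individual sub-charts.
chart = alt.layer(obs_chart, pred_chart, data=df).facet(
    facet="forecast_start:N", columns=3
)
```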

Collaborator

Okay - good to know!

@Fuhan-Yang
Contributor Author

Just put the plotting option back in config, ensuring the current version is runnable


@eschrom eschrom left a comment


As noted, I think there are still a few improvements to be made, but that goes for the whole pipeline and not just this PR. I am happy for you to merge. Then I'll start scouring for bugs!



@swo swo left a comment


Kindly rebase so we can double-check exactly what's in here

And then I made suggestions in #119 that I think should get incorporated here, or discussed as the next PR

Makefile Outdated
Comment on lines 18 to 22
# $(SCORE_PLOTS): scripts/score_plot.py $(SCORES)
# python $< --score=$(SCORES) --output=$@

# $(PROJ_PLOTS): scripts/proj_plot.py $(FORECASTS)
# python $< --pred=$(FORECASTS) --obs=$(RAW_DATA) --output=$@
Collaborator

We should just remove this old code!

@@ -0,0 +1,161 @@
# Purpose
Collaborator

Sounds like a rebase is needed?

Comment on lines 52 to 57
# Path to save the plots
projection_plot:
plot: True

score_plot:
plot: True
Collaborator

I don't think you need these options. The pipeline can just always produce these plots; that's part of the full pipeline.

Collaborator

Or, as I suggested before, we just do the postprocessing when config["evaluation_timeframe"]["interval"] is not None. This means we are doing retrospective forecasting with model evaluation, which is exactly when you want these postprocessing plots to be made.

@Fuhan-Yang
Contributor Author

It is rebased! Only commits related to this PR are included

@Fuhan-Yang Fuhan-Yang requested a review from swo February 14, 2025 19:07
- Makefile
  - Do not restart from preprocessing every time (replace `cache` with `$(NIS_CACHE)`)
  - Put in config and ensure it's a proper dependency
  - Remove commented lines
  - Break up long line
- Update altair version: I was getting errors about a mismatch between the versions of Python, `typing_extensions`, and altair that was fixed (for me) by updating altair
- Refactor the plotting script to use built-in faceting functionality, rather than building up charts manually
- Remove default config path from scripts; ensure they are put in via the Makefile
@swo swo commented Feb 18, 2025

After merging in #119 , I have most of the things I was hoping for!

@Fuhan-Yang Fuhan-Yang merged commit 70ab84a into main Feb 18, 2025
2 checks passed
@Fuhan-Yang Fuhan-Yang deleted the fy_plot branch February 18, 2025 16:37
Successfully merging this pull request may close these issues:

Plots for point estimates