Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recreate perfmetrics portrait plot in python #3551

Draft
wants to merge 32 commits into
base: main
Choose a base branch
from
Draft

Conversation

lukruh
Copy link
Contributor

@lukruh lukruh commented Mar 20, 2024

Description


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.

New or updated recipe/diagnostic

New or updated data reformatting script


To help with the number of pull requests:

@valeriupredoi
Copy link
Contributor

what a noble venture, Lukas! Godspeed 🏁 🍺

@lukruh lukruh added this to the v2.11.0 milestone Apr 16, 2024
Copy link
Contributor

@schlunma schlunma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of more comments of things I found while running an example recipe. Apart from these small issues, the diagnostic worked really nicely and the plot looks awesome 🚀

esmvaltool/diag_scripts/perfmetrics/portrait_plot.py Outdated Show resolved Hide resolved
Comment on lines +450 to +452
dataset[cfg["x_by"]].str.lower(), dataset[cfg["y_by"]].str.lower(),
dataset[cfg["group_by"]].str.lower(),
dataset[cfg["split_by"]].str.lower()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can lead to errors with any of the facets is None. For example, the default value if no split is given for a dataset is None.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that actually happened to me as well. I worked around it by setting everything that is None to 'None', but that is probably not the best solution ;).

esmvaltool/diag_scripts/perfmetrics/portrait_plot.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/perfmetrics/portrait_plot.py Outdated Show resolved Hide resolved
log.warning("Metadata found for %s", selection)
das = xr.open_dataset(metas[0]["filename"])
varname = list(das.data_vars.keys())[0]
return das[varname].values.item()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to raise a nicer error here if the data is non-scalar, at the moment this gives ValueError: can only convert an array of size 1 to a Python scalar.

Copy link
Contributor Author

@lukruh lukruh Nov 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like "Expected scalar in input file {filename}"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

@lukruh
Copy link
Contributor Author

lukruh commented May 2, 2024

So far we used the distance_metrics prerpocessor from within the recipe, which works but has some limitations in practise and can lead to messy recipes. I added an optional function to apply it from within the plot diagnostic. I.e. use reference_metric: rmse in the script section of the recipe instead of the distance_metric preprocessor. The reference_for_metric facet for the reference dataset i.e. ERA-interim is still required. @diegokam if you are available, you could have a look and try to adjust the recipe accordingly.

Comment on lines +216 to +219
if None in data.coords[cfg["split_by"]].values:
cfg.update({"default_split": None})
else:
cfg.update({"default_split": data.coords[cfg["split_by"]].values[0]})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this overwrite a user-defined default_split? Do we want this?

@mo-gill
Copy link
Contributor

mo-gill commented May 2, 2024

Hi, we are currently working on the ESMValTool release for v2.11.0. We're wondering if you'd be able to complete this PR by the end of next week (Friday 10th May).

Otherwise, please let us know, and we'll move it into the next milestone for you 🙂

@lukruh
Copy link
Contributor Author

lukruh commented May 3, 2024

I don't see this PR be ready until next week. The next milestone is more realistic I'd say.

except IndexError as exc:
raise IndexError("No reference found for metric.") from exc
ref_cube = iris.load_cube(reference["filename"])
for meta in y_metas:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should support multiple references (with different split facets). The split (maybe the full set of split/group/x/y) needs also to be part of the filename to avoid overwriting files.

@mo-gill mo-gill modified the milestones: v2.11.0, v2.12.0 May 24, 2024
@rbeucher
Copy link
Contributor

I just came across this. It looks great. Any progress?

@schlunma
Copy link
Contributor

I just came across this. It looks great. Any progress?

We are working on it 😄 It's planned to include this in v2.12

@lukruh lukruh changed the title recreate perfmetric plots in python recreate perfmetrics portrait plot in python Nov 6, 2024
@lukruh
Copy link
Contributor Author

lukruh commented Nov 6, 2024

Its also possible know to apply MultiModel Mean and Median over each group (i.e. project). I added a simple recipe recipe_portrait_CMIP.yml with a few models and variables as example for the diagnostic. It runs faster and is less complex than the CMIP5 one. As discussed with @bettina-gier, the long recipe to reproduce the output of the NCL diagnostic will be removed before merge. It's attached here in case someone needs it after the branch is deleted.
recipe_perfmetrics_CMIP5_py.yml.txt

@axel-lauer
Copy link
Contributor

I am not in favor of removing the old recipe as long as we do not reproduce the full output (e.g. including Taylor diagrams, etc.). and therefore have a new and complete "perfmetrics". So long, this PR just includes the portrait diagram. For this reason and to avoid confusion between "perfmetrics" and "portrait diagram", the new recipe (and documentation) should be simply called e.g. "recipe_portrait_plot.yml" and not contain anything like "perfmetrics" in the filenames.

As a side note, I am not sure helper scripts such as "compare_netCDFs.py" need to go into the "diagnostics" folder. Maybe they do not need to go into the repository at all?

@axel-lauer
Copy link
Contributor

I just talked to @bettina-gier and I think I misunderstood the removal of the recipe. So this is fine. My points about the naming remain.

@lukruh
Copy link
Contributor Author

lukruh commented Nov 7, 2024

Sorry for the confusion. I didn't plan to remove the NCL based recipe just the one that reproduces its results. The helper scripts and different recipe versions will be removed before merge. Should we keep the script and the results as comments in this PR, in case some one is interested?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Recreate portrait plot from NCL perfmetrics in python
7 participants