-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Temporary workaround for pandas and xarray frequency changes #1540
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well played. What are the chances that we could push the freq fixing stuff upstream to xarray
? This change must be affecting tons of users in addition to us.
def _get_freq_version(version: str | None = None) -> str: | ||
if version is None: | ||
if Version("2023.11.0") <= Version(xr.__version__) and Version( | ||
"2.2" | ||
) <= Version(pd.__version__): | ||
version = "new" | ||
elif Version("2023.11.0") <= Version(xr.__version__) or Version( | ||
"2.2" | ||
) <= Version(pd.__version__): | ||
version = "inter" | ||
else: | ||
version = "old" | ||
return version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of not having version
written 20 times, I would substitute it for convention
.
base = base[:-1] | ||
if base not in "YQAM": | ||
raise ValueError( | ||
f"Invalid frequency: {freq}. Base {base[:-1]} can't have an start (S) or end (E) anchor." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"Invalid frequency: {freq}. Base {base[:-1]} can't have an start (S) or end (E) anchor." | |
f"Invalid frequency: {freq}. Base {base[:-1]} cannot have a start (S) or end (E) anchor." |
anchor is not None and anchor not in _MONTH_ABBREVIATIONS.values() | ||
): | ||
raise ValueError( | ||
f"Invalid frequency: {freq}, not fit for usage with xclim, xarray and cftime. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"Invalid frequency: {freq}, not fit for usage with xclim, xarray and cftime. " | |
f"Invalid frequency: `{freq}` not fit for usage with installed xclim, xarray, and cftime. " |
base: str, | ||
start_anchored: bool, | ||
anchor: str | None, | ||
version: str | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
version: str | None = None, | |
convention: str | None = None, |
} | ||
|
||
|
||
# TODO: This can be removed when we pin pandas >= 2.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can integrate a small helper function here to check what the required pins are?
Something like:
from importlib.metadata import distribution
for req in distribution("xclim").requires:
if req.split(";")[0].startswith("pandas>=2.2"):
raise FixMeOrSomethingPleaseException()
Closed as this seems to complicated. A drastic change in the pins seems cleaner. |
Pull Request Checklist:
number
) and pull request (:pull:number
) has been addedThis PR became a bit more complicated when I realized that xarray's latest release implements the "new" standards, but pandas 2.2 has not yet been released. And some of those changes are not implemented at all yet in pandas1. I.e.
pd.tseries.frequencies.to_offset(xr.infer_freq(da))
sometimes raises an error.What kind of change does this PR introduce?
fix_freq
: it takes in a frequency code and translates it to the appropriate syntax. There are, sadly, three cases:By « syntax » I mean the default codes. The "inter" version is supported by all previous xarray versions and pandas >= 1.0.
All hard-coded frequencies have been adapted for the "inter" syntax. (in the code, tests and notebooks)
fix_freq
has been added everywhere conflict was possibleRaised the minimum pin on pandas so we can at least use the "inter" codes and not have a fourth category.
My goal here was to reduce the number of warnings for users. I wanted to avoid xclim's default to trigger user warnings or even errors. I don't think I achieved that goal, but I hope the transition will be smoother nonetheless.
Does this PR introduce a breaking change?
For sure. I think a normal "beginner" usage of xclim would be unchanged, but I'm not so sure about things like
xscen
or custom indicator modules. But some of that is broken anyway with xarray 2023.11.0 and would further break with pandas 2.2. Hopefully, we are able to control the damage with this PR and makes minimum breaking changes here only.Other information:
The long term goal is to remove
fix_freq
when we pin pandas >= 2.2. This PR also introduces a few workarounds to remove at the moment, all denoted with "FIXME" comments.Major differences
Footnotes
and I was a reviewer on that PR in xarray. Woopsie! ↩
The "new" syntax is not yet supported by pandas (< 2.2), but xarray (>= 2023.11) will warn and return that in
xr.infer_freq
. ↩