calendar_average for CESM / CAM data #676
Comments
Thanks for reporting this and for the helpful examples! This seems like something we should be able to handle and may be easier now as well with some of the groupby enhancements in Xarray.
Thanks again for the nice examples - these were super helpful! To reiterate, it seems like the
For the first case, I think handling this with pre-processing might make more sense than in geocat-comp, since this is both a special case and not particularly intuitive. Admittedly, the pre-processing is a bit clunky (example below) because the calendar requires using cftime, so Pandas timedeltas don't work here, but it's not terrible and hopefully will become cleaner at some point (see: pydata/xarray#5687). Open to discussion here as well though.
Example subtracting a month from the time index:
For the second case, I think it's probably more reasonable to expect
Let me know if there's anything here that sounds off, and happy to chat too if that'd be easier. In the meantime, I'll start looking into adapting the climatology functions for non-uniform spacing.
@kafitzgerald -- I agree. The offset time stamp should be dealt with separately. The way I do it is to change the time coordinate to be the average of the time bounds data. This uses the CF-compliant metadata to put the time stamp at the middle of the averaging interval. A side effect of that is that the time stamps aren't evenly spaced. But this is probably not a super common situation, so users should deal with it separately.
In terms of the non-uniform spacing, this is definitely not specific to CESM data. As far as I can tell, any monthly data will trigger this issue. For example, if the time stamp is always on the 15th of the month (or any other day), the times aren't evenly spaced. And I think since xarray decodes time coordinates into some kind of time stamp object, you can't have just "year-month" (maybe it is possible, but I'm not sure how to do it).
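To make the spacing point concrete, here is a minimal standard-library sketch (not code from this thread). It assumes a non-leap year, 2001, as a stand-in for a noleap calendar, and builds the month boundaries itself rather than reading real time bounds from a file. It computes the midpoint of each monthly averaging interval and the gaps between consecutive stamps, then repeats the check for stamps pinned to the 15th.

```python
from datetime import datetime

# Month boundaries for 2001, a non-leap year (assumed stand-in for a noleap calendar).
edges = [datetime(2001, m, 1) for m in range(1, 13)] + [datetime(2002, 1, 1)]

# Midpoint of each [start, end) averaging interval, i.e. the mean of the time bounds.
midpoints = [start + (end - start) / 2 for start, end in zip(edges[:-1], edges[1:])]

# Spacing between consecutive midpoints, in days.
gaps = [
    (later - earlier).total_seconds() / 86400
    for earlier, later in zip(midpoints[:-1], midpoints[1:])
]

# Same check for stamps that always sit on the 15th of the month.
fifteenths = [datetime(2001, m, 15) for m in range(1, 13)]
gaps_15th = [
    (later - earlier).days
    for earlier, later in zip(fifteenths[:-1], fifteenths[1:])
]

print("distinct gaps between midpoints (days):", sorted(set(gaps)))
print("distinct gaps between 15ths (days):", sorted(set(gaps_15th)))
```

Either way, more than one distinct gap shows up, so a strict "evenly spaced" check on the time coordinate cannot pass for monthly data stamped like this.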
Version
2024.4.0
How did you install geocat-comp?
conda-forge
Operating System
linux and MacOS
Summary
calendar_average does not calculate annual averages correctly from monthly data if the time coordinate is at the mid-month value.
When the times are at mid-month, the function fails because it thinks the times are not evenly spaced.
This is problematic because until recently, CESM (at least CAM) output has put the timestamp at the end of the averaging interval, meaning that monthly averages were timestamped in the following month, e.g., a January average is timestamped as midnight on 1 February. A common way to correct for this is to use the time bounds to get the center of the averaging interval, but the resulting timestamp then falls at a different point within each month, so the times are no longer evenly spaced.
Also note that development CESM versions have changed the behavior to make the timestamp be at the center point of the averaging interval, so this becomes an issue for both old and new versions of CESM.
If I do not correct the time coordinate, then GeoCAT will try to do the annual average, but the result is wrong because all the months are shifted by one. (See example below.)
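The shift can be illustrated without any model output. The sketch below is only an assumption-laden toy, not the notebook example referenced in this issue: the helper names `previous_month` and `group_by_year` are hypothetical, the stamps are plain `datetime` objects rather than cftime objects, and the "data" are just month labels. It groups end-of-interval stamps by calendar year, then repeats the grouping after moving each stamp back one month.

```python
from collections import defaultdict
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# End-of-interval stamps for the 12 monthly means of 2001:
# the January mean is stamped 2001-02-01, ..., the December mean 2002-01-01.
end_stamps = [datetime(2001, m, 1) for m in range(2, 13)] + [datetime(2002, 1, 1)]


def previous_month(stamp):
    """Move a first-of-month stamp back by one calendar month (hypothetical helper)."""
    if stamp.month == 1:
        return stamp.replace(year=stamp.year - 1, month=12)
    return stamp.replace(month=stamp.month - 1)


def group_by_year(stamps, labels):
    """Collect labels by the calendar year of their stamp (hypothetical helper)."""
    groups = defaultdict(list)
    for stamp, label in zip(stamps, labels):
        groups[stamp.year].append(label)
    return dict(groups)


raw = group_by_year(end_stamps, MONTHS)
fixed = group_by_year([previous_month(s) for s in end_stamps], MONTHS)

print("uncorrected:", {year: len(members) for year, members in raw.items()})
print("corrected:  ", {year: len(members) for year, members in fixed.items()})
```

With the uncorrected stamps, the 2001 group is missing the December mean, which lands in 2002 instead, and every remaining mean is attributed to the month after the one it describes.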
Expected behavior
When the data is monthly, the function should not care if the time coordinate is evenly spaced (but still should be weighted by days-in-month).
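As a rough illustration of the requested behavior, the sketch below computes a days-in-month weighted annual mean from twelve monthly values using only the year and month, so the spacing of the time coordinate never enters. It is not geocat-comp's implementation: `weighted_annual_mean` is a hypothetical name, and the weights come from the standard library's Gregorian calendar, so a noleap model calendar would need February fixed at 28 days.

```python
import calendar


def weighted_annual_mean(monthly_values, year):
    """Days-in-month weighted mean of 12 monthly values (hypothetical helper).

    Assumes the values are ordered January through December and that the
    Gregorian calendar applies to the given year.
    """
    if len(monthly_values) != 12:
        raise ValueError("expected exactly 12 monthly values")
    weights = [calendar.monthrange(year, month)[1] for month in range(1, 13)]
    weighted_sum = sum(w * v for w, v in zip(weights, monthly_values))
    return weighted_sum / sum(weights)


# Toy input: the value for each month is simply its month number.
values = list(range(1, 13))
weighted = weighted_annual_mean(values, 2001)
unweighted = sum(values) / len(values)

print("weighted:", weighted)
print("unweighted:", unweighted)
```

The weighted and unweighted results differ slightly for this toy input, which is the effect the days-in-month weighting is meant to capture.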
Steps to reproduce
(See /glade/u/home/brianpm/Code/gc_calendar.ipynb for this example.)
Relevant log output
Environment