Allow tz-aware timestamps in gaps.trim #206

Merged
merged 6 commits into from
Feb 9, 2024
Changes from 5 commits
3 changes: 3 additions & 0 deletions docs/whatsnew/0.2.0.rst
@@ -40,6 +40,8 @@ Bug Fixes
   of raising ``AttributeError``. (:pull:`181`)
 * Compatibility with pandas 2.0.0 (:pull:`185`)
 * Compatibility with scipy 1.11 (:pull:`196`)
+* Updated function :py:func:`~pvanalytics.quality.gaps.trim` to handle pandas 2.0.0 update for tz-aware timeseries
+  (:pull:`206`)

 Requirements
 ~~~~~~~~~~~~
@@ -62,3 +64,4 @@ Contributors
 * Kevin Anderson (:ghuser:`kanderso-nrel`)
 * Cliff Hansen (:ghuser:`cwhanse`)
 * Abhishek Parikh (:ghuser:`abhisheksparikh`)
+* Quyen Nguyen (:ghuser:`qnguyen345`)
5 changes: 4 additions & 1 deletion pvanalytics/quality/gaps.py
@@ -409,8 +409,11 @@ def trim(series, days=10):
     """
     start, end = start_stop_dates(series, days=days)
     mask = pd.Series(False, index=series.index)
+
     if start:
-        mask.loc[start.date():end.date()] = True
+        mask.loc[pd.to_datetime(start.date()).tz_localize(series.index.tz):
+                 pd.to_datetime(end.date()).tz_localize(series.index.tz)] = \
+            True
Member

The return values of start_stop_dates seem to always have timestamps at midnight. With that in mind, do we need to do any conversion like .date() here? Couldn't we just use start and end as-is?

Contributor Author

You are right: mask.loc[start:end] = True also works. Calling .date() converted start and end from tz-aware to tz-naive, which is what triggered the pandas error. I have edited the code and pushed your suggestion.
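
For readers following the thread, here is a minimal sketch of the point being discussed. It does not call pvanalytics; the `start` and `end` values are hypothetical stand-ins for what `start_stop_dates` returns, assumed (per the review comment above) to be tz-aware timestamps at midnight. It shows that `.date()` discards the timezone, that re-localizing recovers the original timestamp, and that slicing with the tz-aware timestamps directly is enough.

```python
import pandas as pd

# Daily, tz-aware index, similar to the one used in the new test.
index = pd.date_range(start="2020-01-01", end="2020-01-10",
                      freq="D", tz="Etc/GMT+8")
mask = pd.Series(False, index=index)

# Hypothetical stand-ins for the values returned by start_stop_dates:
# tz-aware timestamps at midnight.
start = index[2]  # 2020-01-03 00:00 in Etc/GMT+8
end = index[6]    # 2020-01-07 00:00 in Etc/GMT+8

# .date() returns a plain datetime.date, so the timezone is lost.
naive_start = pd.Timestamp(start.date())
print(naive_start.tz)  # None

# Re-localizing (the intermediate approach in this diff) recovers the
# original timestamp when it is already at midnight.
relocalized = pd.to_datetime(start.date()).tz_localize(index.tz)
print(relocalized == start)  # True

# Slicing with the tz-aware timestamps directly avoids the round trip.
# Label-based slicing with .loc includes both endpoints.
mask.loc[start:end] = True
print(int(mask.sum()))  # 5
```

Because the re-localized value equals the original timestamp, the conversion adds nothing when the bounds are already at midnight, which is why the simpler slice was suggested.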

     return mask


21 changes: 21 additions & 0 deletions pvanalytics/tests/quality/test_gaps.py
@@ -449,6 +449,27 @@ def test_trim_daily_index():
     )


+def test_trim_daily_index_tz_aware():
+    """trim works when data has a daily index and data is tz-aware."""
+    data = pd.Series(True, index=pd.date_range(
+        start='1/1/2020', end='2/29/2020', freq='D', tz="Etc/GMT+8"))
+    assert gaps.trim(data).all()
+    data.iloc[0:8] = False
+    data.iloc[9] = False
+    expected = data.copy()
+    expected.iloc[0:10] = False
+    assert_series_equal(
+        expected,
+        gaps.trim(data)
+    )
+    data.iloc[-5:] = False
+    expected.iloc[-5:] = False
+    assert_series_equal(
+        expected,
+        gaps.trim(data)
+    )
+
+
 def test_completeness_score_all_nans():
     """A data set with all nans has completeness 0 for each day."""
     completeness = gaps.completeness_score(