-
-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add sample saving for Site Dataset + presaved SiteDataset/DataModule for loading presaved samples #290
Conversation
for more information, see https://pre-commit.ci
…x/PVNet into site-data-sampling
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks all good. I wonder if we should od https://github.com/openclimatefix/ocf-data-sampler/issues/99 first, then add some test cases in here, just to be double sure
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## dev-data-sampler #290 +/- ##
===================================================
Coverage ? 59.67%
===================================================
Files ? 28
Lines ? 1860
Branches ? 0
===================================================
Hits ? 1110
Misses ? 750
Partials ? 0 ☔ View full report in Codecov by Sentry. |
@@ -6,7 +6,7 @@ dynamic = ["version", "readme"] | |||
license={file="LICENCE"} | |||
|
|||
dependencies = [ | |||
"ocf_data_sampler==0.0.32", | |||
"ocf_data_sampler==0.0.43", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this update in ocf-data-sampler
version will break things from working properly on this dev branch for a bit before we get to fixing/updating everything.
I see two options:
- We don't merge this in until we get everything working, pros is that we don't need another branch, con is this PR will get massive (already fairly big)
- Perhaps can use another dev branch, the original is kept as the working dev branch for ocf-data-sampler in PVNet and the new one is the one which is the interim one whilst we get all things working. Pros and cons basically the opposite of 1.
The reason for the caution is if we were to merge this in as is I think we lose any remote code which allows ocf-data-sampler to work with PVNet which feels risky?
@peterdudfield @dfulu let me know what you think, may be other options too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or I am being too paranoid and it is fine to merge this in as we have local copies of code that work (plus would be able to pull out working code from git history)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yea I can see the decision here. Get the balance right between small PRs, and getting things working properly. I wouldnt commmit to main
, but i know this is only dev
.
What is there to do in this bit of work to make sure it is working? Do you think it is just updating from BatchKeys to SampleKeys, a bit more?
Without knowing this: I would probably still merge, and then fix afterwards. Its not the main branch, and people can pull a different commit if they need to. It also allows someone to fix issue X, why you doing issue Y.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's mainly moving away from the old BatchKey
enum to the SampleKey
s we use now I believe, it's used in a few places in PVNet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the tests need updating because they have test data in the old BatchKey
format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, i think make a github issue with these things, and merge this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you happy to approve? No worries if you haven't had a chance to review yet
Pull Request
Tackles #301
Description
Inclues:
ocf-data-sampler
Site
torch datasetocf-data-sampler
How Has This Been Tested?
Both the save_samples script and the loading premade samples have been tested locally (further changes required to actually get model training fully without errors)
Also added tests back in with new SiteDatamodule.
TODO Testing