RasterDataset SITS support #2308

Draft. Wants to merge 13 commits into base: main.

Conversation

sfalkena
Contributor

First steps towards Satellite Image Time Series (SITS) support in TorchGeo (following #640). So far I have started with RasterDataset, but I think my approach can be applied to other GeoDatasets too. I have added a sits_dataset notebook, so feel free to run and play with the example I prepared.

Notes:

  • This approach expects filename_glob to be generic enough to match all bands when separate_files=True, so that all files end up in the index. This leads to the same sampler issues as in "GeoDataset: ignore other bands for separate files" (#2222). If we agree on the general setup of this PR, I will create a separate PR for those sampler fixes (I already have them internally).
  • Currently, shape [c,h,w] is returned when return_as_ts=False, and shape [t,c,h,w] is returned when return_as_ts=True. I would personally prefer to move to [t,c,h,w] as the standard, setting t=1 for non-time-series data.
  • In addition to the SITS, __getitem__ also returns the individual dates as an array, so that any model that needs the dates of a sample can use them.
  • Currently, for time series the samplers take mint and maxt from the sampler index, so the temporal extent of the ROI can also limit the temporal extent of the sample. By smartly combining multiple ROIs, I think we could then build time series for every combination of dates.
  • IntersectionDataset and UnionDataset derive their return_as_ts property from the return_as_ts properties of their input datasets.
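
To make the shape and date conventions in the notes above concrete, here is a minimal sketch using NumPy only. It is not the code from this PR: `build_sample` is a hypothetical helper, the dictionary keys `image` and `dates` are assumed names, and picking the most recent acquisition when return_as_ts=False is only a stand-in for however the dataset actually merges files.

```python
from datetime import datetime

import numpy as np


def build_sample(frames, dates, mint, maxt, return_as_ts=True):
    """Hypothetical helper (not TorchGeo API) illustrating the conventions.

    frames: list of arrays, each shaped [c, h, w], one per acquisition
    dates:  list of datetime objects, aligned with ``frames``
    mint, maxt: temporal bounds of the query, e.g. taken from the sampler
    """
    # Keep only acquisitions inside the time window, ordered by date.
    selected = sorted(
        ((d, f) for d, f in zip(dates, frames) if mint <= d <= maxt),
        key=lambda pair: pair[0],
    )
    if not selected:
        raise ValueError("no acquisitions inside the requested time window")

    kept_dates = [d for d, _ in selected]
    stack = np.stack([f for _, f in selected])  # shape [t, c, h, w]

    if return_as_ts:
        # Time-series mode: return the full stack plus the per-step dates.
        return {
            "image": stack,
            "dates": np.array(kept_dates, dtype="datetime64[s]"),
        }

    # Non-time-series mode: collapse the time axis to [c, h, w].
    # Assumption: use the most recent acquisition as a placeholder.
    return {"image": stack[-1]}


# Three synthetic acquisitions with 3 bands on a 4x4 grid.
frames = [np.full((3, 4, 4), i, dtype=np.float32) for i in range(3)]
dates = [datetime(2024, 1, 1), datetime(2024, 2, 1), datetime(2024, 3, 1)]
window = (datetime(2024, 1, 15), datetime(2024, 12, 31))

ts = build_sample(frames, dates, *window, return_as_ts=True)
single = build_sample(frames, dates, *window, return_as_ts=False)

print(ts["image"].shape)      # (2, 3, 4, 4): two dates fall in the window
print(single["image"].shape)  # (3, 4, 4)
```

Under the proposed [t,c,h,w] standard, the non-time-series case would instead return `stack[-1][None]`, i.e. shape [1,c,h,w], so downstream code never has to branch on the number of dimensions.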

I have not written any unit tests yet; I will do so once we agree on the best structure.

@github-actions (bot) added the documentation, datasets, testing, and samplers labels and removed the testing label on Sep 20, 2024
@sfalkena
Contributor Author

Explicitly asking for feedback from @nilsleh, @adamjstewart, and @hfangcat.

@github-actions (bot) added the testing and datamodules labels on Sep 23, 2024
@adamjstewart
Collaborator

I need to find time to sit down and brainstorm this more. May include you and a few others in the discussion. I think the changes proposed here are necessary for what I'm envisioning, but I want to finalize the rest of the picture before I make you do more work. I appreciate your eagerness to work on this, but don't want you to waste too much time on one particular direction before we make a decision.

@nilsleh
Collaborator

nilsleh commented Oct 1, 2024

@sfalkena Thanks for the contribution and for jumping into the effort on time-series support. I am not sure whether you have already seen #877, where we took a stab at this a while ago (it has unfortunately been stale for a while). But I'd be happy to jump in again and brainstorm a good approach for time-series support. In that PR we continued the discussion a bit about the different use cases, modalities, and modeling setups worth considering, so I'd be interested in what you think about those, as possible further ideas to consider.

Labels: datamodules, datasets, documentation, samplers, testing