README.md benchmark dataset code #2069

douglasmacdonald · 2024-05-19T10:56:58Z

Issue

I need help getting the example code on the README.md to work. I am now concentrating on the Benchmark datasets (https://github.com/microsoft/torchgeo?tab=readme-ov-file#benchmark-datasets).

I am running on the Planetary Computer platform.

I did not have any luck with the platform's default torchgeo, so I ran

!pip install torchgeo --upgrade

And this gives me version '0.5.2'.

However, I am still having problems:

dataset = VHR10('data', download=True, checksum=True)

RuntimeError: The MD5 checksum of the download file data/NWPU VHR-10 dataset.rar does not match the one on record.

from torchgeo.datamodules.utils import collate_fn_detection

ImportError: cannot import name 'collate_fn_detection' from 'torchgeo.datamodules.utils' (/srv/conda/envs/notebook/lib/python3.11/site-packages/torchgeo/datamodules/utils.py)

Fix

I assume version problems.


adamjstewart · 2024-05-19T13:00:22Z

Hi @douglasmacdonald, sorry you ran into these issues!

> I did not have any luck with the platform's default torchgeo

@calebrob6 do we have any contacts we can use to upgrade the default torchgeo version on PC?

> RuntimeError: The MD5 checksum of the download file data/NWPU VHR-10 dataset.rar does not match the one on record.

I am not able to reproduce this. What version of torchvision are you using? TorchGeo uses torchvision download utils, and torchvision 0.17.1+ switched from requests to gdown for all Google Drive downloads. It may resolve the issue if you delete the file, upgrade to torchvision 0.17.1+, and install gdown.
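For anyone wanting to check a suspect download by hand before retrying, here is a minimal sketch using only the standard library. The helper names `md5_of_file` and `remove_if_mismatched` are made up for illustration and are not part of TorchGeo or torchvision. The expected hash is left as a parameter because the value TorchGeo records for this archive is not quoted in this thread.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def remove_if_mismatched(path, expected_md5):
    """Delete the file if its MD5 differs from expected_md5.

    Returns True if the file was deleted, False otherwise
    (including when the file does not exist).
    """
    path = Path(path)
    if path.exists() and md5_of_file(path) != expected_md5:
        path.unlink()
        return True
    return False
```

Deleting a mismatched file first matters because a stale partial download would otherwise be picked up again on the next attempt.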

> ImportError: cannot import name 'collate_fn_detection' from 'torchgeo.datamodules.utils'

This is indeed a version issue. The feature you are trying to use was added in #1082 and will be included in the 0.6.0 release.

My personal recommendation would be to pick a different dataset; VHR-10 is actually one of the more complicated ones. If you're completely new to PyTorch, you're better off starting with a torchvision tutorial. All TorchGeo NonGeoDatasets are designed to be functionally identical to torchvision datasets, so if you know how to use torchvision, you know how to use torchgeo. If you still want to use VHR-10, either install the development version (0.6.0.dev0) or wait for the 0.6.0 release (maybe in 1 month?).
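A quick way to tell in advance whether an environment is new enough for `collate_fn_detection` is to look at the installed version string. The sketch below is only an illustration: the function names are hypothetical, and the 0.6.0 cutoff is taken from the comment above. It compares the numeric release part only, so a development build such as 0.6.0.dev0 counts as new enough, which matches the advice in this thread but is not how PEP 440 orders pre-releases.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Extract the leading numeric release, e.g. '0.6.0.dev0' -> (0, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as '0.6' so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def has_collate_fn_detection(version):
    """True for 0.6.0 and later, including 0.6.0 development builds."""
    return release_tuple(version) >= (0, 6, 0)


def installed_version(package="torchgeo"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("torchgeo is not installed")
    else:
        print(found, "->", has_collate_fn_detection(found))
```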

douglasmacdonald · 2024-05-19T13:14:49Z

Hello,

One moment!

Could it have anything to do with using:

pip install torchgeo==0.5.2

Where I maybe should be using

pip install torchgeo[all]==0.5.2

?

Best,
Douglas

adamjstewart · 2024-05-19T13:30:45Z

VHR-10 requires 3 optional dependencies:

1. gdown: to download the dataset if using torchvision 0.17.1+
2. rarfile: to extract the .rar file
3. pycocotools: to read the labels for the 'positive' set

Running pip install torchgeo will not install any of these. Running pip install torchgeo[datasets] will install 2 and 3. Torchvision does not currently automatically install 1 for you, so you have to install it yourself.
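To see which of the three are missing from a given environment, something like the following could be run in a notebook cell. It is a rough sketch, not a TorchGeo utility: `missing_modules` is a made-up helper, and it only checks that the Python packages are importable, not that any external extraction tool is installed.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Import names of the three optional dependencies listed above.
VHR10_OPTIONAL = ["gdown", "rarfile", "pycocotools"]

if __name__ == "__main__":
    missing = missing_modules(VHR10_OPTIONAL)
    if missing:
        print("Missing optional dependencies:", ", ".join(missing))
    else:
        print("All VHR-10 optional dependencies are importable.")
```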

We may want to add 1 to [datasets]. If you want to submit a PR to do this, I would be happy to review it.

EDIT: I opened pytorch/vision#8430 to help better document this. With this, we could use torchvision in our required deps and torchvision[gdown] in our optional deps.

isaaccorley · 2024-05-19T14:49:19Z

Should we also change the example to use a different dataset like InriaAIL or EuroSAT?

adamjstewart · 2024-05-19T15:52:38Z

The example is fixed (it should work now on main), but I would be happy to change to a different dataset too. I only used that example because VHR-10 was the first dataset I wrote and it had a cool prediction plot.

robmarkcole · 2024-05-24T10:38:45Z

I've just attempted with torchgeo.__version__ == '0.6.0.dev0' and get a different error:

from torch.utils.data import DataLoader

from torchgeo.datamodules.utils import collate_fn_detection
from torchgeo.datasets import VHR10

# Initialize the dataset
dataset = VHR10(download=True)

---------------------------------------------------------------------------
NotRarFile                                Traceback (most recent call last)
Cell In[2], line 7
      4 from torchgeo.datasets import VHR10
      6 # Initialize the dataset
----> 7 dataset = VHR10(download=True)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/vhr10.py:218, in VHR10.__init__(self, root, split, transforms, download, checksum)
    215 self.checksum = checksum
    217 if download:
--> 218     self._download()
    220 if not self._check_integrity():
    221     raise DatasetNotFoundError(self)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/vhr10.py:343, in VHR10._download(self)
    340     return
    342 # Download images
--> 343 download_and_extract_archive(
    344     self.image_meta['url'],
    345     self.root,
    346     filename=self.image_meta['filename'],
    347     md5=self.image_meta['md5'] if self.checksum else None,
    348 )
    350 # Annotations only needed for "positive" image set
    351 if self.split == 'positive':
    352     # Download annotations

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/utils.py:145, in download_and_extract_archive(url, download_root, extract_root, filename, md5)
    143 archive = os.path.join(download_root, filename)
    144 print(f'Extracting {archive} to {extract_root}')
--> 145 extract_archive(archive, extract_root)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/utils.py:99, in extract_archive(src, dst)
     97 for suffix, extractor in suffix_and_extractor:
     98     if src.endswith(suffix):
---> 99         with extractor(src, 'r') as f:
    100             f.extractall(dst)
    101         return

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/utils.py:48, in _rarfile.RarFile.__enter__(self)
     45 rarfile = lazy_import('rarfile')
     46 # TODO: catch exception for when rarfile is installed but not
     47 # unrar/unar/bsdtar
---> 48 return rarfile.RarFile(*self.args, **self.kwargs)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/rarfile.py:711, in RarFile.__init__(self, file, mode, charset, info_callback, crc_check, errors, part_only)
    708 if mode != "r":
    709     raise NotImplementedError("RarFile supports only mode=r")
--> 711 self._parse()

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/rarfile.py:930, in RarFile._parse(self)
    928     self._file_parser = p5  # noqa
    929 else:
--> 930     raise NotRarFile("Not a RAR file")
    932 self._file_parser.parse()
    933 self.comment = self._file_parser.comment

NotRarFile: Not a RAR file

I get the same error from the command line:

⚡ ~/data unrar e NWPU\ VHR-10\ dataset.rar 

UNRAR 5.61 beta 1 freeware      Copyright (c) 1993-2018 Alexander Roshal

NWPU VHR-10 dataset.rar is not RAR archive
No files to extract

OK, it appears the file downloaded by torchgeo was somehow corrupted; if I manually download it via the browser there is no issue.
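A cheap way to confirm this kind of corruption without unrar is to look at the first few bytes of the file. The sketch below checks for the standard RAR 4.x and RAR 5.x signatures; the function names are invented for illustration. The HTML branch is an assumption about one common failure mode (an error or warning page saved in place of the archive), and nothing in this thread establishes that this is what happened here.

```python
# Standard signatures at the start of RAR 4.x and RAR 5.x archives.
RAR4_MAGIC = b"Rar!\x1a\x07\x00"
RAR5_MAGIC = b"Rar!\x1a\x07\x01\x00"


def looks_like_rar(path):
    """Return True if the file starts with a RAR 4.x or RAR 5.x signature."""
    with open(path, "rb") as f:
        header = f.read(len(RAR5_MAGIC))
    return header.startswith(RAR4_MAGIC) or header.startswith(RAR5_MAGIC)


def describe_download(path):
    """Roughly classify a downloaded file as 'rar', 'html', or 'unknown'."""
    if looks_like_rar(path):
        return "rar"
    with open(path, "rb") as f:
        head = f.read(512).lstrip().lower()
    # Assumption: a saved web page is one plausible cause of a bad download.
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

Self-extracting archives can carry the signature later in the file, so a False result here is a hint rather than proof.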

However, I then get "Dataset not found in root='data'", which I resolved by commenting out self._check_integrity.

This appears to be because there is a check (Checking integrity of data/NWPU VHR-10 dataset.rar) for the archive, which I had deleted.

With that check commented out the dataset loads fine; however, the image is not contrast stretched.

That is resolved using percentile_normalization, and I am now ready to go.

isaaccorley · 2024-05-24T13:15:48Z

@robmarkcole what version of torchvision are you using? You can turn off the integrity check by passing checksum=False to the dataset/datamodule.

robmarkcole · 2024-05-24T13:20:56Z

'0.6.0.dev0' via pip install git+https://github.com/microsoft/torchgeo.git@main#egg=torchgeo

Can confirm no issues using checksum=False so long as the data is downloaded without corruption and the rar is present.

There appears to be another issue, which I believe is due to collate_fn_detection not being applied. When using the dataset I get this error:

    252 """Compute the validation metrics.
    253
    254 Args:
   (...)
    257     dataloader_idx: Index of the current dataloader.
    258 """
    259 x = batch['image']
--> 260 batch_size = x.shape[0]
    261 y = [
    262     {'boxes': batch['boxes'][i], 'labels': batch['labels'][i]}
    263     for i in range(batch_size)
    264 ]
    265 y_hat = self(x)

AttributeError: 'list' object has no attribute 'shape'

Usage:

# Assumed import: the original snippet used `L` without showing where it came from.
import lightning as L

# DataLoader, collate_fn_detection, and `dataset` (the VHR10 instance) come from the cell above.


class VHR10DataModule(L.LightningDataModule):
    def __init__(self, data_dir: str = "", batch_size: int = 4, num_workers: int = 0):
        super().__init__()
        self.data_dir = data_dir
        self.batch_size = batch_size
        self.num_workers = num_workers

    def setup(self, stage: str):
        # Nothing to prepare here: the dataset was already created above.
        return

    def train_dataloader(self):
        return DataLoader(dataset, batch_size=self.batch_size, collate_fn=collate_fn_detection, num_workers=self.num_workers)

    def val_dataloader(self):
        return DataLoader(dataset, batch_size=self.batch_size, collate_fn=collate_fn_detection, num_workers=self.num_workers)

    def test_dataloader(self):
        return DataLoader(dataset, batch_size=self.batch_size, collate_fn=collate_fn_detection, num_workers=self.num_workers)

datamodule = VHR10DataModule(data_dir="data", batch_size=4, num_workers=0)
datamodule.setup("fit")

isaaccorley · 2024-05-24T13:41:07Z

The collate fn is being applied, but the trainer doesn't accept a list of images; it expects a single tensor. That is definitely a bug: each image in the VHR-10 dataset is a different size, so they can't be stacked properly.

robmarkcole · 2024-05-24T13:48:54Z

In this batch, the images all had different shapes - presume I just need to add a cropping augmentation?

torch.Size([3, 808, 958])
torch.Size([3, 806, 950])
torch.Size([3, 803, 889])
torch.Size([3, 732, 946])
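To make the mismatch concrete, here is a small pure-Python sketch using the four shapes above. None of these helpers are TorchGeo functions; they only illustrate why a batch of differently sized images ends up as a list rather than one stacked tensor, and what the largest crop that fits every image in this batch would be.

```python
def can_stack(shapes):
    """Images can be stacked into one batch only if all shapes are identical."""
    return len(set(shapes)) <= 1


def common_crop_size(shapes):
    """Largest (height, width) that fits inside every (C, H, W) image."""
    heights = [h for _, h, _ in shapes]
    widths = [w for _, _, w in shapes]
    return min(heights), min(widths)


def collate_as_lists(batch):
    """Collate a list of sample dicts into a dict of lists, without stacking."""
    keys = batch[0].keys()
    return {key: [sample[key] for sample in batch] for key in keys}


shapes = [(3, 808, 958), (3, 806, 950), (3, 803, 889), (3, 732, 946)]
print(can_stack(shapes))         # False
print(common_crop_size(shapes))  # (732, 889)

# A dict-of-lists batch has no `.shape` on batch["image"], which is the
# kind of object that triggers the AttributeError shown earlier.
batch = collate_as_lists([{"image": s, "labels": []} for s in shapes])
print(type(batch["image"]).__name__)  # list
```

Cropping or resizing every image to one common size before batching is what lets the images be stacked into a single tensor.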

isaaccorley · 2024-05-24T13:50:38Z

Yep, that's correct. I know that in the past Kornia had some bugs where augmentations were not properly applied to the boxes, but that appears to have been fixed.

robmarkcole · 2024-05-24T13:52:33Z

OK just noticed from torchgeo.datamodules import VHR10DataModule which handles this :-)

So in summary:

The rar file being corrupted was the cause of my issue
The plotting is off without percentile_normalization

isaaccorley · 2024-05-24T14:48:24Z

Oof I thought you were already using it or I would have suggested the datamodule. I'll take a look at fixes for this. Thanks for being an A+ test engineer!

adamjstewart · 2024-05-25T13:48:24Z

Trying to catch up on this thread...

> @calebrob6 do we have any contacts we can use to upgrade the default torchgeo version on PC?

Well, that didn't age well. Looks like PC will be shutting down, so no need to worry about this anymore.

> The rar file being corrupted was the cause of my issue

If we're seeing intermittent issues with GDrive, we could rehost the dataset on HF. It appears to be released under an MIT license.

> The plotting is off without percentile_normalization

I'm happy to submit a PR to fix this, but then no one will review it... Does anyone else want to submit a PR?

@ashnair1 was the last person to touch this dataset.

calebrob6 · 2024-05-25T13:51:08Z

PC hub (the free compute) is shutting down; the 50+ PB of data hosting, the APIs that let you index into it, the explorer for visualizing it, and the catalog are all unchanged AFAIK.

ashnair1 · 2024-05-28T11:55:32Z

Good catch regarding normalization. By default the images are uint8 and are loaded as floats. During training the images are normalized (in the datamodule) to a range of 0-1 before plotting, which is why the training plots look normal. However, when plotting samples via the method directly, the tensor has values that range from 0-255 in float dtype, which makes the plot incorrect.
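The two fixes mentioned in this thread (scaling to 0-1, or a percentile stretch) can be illustrated on a flat list of pixel values. This is a pure-Python sketch with hypothetical helper names; it is not TorchGeo's percentile_normalization, and the 2nd/98th percentile defaults are an assumption chosen for illustration.

```python
def to_unit_range(pixels, max_value=255.0):
    """Scale values from [0, max_value] to [0, 1], clamping anything outside."""
    return [min(max(p / max_value, 0.0), 1.0) for p in pixels]


def percentile(values, q):
    """Percentile with linear interpolation between neighbors (q in [0, 100])."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence")
    position = (len(ordered) - 1) * q / 100.0
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    fraction = position - low
    return ordered[low] * (1.0 - fraction) + ordered[high] * fraction


def percentile_stretch(pixels, lower=2.0, upper=98.0):
    """Map the [lower, upper] percentile range onto [0, 1], clamping the rest."""
    lo_value = percentile(pixels, lower)
    hi_value = percentile(pixels, upper)
    span = hi_value - lo_value
    if span == 0:
        # A constant image has no contrast to stretch.
        return [0.0 for _ in pixels]
    return [min(max((p - lo_value) / span, 0.0), 1.0) for p in pixels]


print(to_unit_range([0.0, 127.5, 255.0]))  # [0.0, 0.5, 1.0]
```

Either approach puts float values into the 0-1 range that plotting code typically expects for float images.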

adamjstewart · 2024-05-31T09:11:16Z

@burakekim is going to inquire about redistributing VHR-10 on Hugging Face, which will allow us to get rid of the Google Drive issues and remove dependencies on rarfile and gdown. Hopefully that will solve some of the issues you encountered!

I think we should also replace the README example with a simpler dataset like EuroSAT, which will finally close this issue.

adamjstewart · 2024-08-06T13:57:49Z

#2210

douglasmacdonald added the documentation Improvements or additions to documentation label May 19, 2024

adamjstewart added the datasets Geospatial or benchmark datasets label May 19, 2024

adamjstewart changed the title ~~README.md benchman dataset code~~ README.md benchmark dataset code May 19, 2024

adamjstewart added this to the 0.5.3 milestone May 25, 2024

adamjstewart added the good first issue A good issue for a new contributor to work on label May 25, 2024

robmarkcole mentioned this issue May 28, 2024

Update VHR-10 dataset plotting #2092

Merged

adamjstewart modified the milestones: 0.5.3, 0.6.0 Aug 6, 2024

adamjstewart removed this from the 0.6.0 milestone Aug 6, 2024
