🐛[BUG]: GraphCast example with synthetic dataset requires ERA5 metadata #630

Open
negedng opened this issue Aug 6, 2024 · 4 comments
Labels
? - Needs Triage (Need team to review and classify), bug (Something isn't working)

negedng commented Aug 6, 2024

Version

0.7.0

On which installation method(s) does this occur?

Docker

Describe the issue

Hi,

I started a new project with the GraphCast example. I wanted to test it on the synthetic dataset before downloading the ERA5 data, but it turns out that loss.py requires an ERA5 metadata file, which is missing.

I installed Modulus from the Docker image (24.07), installed the missing mlflow package with pip, set num_samples_per_year_train to 1 so the run fits into memory, and started training with the synthetic dataset: python train_graphcast.py synthetic_dataset=true.

It asks for metadata from the ERA5 dataset.
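
A minimal pre-flight sketch for checking this before a full training run is launched. It only tests whether the file exists; the helper name `metadata_status` is hypothetical and not part of Modulus, and the path is the one reported in the FileNotFoundError in the log output.

```python
from pathlib import Path


def metadata_status(metadata_path: str) -> str:
    """Report whether the metadata file exists, without touching the dataset."""
    path = Path(metadata_path)
    if path.is_file():
        return f"found: {path}"
    return f"missing: {path}"


if __name__ == "__main__":
    # Path copied from the FileNotFoundError in the log output (assumed default).
    print(metadata_status("/data/era5_75var/metadata/data.json"))
```

If this reports the file as missing, the loss function will presumably fail in the same way regardless of the synthetic_dataset setting.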

Minimum reproducible example

python train_graphcast.py synthetic_dataset=true


Relevant log output

```shell
root@be1b9fafbee5:/data/codes/modulus/examples/weather/graphcast# python train_graphcast.py synthetic_dataset=true
/usr/local/lib/python3.10/dist-packages/modulus/distributed/manager.py:346: UserWarning: Could not initialize using ENV, SLURM or OPENMPI methods. Assuming this is a single process job
  warn(
[11:10:46 - main - INFO] Rank: 0, Device: cuda:0
[11:10:46 - main - WARNING] Using Dummy dataset. Ignoring static dataset, cosine zenith angle,                                time of the year, and history. Also setting num_workers to 0.
[11:10:47 - main - INFO] Using torch.bfloat16 dtype
[11:10:47 - main - WARNING] Static dataset path is not provided. Setting num_channels_static to 0.
[11:10:57 - main - INFO] Model parameter count is 35296329
Generated synthetic temperature data in 4.07 seconds.
[11:11:02 - main - INFO] Loaded training datapipe of size 0
Error executing job with overrides: ['synthetic_dataset=true']
Traceback (most recent call last):
  File "/data/codes/modulus/examples/weather/graphcast/train_graphcast.py", line 349, in main
    trainer = GraphCastTrainer(cfg, dist, rank_zero_logger)
  File "/data/codes/modulus/examples/weather/graphcast/train_graphcast.py", line 211, in __init__
    self.criterion = GraphCastLossFunction(
  File "/usr/local/lib/python3.10/dist-packages/modulus/utils/graphcast/loss.py", line 129, in __init__
    self.channel_dict = self.get_channel_dict(dataset_metadata_path, channels_list)
  File "/usr/local/lib/python3.10/dist-packages/modulus/utils/graphcast/loss.py", line 173, in get_channel_dict
    with open(dataset_metadata_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/era5_75var/metadata/data.json'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```

Environment details

docker run --gpus all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --runtime nvidia --rm -it nvcr.io/nvidia/modulus/modulus:xx.xx bash

pip install mlflow

negedng added the ? - Needs Triage and bug labels Aug 6, 2024
mnabian self-assigned this Aug 6, 2024

wlu1998 commented Sep 12, 2024

May I ask if you have found this file?

negedng commented Sep 12, 2024

Hi, I've downloaded the dataset using modulus/examples/weather/dataset_download, which has a metadata.json in it. Right now I am trying to run the code with that file, but I still don't have a working test, so I don't know if it's the right one.

Also, I'm not sharing the file because I'm unsure about the copyright terms of the dataset.
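
One way to compare a candidate file against what the loss function reads is to look at its top-level structure without sharing its contents. The sketch below is only an illustration: `top_level_keys` is a hypothetical helper, and which keys loss.py actually expects is not stated in this thread.

```python
import json
from pathlib import Path


def top_level_keys(json_path: str) -> list[str]:
    """Return the sorted top-level keys of a JSON object stored in a file."""
    with Path(json_path).open("r", encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        # A list or scalar at the top level has no keys to compare.
        raise ValueError(
            f"expected a JSON object at top level, got {type(data).__name__}"
        )
    return sorted(data)
```

Calling it on the downloaded metadata.json lists the keys that file provides, which can then be checked against the lookups in get_channel_dict.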

wlu1998 commented Sep 12, 2024

> Hi, I've downloaded the dataset using the modulus/examples/weather/dataset_download that has a metadata.json in it, right now I am trying to run the code with that one, but I still don't have a working test so I don't know if that's the right file.
>
> Also, I'm not sharing the file because I'm unsure about the copyright terms of the dataset.

OK! Thank you!!

negedng commented Oct 6, 2024

Hmm, no, this file doesn't work for me :(
