I started a new project with the GraphCast example. I wanted to test it on the synthetic dataset before downloading the ERA5 data, but `loss.py` requires a metadata file from the ERA5 dataset, which I do not have.
I ran Modulus from the Docker image (24.07), installed the missing `mlflow` package with pip, set `num_samples_per_year_train` to 1 so the run fits in memory, and started training on the synthetic dataset with `python train_graphcast.py synthetic_dataset=true`.
Training then fails because the loss function tries to open the ERA5 metadata file (`/data/era5_75var/metadata/data.json`), which does not exist.
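The missing file can be detected before launching training. Below is a minimal sketch, assuming only what the traceback in this issue shows: the loss opens a single JSON file at the configured metadata path. The helper name `metadata_status` is hypothetical and not part of Modulus.

```python
import os


def metadata_status(metadata_path: str) -> str:
    """Report whether the metadata JSON that the loss will open is present.

    `metadata_path` stands in for the `dataset_metadata_path` value seen in
    the traceback; this helper is hypothetical and not part of Modulus.
    """
    if os.path.isfile(metadata_path):
        return "found"
    # Distinguish a missing file from a dataset directory that was never created.
    if os.path.isdir(os.path.dirname(metadata_path)):
        return "directory exists, file missing"
    return "directory missing"


if __name__ == "__main__":
    # Path copied from the FileNotFoundError reported in this issue.
    print(metadata_status("/data/era5_75var/metadata/data.json"))
```

Running this inside the container before `train_graphcast.py` shows whether the failure comes from a missing file or from a dataset directory that does not exist at all.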
### Minimum reproducible example
python train_graphcast.py synthetic_dataset=true
### Relevant log output
```shell
root@be1b9fafbee5:/data/codes/modulus/examples/weather/graphcast# python train_graphcast.py synthetic_dataset=true
/usr/local/lib/python3.10/dist-packages/modulus/distributed/manager.py:346: UserWarning: Could not initialize using ENV, SLURM or OPENMPI methods. Assuming this is a single process job
warn(
[11:10:46 - main - INFO] Rank: 0, Device: cuda:0
[11:10:46 - main - WARNING] Using Dummy dataset. Ignoring static dataset, cosine zenith angle, time of the year, and history. Also setting num_workers to 0.
[11:10:47 - main - INFO] Using torch.bfloat16 dtype
[11:10:47 - main - WARNING] Static dataset path is not provided. Setting num_channels_static to 0.
[11:10:57 - main - INFO] Model parameter count is 35296329
Generated synthetic temperature data in 4.07 seconds.
[11:11:02 - main - INFO] Loaded training datapipe of size 0
Error executing job with overrides: ['synthetic_dataset=true']
Traceback (most recent call last):
File "/data/codes/modulus/examples/weather/graphcast/train_graphcast.py", line 349, in main
trainer = GraphCastTrainer(cfg, dist, rank_zero_logger)
File "/data/codes/modulus/examples/weather/graphcast/train_graphcast.py", line 211, in __init__
self.criterion = GraphCastLossFunction(
File "/usr/local/lib/python3.10/dist-packages/modulus/utils/graphcast/loss.py", line 129, in __init__
self.channel_dict = self.get_channel_dict(dataset_metadata_path, channels_list)
File "/usr/local/lib/python3.10/dist-packages/modulus/utils/graphcast/loss.py", line 173, in get_channel_dict
with open(dataset_metadata_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/era5_75var/metadata/data.json'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```
### Environment details
docker run --gpus all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --runtime nvidia --rm -it nvcr.io/nvidia/modulus/modulus:xx.xx bash
pip install mlflow
Hi, I downloaded the dataset using modulus/examples/weather/dataset_download, which includes a metadata.json file. I am now trying to run the code with that file, but I don't have a working test yet, so I can't confirm that it is the right one.
Also, I'm not sharing the file because I'm unsure about the copyright terms of the dataset.
### Version
0.7.0
### On which installation method(s) does this occur?
Docker