Update chgres_cube to process RRFS GRIB2 data #660

Open
GeorgeGayno-NOAA opened this issue Jun 23, 2022 · 19 comments

@GeorgeGayno-NOAA
Collaborator

This data uses GRIB2 grid definition template (GDT) number "1", which is not recognized by chgres. Other code updates may also be required.

GeorgeGayno-NOAA added the enhancement (New feature or request) label Jun 23, 2022
GeorgeGayno-NOAA self-assigned this Jun 23, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jun 23, 2022
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jun 24, 2022
used. That is the convention expected by the iplib.

Fixes ufs-community#660.
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jun 27, 2022
@GeorgeGayno-NOAA
Collaborator Author

Test case on Hera: /scratch1/NCEPDEV/da/George.Gayno/noscrub/eric.rrfs

Work on hold until G2 library issue is addressed - NOAA-EMC/NCEPLIBS-g2#36

@kgerheiser
Contributor

Thanks for the test case

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jun 29, 2022
@GeorgeGayno-NOAA
Collaborator Author

I added a print of the number of records read by the inventory loop. See 7185ed5. The test case only reads 389 records out of more than 1000. It stops exactly at the 2GB mark.

@GeorgeGayno-NOAA
Collaborator Author

@LarissaReames-NOAA the branch I worked from is a year old. (Fortunately, I did not delete it.) Merging from 'develop' might result in tons of conflicts. I don't know if it would be better to start with a new branch, then manually add the updates from my old branch.

https://github.com/GeorgeGayno-NOAA/UFS_UTILS/tree/feature/chgres_rrfs

Let me try a merge.

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Sep 5, 2023
@GeorgeGayno-NOAA
Collaborator Author

The merge was easier than I thought. I had only changed model_grid.F90. The RRFS data is rotated lat/lon, but uses the official WMO GDT. The official template requires unrotated corner points. Our gdswzd routine requires rotated corner points. So I added logic to do the rotation. I then stopped work because I hit the 2GB file limit.
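For readers unfamiliar with the corner-point conversion mentioned above, here is a minimal sketch of a generic geographic-to-rotated lat/lon transform. It is not the logic in model_grid.F90; the function names and the centre point (55N, 106W) used in the usage note are hypothetical, and it assumes the rotated grid is defined by the geographic location of its origin with no additional angle of rotation.

```python
import math


def to_rotated(lat, lon, lat0, lon0):
    """Convert geographic (lat, lon) in degrees to rotated coordinates.

    (lat0, lon0) is the geographic position of the rotated grid's origin,
    i.e. the point that maps to rotated (0, 0). Hypothetical helper.
    """
    phi = math.radians(lat)
    dlam = math.radians(lon - lon0)
    phi0 = math.radians(lat0)
    # Unit vector in a frame whose x-axis points at longitude lon0.
    x = math.cos(phi) * math.cos(dlam)
    y = math.cos(phi) * math.sin(dlam)
    z = math.sin(phi)
    # Rotate about the y-axis so that (lat0, lon0) lands on the equator.
    xr = math.cos(phi0) * x + math.sin(phi0) * z
    zr = -math.sin(phi0) * x + math.cos(phi0) * z
    zr = max(-1.0, min(1.0, zr))  # guard asin against round-off
    return math.degrees(math.asin(zr)), math.degrees(math.atan2(y, xr))


def from_rotated(rlat, rlon, lat0, lon0):
    """Inverse of to_rotated: rotated coordinates back to geographic."""
    phir = math.radians(rlat)
    lamr = math.radians(rlon)
    phi0 = math.radians(lat0)
    xr = math.cos(phir) * math.cos(lamr)
    yr = math.cos(phir) * math.sin(lamr)
    zr = math.sin(phir)
    # Undo the rotation about the y-axis.
    x = math.cos(phi0) * xr - math.sin(phi0) * zr
    z = math.sin(phi0) * xr + math.cos(phi0) * zr
    z = max(-1.0, min(1.0, z))
    lat = math.degrees(math.asin(z))
    lon = lon0 + math.degrees(math.atan2(yr, x))
    # Normalise longitude to [-180, 180).
    lon = (lon + 180.0) % 360.0 - 180.0
    return lat, lon
```

With these helpers, an unrotated corner point read from the grid definition could be converted with `to_rotated(lat, lon, 55.0, -106.0)` before being handed to a routine that expects rotated corners, and `from_rotated` gives the way back.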

@LarissaReames-NOAA
Collaborator

Sounds good. I'll start from your branch and use this Issue # in my commits.

@LarissaReames-NOAA
Collaborator

LarissaReames-NOAA commented Sep 6, 2023

@GeorgeGayno-NOAA I'm running into an issue with G2 with the RRFS_A natlev and prslev files that I think may be related to file size. It gives me an Error 99 "Request not found." when attempting to read surface pressure. I've checked with both grib_dump and ncl_filedump: surface pressure is indeed in the file and appears to have the same GRIB2 parameters that are passed to getgb2 near line 2636 in atm_input_data.F90. I also compared that entry from grib_dump with a HRRR GRIB2 file that we use for the regression tests, and all of the relevant entries are identical. Looking more closely at the output, in both cases (nat and pres) it is only able to parse the upper ~15 vertical levels, which suggests to me that it's not generating a complete index of the file internally.

Update: I tried the f000 CONUS file that's available on AWS and that works fine. It's <1GB in total size.

@GeorgeGayno-NOAA
Collaborator Author

What version of the G2 library are you using?

@LarissaReames-NOAA
Collaborator

I'm working on Jet using the standard module file, which loads G2 3.4.5.

@GeorgeGayno-NOAA
Collaborator Author

Ok. That explains it. The G2 library fix was just tagged yesterday: https://github.com/NOAA-EMC/NCEPLIBS-g2/releases/tag/v3.4.7

The libraries team said they would install it 'soon'. You can try to clone and compile it in your own space, then adjust the ufs_utils build module to point to it.

@LarissaReames-NOAA
Collaborator

LarissaReames-NOAA commented Sep 6, 2023

I installed 3.4.7 locally and double-checked that it was linked correctly during install, but the problem remains. Parsing the file stops at 13 hybrid levels. Do I have to do anything special to use the new large-file capability? That wasn't clear from the release notes.

Also, if it helps at all, I'm getting several messages in the chgres_cube output that look like:
SAGT 0 0 5

From looking at the g2 code this supposedly indicates that it's finding an unknown grib section, but it doesn't seem to be a fatal error?

@edwardhartnett Could you provide any advice here?

@GeorgeGayno-NOAA
Collaborator Author

@LarissaReames-NOAA do those odd messages begin at a certain record number and does that record correspond to the 2GB point?

@LarissaReames-NOAA
Collaborator

LarissaReames-NOAA commented Sep 7, 2023

The first place they appear is in model_grid when the grib2 file is first opened and the first record is checked for the grid template definition:
` - OPEN AND READ INPUT DATA GRIB2 FILE:
/lfs4/NAGAPE/hpc-wof1/lreames/chgres_cube/reg_tests/input_data/rrfs.grib2/rrfs.
t00z.natlev.f000.grib2
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5
SAGT 0 0 5

 - INPUT DATA ON ROTATED LAT/LON GRID.
 - INPUT DATA ON ROTATED LAT/LON GRID.
 - INPUT DATA ON ROTATED LAT/LON GRID.`

The second time is in atm_input_data.F90, during the check for the product definition number:
` - READ ATMOS DATA FROM GRIB2 FILE:
/lfs4/NAGAPE/hpc-wof1/lreames/chgres_cube/reg_tests/input_data/rrfs.grib2/rrfs.
t00z.natlev.f000.grib2
SAGT 0 0 5

 - DATA IS ON HYBRID LEVELS.`

So it doesn't actually show up in the loop over all entries that counts hybrid levels. It looks like it shows up when jpdtn=-1, i.e. when entries with any product definition template number are searched. I'm not sure it has much bearing on the actual issue.

A total of 296 records are read in atm_input_data.F90. How can I tell whether that's the 2GB point? It's certainly close to the number of entries Kyle was encountering when his read was ending at 2GB.
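One way to check this independently of the G2 library is to walk the file and add up message lengths. The sketch below relies on the GRIB edition 2 layout, where Section 0 is 16 bytes and octets 9-16 hold the total message length as a big-endian 8-byte integer. The function name is hypothetical, and it assumes messages are stored back to back with no padding between them. Note that one GRIB2 message can hold more than one field, so the message index it reports may not match a record count from getgb2 exactly.

```python
import struct

TWO_GB = 2**31  # byte count at which a signed 32-bit offset runs out


def first_message_past(path, limit=TWO_GB):
    """Return (index, start_offset) of the first GRIB2 message that
    extends beyond `limit` bytes, or None if no message does.

    Index is 1-based. Hypothetical helper, not part of any library.
    """
    offset = 0
    index = 0
    with open(path, "rb") as f:
        while True:
            f.seek(offset)
            header = f.read(16)  # Section 0 (indicator section)
            if len(header) < 16:
                return None  # reached end of file
            if header[:4] != b"GRIB" or header[7] != 2:
                raise ValueError(f"no GRIB2 message at byte {offset}")
            # Octets 9-16: total message length, big-endian unsigned.
            (length,) = struct.unpack(">Q", header[8:16])
            if length < 16:
                raise ValueError(f"bad message length at byte {offset}")
            index += 1
            if offset + length > limit:
                return index, offset
            offset += length
```

Calling `first_message_past("rrfs.t00z.natlev.f000.grib2")` on a local copy of the file would give the message number to compare against the count printed by chgres_cube.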

@GeorgeGayno-NOAA
Collaborator Author

I did my own independent test using G2 v3.4.7 (https://github.com/NOAA-EMC/NCEPLIBS-g2/releases/tag/v3.4.7) and got the same chgres_cube error:

max, min U   -24.9200000762939        49.4799995422363
 max, min V   -43.0999984741211        58.0000000000000
 - CALL FieldScatter FOR INPUT U-WIND.
 - CALL FieldScatter FOR INPUT V-WIND.
 - READ SURFACE PRESSURE.
 - FATAL ERROR: READING SURFACE PRESSURE RECORD.
 - IOSTAT IS:           99

Will contact the library team.

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Sep 8, 2023
@edwardhartnett
Collaborator

Is this a new error with 3.4.7? Or is this something that never worked?

@GeorgeGayno-NOAA
Collaborator Author

Some programs, such as chgres_cube, use routine getgb2 to read grib data.

https://github.com/NOAA-EMC/NCEPLIBS-g2/blob/develop/src/getgb2.F90#L47

Based on the value of the LUBI argument, that routine will either read an existing index file, create the index file, or force a regeneration of the index file.

@LarissaReames-NOAA
Collaborator

In other words, it's never worked for files > 2GB. However, RRFS North American files are the first we've dealt with that are that large, so we've been able to get by without that capability until now.

@edwardhartnett
Collaborator

OK, to resolve this we're going to have to introduce a new index format, which can handle > 2 GB files. So that will take a little work.
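As a rough illustration of why a file format can stop at 2 GB: the thread does not say how the current index stores byte offsets, but if they were held in 4-byte signed integers (an assumption here, not something confirmed above), the largest representable offset would be 2147483647. The helper name below is hypothetical.

```python
import struct

INT32_MAX = 2**31 - 1  # 2147483647, one byte short of 2 GiB


def fits_in_int32(offset):
    """Return True if `offset` can be packed as a 4-byte signed integer."""
    try:
        struct.pack(">i", offset)
        return True
    except struct.error:
        return False


# An 8-byte field has no trouble with the same value.
wide = struct.pack(">q", INT32_MAX + 1)
```

Under that assumption, any message starting past byte 2147483647 could not be recorded in the index, which would be consistent with reads stopping at the 2GB mark.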

@JacobCarley-NOAA

Chiming in to say this is a needed capability for both RRFSv1 (we need it to make ICs to drive the on-demand FireWx nest) and for the 3DRTMA as well. I'm glad to see this is picking up steam again!
