Experience with recompiling global model on differing OS #2256

erinaj16 · 2024-04-29T15:28:36Z

erinaj16
Apr 29, 2024

I am trying to diagnose some issues I am running into when testing the same version of the global model on a different operating system. For those who have tested the same version of the global model on a different OS (such as during the recent transition from CentOS7 to Rocky8 on Hera and Jet), or on the same system with different compilers: do you get different model output? If so, how large a percentage difference do you get for a 6 h forecast in a specific variable (e.g., wind speed near jet level)? Any prior experience related to this matter would be helpful!

gspetro-NOAA · 2024-05-01T13:19:58Z

gspetro-NOAA
May 1, 2024
Maintainer

Hi @erinaj16 ,

I just reached out to some of our engineers who worked through the CentOS-->Rocky8 transition, and I'm hoping one of them will get back to you shortly.

Best,
Gillian Petro | EPIC User Support

0 replies

RatkoVasic-NOAA · 2024-05-01T14:58:17Z

RatkoVasic-NOAA
May 1, 2024
Collaborator

@erinaj16 we have done that kind of testing many times with the operational model. Each time we got a new machine (roughly every 5 years), we would run a test to see how different the results were (since they were almost never bit-identical). For the test, we would apply a small perturbation to one or more fields in the initial conditions (in the sixth significant digit, using the Mersenne Twister method) and look at the differences (RMSE) over a period of 5 days. That would be the control (the effect of small random differences). The next run would be on the new machine, and we would make the same comparison (RMSE) against the control. Usually those differences were small (of course, growing with forecast length).
The last transition (Hera and Jet) was the first time we transitioned on the same hardware but to a different OS. The results were surprising to me: with the Intel compiler we even got bit-identical results, while with GNU we got small differences (as expected). This time (since it is not an operational machine), I personally didn't perform the test described above.
Now, as for the differences to expect after 6 hours: if the only change was the OS and nothing else, you shouldn't get any significant differences. For example, the difference between temperature fields should look "spotty", with values up to 0.1K or 0.01K. If the differences are bigger and the patterns are more pronounced, then they are not due to the OS change; something else must be different.
I'm attaching a difference plot after 5 days from a test we did some time ago.
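
A minimal sketch of the perturbation test described above, assuming the field is available as a flat list of values. The function names, the relative amplitude of 1e-5 (roughly the sixth significant digit), and the sample values are illustrative assumptions, not the operational procedure. Python's standard `random` module is used because its generator is a Mersenne Twister.

```python
import math
import random


def perturb_sixth_digit(values, seed=0, rel_amplitude=1e-5):
    """Return a copy of `values` with a small random relative perturbation.

    rel_amplitude=1e-5 is an assumed stand-in for "sixth significant digit".
    random.Random is CPython's Mersenne Twister (MT19937) generator.
    """
    rng = random.Random(seed)
    return [v * (1.0 + rel_amplitude * rng.uniform(-1.0, 1.0)) for v in values]


def rmse(a, b):
    """Root-mean-square error between two equally sized sequences."""
    if len(a) != len(b):
        raise ValueError("fields must have the same number of points")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical temperature values in K, standing in for an initial-condition field.
control_ic = [250.0, 260.0, 270.0, 280.0]
perturbed_ic = perturb_sixth_digit(control_ic, seed=42)

# In the real test both sets of initial conditions would be run forward for
# 5 days, and rmse() applied to the forecasts at each output time. Here it is
# applied to the initial conditions only, to show the size of the perturbation.
print(rmse(control_ic, perturbed_ic))
```

The RMSE between the control and perturbed forecasts on one machine gives a noise baseline. An RMSE of similar size between the old and new machine would then suggest that the cross-machine differences are no larger than that noise.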

0 replies

erinaj16 · 2024-05-01T19:15:00Z

erinaj16
May 1, 2024
Author

@RatkoVasic-NOAA Thank you for your insight! I am doing research utilizing the global model on Jet and needed to recompile the version of UFS as well as the versions of the dependency libraries I am using with both the new OS and a new compiler/MPI (from intel/18.0.5.274 and impi/2018.4.274 to intel/2022.1.2 and impi/2022.1.2). I am getting differences in temperature almost an order of magnitude larger (differences up to about 0.8K instead of 0.1K or lower) than what you are describing at 6h, and I am even getting differences at the 0h time. Could differences in compiler version and/or MPI version be leading to differences of this magnitude?

0 replies

RatkoVasic-NOAA · 2024-05-01T19:50:25Z

RatkoVasic-NOAA
May 1, 2024
Collaborator

Differences shouldn't be present at 00h for dynamics fields.
For physics fields, yes, because although it says 00hrs, the output is actually taken after one time step of physics. Otherwise many 2D fields would be zero.
So if you see differences at zero hour in upper-air fields (like temperature or wind), and you are sure you are using exactly the same initial conditions, then I'd suspect the preprocessing (chgres or something like that). The problem is that you don't have access to the old OS to actually test it and confirm where the problem is.

0 replies

erinaj16 · 2024-05-02T16:12:54Z

erinaj16
May 2, 2024
Author

In my test, I am not running chgres or any preprocessing. I am running a “warm start” using the exact same tiled initial conditions (and GSI-based increment files for IAU) created on the CentOS system, and then post-processing (which by itself does not have reproducibility issues). Below are figures with the percentage difference at 0h and 120h for 250 hPa winds between the two systems. Is there a certain component of the initialization within the model that could be causing differences in these locations?

0 replies

RatkoVasic-NOAA · 2024-05-02T16:56:08Z

RatkoVasic-NOAA
May 2, 2024
Collaborator

@erinaj16 I see. There is a difference in the 00hr field, and it shouldn't be there even if you have a different machine, compiler, OS, physics scheme... So the answer to your question is no: as far as I know, there's nothing in the model that might cause this difference.
BTW, percentage may not be a good measure, since it gives more weight to small values. The best measure would be either the plain difference or RMSE, but in this case it wouldn't change anything.
Since you cannot go back and run it on the old system, you can try plotting a certain level of wind or temperature from the initial condition file, and the same level and variable from the 00hr output. My guess is that they would be identical.
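
To illustrate the point about percentage differences, here is a small sketch with made-up wind speeds. The values and the helper name `percent_difference` are hypothetical and chosen only to show the effect.

```python
def percent_difference(new, old):
    """Percentage difference of `new` relative to `old`."""
    return 100.0 * (new - old) / old


# Hypothetical wind speeds in m/s: the same 0.1 m/s change at a calm point
# and at a jet-level point.
calm_old, calm_new = 0.5, 0.6
jet_old, jet_new = 60.0, 60.1

for name, old, new in [("calm", calm_old, calm_new), ("jet", jet_old, jet_new)]:
    plain = new - old
    pct = percent_difference(new, old)
    print(f"{name}: plain = {plain:.2f} m/s, percent = {pct:.2f}%")
```

The same 0.1 m/s difference is about 20% at the calm point and under 0.2% at the jet-level point, so a percentage map highlights regions of weak wind rather than regions of large error.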

0 replies

erinaj16 · 2024-05-09T22:01:41Z

erinaj16
May 9, 2024
Author

@RatkoVasic-NOAA I have been looking into the issue a bit more. Because I am using 4DIAU, the "0h" plots from before are actually at 0h in reference to the central time for IAU, i.e., 3 h into model integration. Below is an updated plot showing plain differences at this 3h model integration time for U wind at 250 hPa. Given that it isn't true 0 h, are these differences more in line with what can be seen with a change in compiler (from intel/18.0.5.274 to intel/2022.1.2), or do they still seem too large from what you have seen in your experience?
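
A small sketch of the bookkeeping implied here, assuming the 3 h offset between the output label and the model integration time applies uniformly to every output time. The helper name is hypothetical.

```python
IAU_OFFSET_HOURS = 3  # from the post above: label "0h" is 3 h into integration


def integration_hours(label_hours, iau_offset_hours=IAU_OFFSET_HOURS):
    """Convert an output label (hours after the IAU central time)
    to hours of model integration."""
    return label_hours + iau_offset_hours


for label in (0, 6, 120):
    print(f"label {label}h -> {integration_hours(label)} h of integration")
```

Under this assumption, a comparison labelled 6h reflects 9 h of integration, which matters when judging it against expectations for a true 6 h forecast.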

1 reply

RatkoVasic-NOAA May 10, 2024
Collaborator

Looking into this plot, I'd say it is acceptable.

Answer selected by erinaj16

gspetro-NOAA · 2024-05-21T17:27:58Z

gspetro-NOAA
May 21, 2024
Maintainer

@erinaj16 It looks like Ratko was able to answer your questions. Do you have any remaining questions, or can we close the discussion? (You can also feel free to mark it as answered!)

1 reply

erinaj16 May 21, 2024
Author

I don't have any additional questions right now. Thanks!
