-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MPI communicator generalizations #1277
base: develop
Are you sure you want to change the base?
Conversation
Hi @jmsexton03, thank you for submitting this fix to #1261. I've started the regression testing for it. Just a note that @JessicaMeixner-NOAA is on travel this week, and we'd also like to allow time for trusted repo managers to do their own tests, so please allow some extra processing time for that. Early next week is a good tentative time to hear back. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few comments for consideration now. Additional testing and further review is likely necessary as I have only been able to have a quick look, but wanted to provide feedback as soon as possible.
@@ -282,7 +282,7 @@ | |||
}, | |||
{ | |||
"name": "NL2", | |||
"build_files": ["w3snl2md.F90", "mod_xnl4v5.f90", "serv_xnl4v5.f90", "mod_fileio.f90", "mod_constants.f90"], | |||
"build_files": ["w3snl2md.F90", "mod_xnl4v5.f90", "serv_xnl4v5.f90", "mod_fileio.f90", "mod_constants.f90", "mod_mpicomm.f90"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see a file mod_mpicomm.f90.
Also I think mpicomm.F90 should be adde with the MPI switch not NL2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can revert this change, I was initially confused about whether mod files were separately needed
(I will wait on this fix for further clarity on whether a separate module is needed)
model/src/mpicomm.F90
Outdated
!> @author J. M. Sexton @date 19-Jul-2024 | ||
!> | ||
! | ||
#ifndef ENDIANNESS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you help me understand why this is added here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file was modelled off of constants, this ifdef is probably not needed.
model/src/w3initmd.F90
Outdated
@@ -410,6 +410,7 @@ SUBROUTINE W3INIT ( IMOD, IsMulti, FEXT, MDS, MTRACE, ODAT, FLGRD, FLGR2, FLGD, | |||
UA, UD, U10, U10D, AS | |||
#ifdef W3_MPI | |||
USE W3ADATMD, ONLY: MPI_COMM_WAVE, MPI_COMM_WCMP | |||
USE MPICOMM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all of these USE MPICOM, if we can add the "only" statement that would be nice. I know it's not uniformly done in WW3 right now, but since this is an addition, it would be good to add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll push a commit to make these all have an only statement
regtests/bin/matrix_cmake_datarmor
Outdated
@@ -46,7 +46,7 @@ fi | |||
#convert main_dir to absolute path | |||
main_dir="`cd $main_dir 1>/dev/null 2>&1 && pwd`" | |||
|
|||
cmplr=datarmor_intel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mickaelaccensi is this change okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was for testing only, I will push a commit reverting it, or if you want to do "Applying suggestions" through github I can also do that
model/src/mpicomm.F90
Outdated
!/ ------------------------------------------------------------------- / | ||
!/ | ||
! | ||
!#ifdef W3_MPI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please clean up this commented out code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you referring to the ifdef W3_MPI here? I was unsure if I followed the header convention correctly
@@ -3,6 +3,7 @@ set(c_src w3getmem.c) | |||
# Core files always built | |||
set(ftn_src | |||
constants.F90 | |||
mpicomm.F90 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear to me if this should only be included if you have the MPI switch and therefore be added in model/src/cmake/switches.json instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try to include ifdefs to make mpicomm.F90 not affect anything if it was compiled without MPI, but I can try to change this to model/src/cmake/switches.json since that sounds cleaner (I will wait on this fix for further clarity on whether a separate module is needed)
#endif | ||
! | ||
!/ ------------------------------------------------------------------- / | ||
MODULE MPICOMM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a fairly lightweight routine, should we add this to an existing file such as constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the ramifications would be here, I'll consult with one of my collaborators who'd originally suggested having it as a separate file. I believe the intent was to make it more portable and obviously defined. Would w3parall.F90 be appropriate? I do see that IS_ESMF_COMPONENT
is in constants.F90
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can move things to another location if that's your preference. When you know, can you point me to where specifically so that I can push an update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
w3parall is mostly for unstructured grids, where as this is for all grids, so I think constants might be a better choice. Others can weigh in on this thought too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, and I'm also interested in others thoughts 🙂 I was originally looking at w3parall because the SYNCHRONIZE_GLOBAL_ARRAY has been useful in our coupling tests so far, although I do see now that the header for this file indicates it's "Parallel routines for implicit solver"
.github/workflows/regtest_gnu.yml
Outdated
@@ -102,10 +102,10 @@ jobs: | |||
export CC=mpicc | |||
export FC=mpif90 | |||
export OASISDIR=${GITHUB_WORKSPACE}/work_oasis3-mct | |||
# mkdir build && cd build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these changes intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes were from attempting to test with debug flags on, but I should revert them (particularly as that seemed to not be a good way to build with debug)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree to add it into constants.ftn.
I approve this PR with the condition it must pass all the regtests done by Matt
@mickaelaccensi it looks like there are still many unresolved conversations above. I'll be posting this shortly, but we will be taking a pause on this and other PR processing temporarily. This is a significant enough of a PR that we'll want to have all code managers test on their respective machines when we resume work on this. |
Hi @jmsexton03, thank you for your patience on this, and I apologize I haven't given a status update sooner. Our team has had some temporary changes that affect our ability to process PR's, so thanks for bearing with us while we worked behind the scenes to adjust. I just posted our group statement regarding this on the Commit Queue: Temporary PR Policy. We tentatively hope to resume mid to late November. This PR has some great advantages, we'll look forward to being able to look at it carefully when we pick back up our processing in November. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmsexton03 -Thank you for your patience with the delayed follow up. Let's move forward with moving your module to constants and then we can start re-reviewing from there.
Pull Request Summary
Description
Summary of changes:
These changes extend some of the OASIS related changes by using a separate communicator that is specifically WW3-related for all routines. This makes communication safer by avoiding conflicting tags, and makes collective communications more maintainable.
Issue(s) addressed
This PR is associated with issue #1261
This PR addresses changes to make the use of MPI more general to allow for easier coupling.
Commit Message
MPI communicator generalizations
Check list
Testing