FYI, unable to build RDASApp in a computing node #165

Open
guoqing-noaa opened this issue Sep 11, 2024 · 6 comments

Comments

@guoqing-noaa
Collaborator

Some platforms strongly prefer that builds run on computing nodes and allow only very small build jobs on the login nodes.
A computing node also lets many more processors compile concurrently, which speeds up the build.

This issue just documents an attempt to build RDASApp on a computing node.
On many platforms, a computing node differs from the login nodes in that it cannot access the internet.

Previously, RDASApp downloaded a lot of data on the fly during the build process. We have made efforts to reduce that dependency, but apparently that is not enough.

This is to document the error message I got:

-- Unknown compiler: Intel
-- Configure MPAS for internal ESMF
[ 11%] Creating directories for 'mpas_data-populate'
[ 22%] Performing download step (git clone) for 'mpas_data-populate'
Cloning into 'mpas_data-src'...
fatal: unable to access 'https://github.com/MPAS-Dev/MPAS-Data.git/': Failed to connect to github.com port 443 after 130757 ms: Couldn't connect to server
Cloning into 'mpas_data-src'...
fatal: unable to access 'https://github.com/MPAS-Dev/MPAS-Data.git/': Failed to connect to github.com port 443 after 130735 ms: Couldn't connect to server
Cloning into 'mpas_data-src'...
fatal: unable to access 'https://github.com/MPAS-Dev/MPAS-Data.git/': Failed to connect to github.com port 443 after 130684 ms: Couldn't connect to server
-- Had to git clone more than once: 3 times.
CMake Error at mpas_data-subbuild/mpas_data-populate-prefix/tmp/mpas_data-populate-gitclone.cmake:39 (message):
  Failed to clone repository: 'https://github.com/MPAS-Dev/MPAS-Data.git'
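
The `mpas_data-subbuild` and `mpas_data-populate` names in the log look like CMake FetchContent, which can be pointed at a local checkout through the standard `FETCHCONTENT_SOURCE_DIR_<NAME>` cache variable. A minimal sketch of a possible workaround, assuming the dependency is declared as `mpas_data` and that the flag can be passed through to the CMake configure step (neither is confirmed in this thread; the helper name and the cache path are hypothetical):

```shell
#!/usr/bin/env bash
# Build the CMake flag that redirects a FetchContent dependency to a
# local checkout, so no git clone is attempted at configure time.
# Assumption: the dependency name is "mpas_data", inferred from the log.
fetchcontent_override_flag() {
  local name="$1" src_dir="$2" upper
  # FetchContent expects the dependency name in upper case.
  upper=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
  printf -- '-DFETCHCONTENT_SOURCE_DIR_%s=%s\n' "$upper" "$src_dir"
}

# On a login node (has internet), clone once into a hypothetical cache path:
#   git clone https://github.com/MPAS-Dev/MPAS-Data.git "$HOME/cache/MPAS-Data"
# On a computing node, add the printed flag to the CMake configure command.
fetchcontent_override_flag mpas_data "$HOME/cache/MPAS-Data"
```

Whether `build.sh` forwards extra CMake flags would need to be checked before relying on this.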

@ShunLiu-NOAA

Could you specify which platform this is for? We currently support Orion, Hera, and Jet only.

@guoqing-noaa
Collaborator Author

> Could you specify which platform this is for? We currently support Orion, Hera, and Jet only.

@ShunLiu-NOAA Thanks for the question. I forgot to mention that it is on Jet/Hera.
And this reminds me to test Orion/Hercules. They may have different policies for computing nodes.

@guoqing-noaa
Collaborator Author

guoqing-noaa commented Sep 11, 2024

Computing nodes on Orion/Hercules cannot access the internet either.

@TingLei-NOAA
Contributor

The whole build process could be done in three steps: 1) cloning (including submodules) and configuring; 2) data cloning/downloading; 3) a clean "make" step that needs no internet access.
Some extra reorganization may be needed to separate step 2 from step 3.
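
The split could look roughly like the sketch below: steps 1 and 2 run on a login node, and step 3 first checks that everything it needs is already on disk. The function name and the directory path are hypothetical; RDASApp's `build.sh` does not necessarily work this way.

```shell
#!/usr/bin/env bash
# Return 0 only if every listed directory exists and is non-empty,
# i.e. the online steps (1 and 2) have already populated it.
check_prefetched() {
  local dir missing=0
  for dir in "$@"; do
    if [ ! -d "$dir" ] || [ -z "$(ls -A "$dir")" ]; then
      echo "missing or empty: $dir" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical usage on a computing node, before the offline "make" step:
#   check_prefetched "$HOME/cache/MPAS-Data" && ./build.sh -j20
```

Failing early this way avoids waiting through repeated clone timeouts on a node without internet access.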

@guoqing-noaa
Collaborator Author

@TingLei-NOAA Thanks for the input!

@guoqing-noaa
Collaborator Author

guoqing-noaa commented Sep 13, 2024

build.sh -j6  on jet takes 54m23s to complete
build.sh -j10 on jet takes 42m7s to complete
build.sh -j16 on jet takes 29m19s to complete
build.sh -j20 on jet takes 23m52s to complete
build.sh -j30 on jet takes 21m16s to complete

So we should go ahead and remove every step that needs internet access from the build process, so that the build can run on a computing node, which will greatly reduce build time. This will be extremely helpful for Orion.

Timing stats when building mpasjedi only:

./build.sh -m MPAS -j6    #42m52s
./build.sh -m MPAS -j10   #29m53s
./build.sh -m MPAS -j16   #24m41s
./build.sh -m MPAS -j20   #21m54s
./build.sh -m MPAS -j30   #18m39s
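
To put the timings in perspective, the speedup relative to `-j6` can be computed with a few lines of shell. This is only arithmetic on the numbers reported above; the helper names are made up for illustration.

```shell
#!/usr/bin/env bash
# Convert a duration such as "54m23s" into seconds.
to_seconds() {
  local t="$1" m s
  m=${t%%m*}   # minutes: everything before the "m"
  s=${t#*m}    # remainder, e.g. "23s"
  s=${s%s}     # seconds without the trailing "s"
  echo $(( 10#$m * 60 + 10#$s ))   # 10# forces base 10 (avoids octal "08")
}

# Speedup of a run relative to a baseline, to two decimals.
speedup() {
  awk -v base="$(to_seconds "$1")" -v run="$(to_seconds "$2")" \
    'BEGIN { printf "%.2f\n", base / run }'
}

# Full build, -j6 versus -j30 (times from the list above)
speedup 54m23s 21m16s   # prints 2.56
```

So five times as many jobs gives roughly a 2.6x faster full build, and most of that gain is already reached by `-j20`.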
