
Building on HPC cluster of CNAF-INFN #1029

Open
mtrocadomoreira opened this issue Oct 26, 2023 · 3 comments

@mtrocadomoreira

Hello!

I would like to build HiPACE on this cluster. Before I start putting together the profile.hipace configuration, I just wanted to ask whether there are any obvious hardware-dependent flags I should be aware of.

The available GPUs in this cluster are NVIDIA K20, K40, K1 and V100 (see full cluster hardware specs here).

Thanks!

@SeverinDiederichs
Member

Hi Mariana,

thank you for reaching out. I would strongly recommend using the V100s, they are the most modern on the list. Do you know if they have 16 GB or 32 GB? That's not clear from the website.

To optimize for V100s, please add

```shell
export AMREX_CUDA_ARCH=7.0  # use 8.0 for A100 or 7.0 for V100
```

to your profile.hipace. Alternatively, you can pass the flag `-DAMREX_CUDA_ARCH=7.0` directly to cmake during configuration.
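For reference, a full configure-and-build invocation with that flag could look like the following sketch; the source/build paths and the `-DHiPACE_COMPUTE=CUDA` backend option are assumptions based on a typical HiPACE++ CMake build, so adjust them to your checkout:

```shell
# configure HiPACE++ for NVIDIA GPUs, targeting the V100 (Volta, sm_70);
# "hipace" is a placeholder for the source checkout directory
cmake -S hipace -B hipace/build \
      -DHiPACE_COMPUTE=CUDA \
      -DAMREX_CUDA_ARCH=7.0

# compile in parallel on 8 cores
cmake --build hipace/build -j 8
```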

Please let us know if everything works, we could add the cluster to the documentation.

@mtrocadomoreira
Author

Thanks for the speedy reply!

Wow, it actually compiled successfully with a very simple configuration file, almost on the first try 🥹 Thank you to everyone who contributed to making this so easy to compile!

Here's what I used for the profile.hipace.cnaf-infn file:

```shell
module load compilers/cmake-3.27.7
module load compilers/gcc-12.3_sl7
module load compilers/cuda-9.1
module load compilers/openmpi-4-1-5_gcc12.3

export AMREX_CUDA_ARCH=7.0

export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
```

Let me come back to this thread once I have run a job successfully, to confirm that everything is flowing smoothly.

> Do you know if they have 16 GB or 32 GB?

No, I don't... Do you think this could pose a limitation for larger simulations?
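One way to check, assuming you can run commands on one of the GPU nodes, is to query the NVIDIA driver directly:

```shell
# prints the model and total memory of each visible GPU in CSV form,
# which distinguishes the 16 GB and 32 GB V100 variants
nvidia-smi --query-gpu=name,memory.total --format=csv
```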

@SeverinDiederichs
Member

Happy to hear that the compilation worked!

Is there a newer version of CUDA available on the cluster? Using a new compiler (GCC 12.3) with a very old CUDA version (9.1) will probably not work. It would be great if there were something like CUDA 11.8.

If it is not available, I'd suggest asking the cluster admins whether they can install it.

> No, I don't... Do you think this could pose a limitation for larger simulations?

16 GB could be a bit low for challenging AWAKE simulations; 32 GB would be fine. You can certainly do quite a lot with 16 GB already, but for what I assume you'd like to do, 32 GB would be better.

If a simulation runs successfully on the GPU, it will report approximately how much memory it used.
