Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revamp CI with dependency and Python caching for efficient installs #3112

Open
wants to merge 45 commits into
base: develop
Choose a base branch
from

Conversation

johvincau
Copy link
Contributor

Continuous Integration (CI)

This PR resolves #2140, revamping the CI suite by adding cache support for external dependencies as well as Python environments for faster install times, while also preparing for future expansions with better readability and other QOL tweaks.

(note: There was a time when installing each version of OpenMC could take several hours owing to inefficiencies in Github's management of its runners. While this is no longer the case, even without this PR, I think no one would miss the time we spend building and installing these dependencies when we can now cache and restore them very quickly.)

List of Changes

1. Caches for built NJOY2016, DAGMC, libmesh, vectfit, ncrystal dependencies

We now have the ability to cache most of the external dependencies, which can save significantly on install time since they take a while to build. For the bigger dependencies like DAGMC and vectfit, the install step is cut by more than half.

In terms of staying up-to-date, gha-caches.sh fetches the latest commit hashes for the required dependencies. If the cache we have on hand does not match the latest commit hash, we will rebuild the dependency from the upstream source and overwrite the old cache.

image

One point I'd like to ask about is whether we should retain the cmake and build directories for each of the installed dependencies in our caches. The CI seems to get along just fine without those two, and only the source + executable -- at least for the DAGMC case, the only one I've tried yet.

2. Job-specific Python environment caches

This addition caches a unique Python environment for each of the 12 configurations, which further saves us install time. This is a list of the caches produced by this.

image

We have the spare cache storage for all of these. If, in the future, we need the space, simply distinguishing between omp, mpi, and omp+mpi installs of Python 3.10 would also be acceptable.

3. Setup and matrix.json for job-name readability

I've added a "setup" job which reads from matrix.json all configurations of OpenMC to be tested. All 12 of the previous configurations have been added, along with a "job-name" variable which makes the CI screen more readable. This is what it now looks like:

image

While this file will have to be manually maintained, it is at least more straightforward to add new configurations for testing.

4. Standardize across gha-install scripts

Some gha-install scripts needed to have -DCMAKE_INSTALL_PREFIX specified for caching to work, and for the executables to be accessible by the OpenMC install process.

5. ENDF-B/VII.1 cross section cache rename

I've renamed the cache to provide clarity on which nuclear data files we use for continuous integration -- being mindful of the possibility that we may need other files like ENDF-B/VIII.x for new tests in the future.

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 15) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

The commit messages may have some inconsistencies. I've had to commit and push every time to test the CI, and I've attempted to clean up with interactive rebasing.

Copy link
Contributor

@paulromano paulromano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @johvincau for this PR! The caching is nice and the job names are so much better than what we have currently (and will really help when sifting through failed jobs). A few suggestions for you:


# NJOY variables
NJOY_REPO='https://github.com/njoy/NJOY2016'
NJOY_INSTALL_DIR=$HOME/NJOY2016
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to put all the installed dependencies in a single folder, e.g., $HOME/dependencies/NJOY2016. Also, in the case of NJOY (and probably the others), we should just be caching the installed version. Right now, the cache for NJOY is 1 GB when it should be much less.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've trimmed the NJOY repository by ~30% by removing its own NJOY-specific testing suite out of the cache; however, removing the build directory causes issues with the vectfit installation. Aside from NJOY and vectfit, all other of the dependencies at the very least remove their own build directories before caching.

I would like to move all the installed dependencies into a single folder as well; however, I will address that in a future PR since it would take a lot of testing and plenty of commits (and this PR already has enough of those as is!).

tools/ci/gha-install-mcpl.sh Outdated Show resolved Hide resolved
tools/ci/gha-install.sh Outdated Show resolved Hide resolved
tools/ci/gha-caches.sh Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
@johvincau johvincau requested a review from paulromano August 16, 2024 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Github Actions: Cache source-build executables and libraries to speed up testing
2 participants