Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update OS to Ubuntu 24.04 and ROOT to 6.34.02 #103

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

tvami
Copy link
Member

@tvami tvami commented Sep 24, 2024

I am adding a new package to the container, here are the details.

Resolves #102

What new packages does this PR add to the development container?

  • I included ROOT via an image officially from ROOT
  • That image already has 24.04 Ubuntu
  • I removed Pythia6 and GENIE, given the plans to move to Pythia8 and a new version of GENIE

Check List

  • I successfully built the container using docker
docker build -t  my-ldmx-sw-new .

works nicely.

  • I was able to build ldmx-sw using this new container build
just init
just use my-ldmx-sw-new:latest
just compile 

This fails, but with the problem already known in LDMX-Software/ldmx-sw#1470
Accept that now even with the minimal compiler settings it fails.
--> So I'm blocked here

  • I was able to test run a small simulation and reconstruction inside this container
# outline of test instructions
cd $LDMX_BASE/ldmx-sw/build
ldmx ctest
cd ..
for c in `ls ldmx-sw/*/test/*.py`; ldmx fire $c; done
  • I was able to successfully use the new packages. Explain what you did to test them below:

@tvami
Copy link
Member Author

tvami commented Sep 26, 2024

Now that the TS stuff is fixed, I'm looking at this again. I see in the log that

#29 ERROR: failed to push ldmx/dev: Canceled: grpc: the client connection is closing
------
 > exporting to image:
------
ERROR: failed to solve: Canceled: failed to push ldmx/dev: Canceled: grpc: the client connection is closing

I dont know what grpc msg is supposed to mean. @tomeichlersmith ?

@tomeichlersmith
Copy link
Member

I have never seen this before either. It may be caused by our custom runner at UMN shutting down? I'm not sure, you can try to see if it works after trying again.

@tvami tvami marked this pull request as ready for review September 26, 2024 21:27
@tvami
Copy link
Member Author

tvami commented Sep 26, 2024

Ah I guess so
"The self-hosted runner: runner-0 lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error."
from here
https://github.com/LDMX-Software/docker/actions/runs/11019831593

Anyway, I'll re-try again

@tvami
Copy link
Member Author

tvami commented Sep 27, 2024

Screenshot 2024-09-26 at 17 07 13

cool the build does seem to work.
As for testing with older version of ldmx-sw, I didnt really think about that. If it works, all good, but if it doesnt, can we somehow say that some older version should be used with some older version of the docker image too?

@tomeichlersmith
Copy link
Member

tomeichlersmith commented Sep 27, 2024

There is precedence for not supporting older ldmx-sw versions with newer development container images; however, this is not enforced anywhere. Instead it is simply documented online: https://ldmx-software.github.io/developing/compatibility.html

We could do something similar here. Cut a new major version of the development image and point out that you need to use a newer ldmx-sw with the newer development image.

One thing I do want to point out though is that the test of trunk is failing as well, so it may be an issue with the CI or image itself and not with the older ldmx-sw versions.

Edit: nevermind, it looks like ldmx-sw just needs some patches after we update the compiler. This makes sense and has been required before. Similar to the linked documentation online, I'd like there to be some detail on why older ldmx-sw versions can't be compiled just in case someone wants to go backward.

@tomeichlersmith
Copy link
Member

I thought about this a bit over the weekend and I have two more thoughts.

I would really like to keep the CI testing back to v4.0 at least, so I would request that you focus on figuring out the necessary patch for v4.0.0 (the easiest case is that its the same patch necessary to get trunk working in which case we just need to automate it).

I don't like how I originally implemented by backport solution (basically just using sed to change the code https://github.com/LDMX-Software/docker/blob/main/.github/interop/force-legacy-onnx.sh). I think a patch file is more clear to potential readers, potentially more stable, and easier to implement (just go to that tag, hack until it compiles and runs, and then git diff > version.patch). The main potential issue with patch files is that they could interact oddly with the git submodules we used to have, but I think that can be easily avoided by simply using diff and patch rather than git diff and git apply since the non-git commands just use filepaths which have stayed mostly consistent. I don't expect this to be implemented here unless you find switching to this patch-file solution is easier than developing another sed script to patch previous ldmx-sw versions.

@tomeichlersmith
Copy link
Member

tomeichlersmith commented Jan 28, 2025

Unfortunately, we might need to drop back to building ROOT ourselves :/

We need C++20 for acts but ROOT's build of 6.32.04 uses C++17 which leads to a pile of compilation warnings. For example:

/opt/root/include/RConfigure.h:30:4: warning: #warning "The C++ standard in this build does not match ROOT configuration (201703L); this might cause unexpected issues" [-Wcpp]
   30 | #  warning "The C++ standard in this build does not match ROOT configuration (201703L); this might cause unexpected issues"
      |    ^~~~~~~

@tomeichlersmith
Copy link
Member

I got to the point where ROOT was successfully being configured so I'm pushing my work now. Hopefully it builds successfully.

@tomeichlersmith
Copy link
Member

I'm getting close to putting this onto main but not making a new release while we test and update ldmx-sw. This would then mean (for a short time), a the OS update would be available under ldmx/dev:edge but not ldmx/dev:latest (I think).

ldmx-sw Updates

First pass, building without additional warnings/checks enabled (just configure-quick).

Necessary

In order to compile, we need only a few more includes.

diff --git a/Conditions/include/Conditions/SimpleTableCondition.h b/Conditions/include/Conditions/SimpleTableCondition.h
index d34d930d..b69f81c7 100644
--- a/Conditions/include/Conditions/SimpleTableCondition.h
+++ b/Conditions/include/Conditions/SimpleTableCondition.h
@@ -8,6 +8,7 @@
 
 #include <ostream>
 #include <vector>
+#include <cstdint>
 
 #include "Framework/ConditionsObject.h"
 #include "Framework/Exception/Exception.h"
diff --git a/TrigScint/src/TrigScint/Firmware/clusterproducer_sw.cxx b/TrigScint/src/TrigScint/Firmware/clusterproducer_sw.cxx
index 140011d9..e3129f1c 100644
--- a/TrigScint/src/TrigScint/Firmware/clusterproducer_sw.cxx
+++ b/TrigScint/src/TrigScint/Firmware/clusterproducer_sw.cxx
@@ -1,5 +1,5 @@
 #include <stdio.h>
-
+#include <array>
 #include <iostream>
 
 #include "TrigScint/Firmware/clusterproducer.h"

Warnings

Remember, these are the ones that are present without additional warnings enabled.
The full build log is available here (read with less -R since I dumped the colors too).

  • std::iterator is deprecated, need to find replacement for HgcrocDigiCollection
  • update configure Python for the specific python version included in Ubuntu 24.04
  • overflow potential in recon::PFTruthProducer when using ldmx::SimParticle::getParents

Additional Warnings

Just copied from prior notes, not sure if still relevant.

  • lots of useless-cast and format-nonliteral warnings from ROOT headers, may go away when we updated standard
  • null-dereference in EventFile::importRunHeaders
  • fabs does nothing on unsigned bool, probably actually a bug in TS Tracking! Full warning message below. I'm guessing its a misplaced ):
/home/tom/code/ldmx/ldmx-sw/TrigScint/src/TrigScint/TrigScintTrackProducer.cxx:333:13: warning: taking the absolute value of unsigned type 'bool' has no effect [clang-diagnostic-absolute-value]
  333 |         if (fabs(track.getCentroid() - nextTrack.getCentroid() <
      |             ^
/home/tom/code/ldmx/ldmx-sw/TrigScint/src/TrigScint/TrigScintTrackProducer.cxx:333:13: note: remove the call to 'fabs' since unsigned values cannot be negative
  333 |         if (fabs(track.getCentroid() - nextTrack.getCentroid() <
      |             ^~~~
/home/tom/code/ldmx/ldmx-sw/TrigScint/src/TrigScint/TrigScintTrackProducer.cxx:333:13: warning: implicit conversion turns floating-point number into bool: 'double' to 'bool' [clang-diagnostic-implicit-conversion-floating-point-to-bool]
  333 |         if (fabs(track.getCentroid() - nextTrack.getCentroid() <
      |         ~~  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  334 |                  3 * maxDelta_)) {
  • moving a temporary value in Process::getDummy()
  • uintptr_t instead of std::uintptr_t in SimCore/Factory

not sure if this will work, but I'm hoping it does
@tomeichlersmith tomeichlersmith linked an issue Jan 29, 2025 that may be closed by this pull request
@tomeichlersmith
Copy link
Member

The current build is failing because of a typo I made, but after I fix that typo and attempt to build locally, it still fails because GENIE requires TPythia6.

RandomGen.cxx:14:10: fatal error: TPythia6.h: No such file or directory
   14 | #include <TPythia6.h>
      |          ^~~~~~~~~~~~
compilation terminated.
make[1]: *** [/usr/local/src/GENIE/Generator/src/make/Make.include:609: RandomGen.o] Error 1
make[1]: Leaving directory '/usr/local/src/GENIE/Generator/src/Framework/Numerical'
make: *** [Makefile:74: framework] Error 2

Given the name of the header and @wesketchum comments about requiring ROOT to be built with Pythia6 wrappers, I think this means that this version of GENIE requires not only Pythia6 but a ROOT-wrapped Pythia6. This would mean we would be limited to ROOT 6.30 or earlier1 which I do not want to consider as an option since I want to have a ROOT version that has a standardized RNTuple which I am now realizing is ROOT 6.34.002 so I need to update it again.

@wesketchum What do you want to do here? It looks like Pythia6 is still used within the latest GENIE3 but it does look like it is guarded by enable/disable checks. From my perspective there are a few options.

  1. Drop GENIE from the mainline image. You can still build your own images and use our image as a reference for making sure that your image can build ldmx-sw as well as GENIE.
  2. Disable Pythia6 from GENIE which then requires Pythia8. Not sure how well tested this is or if it changes the physics that you are simulating.
  3. Do number (1) but then immediately work on number (2) so that there are any (or only a few) releases of ldmx/dev without GENIE in them. I could see Addition of HEPMC3 and GENIE Reweight to docker image #105 evolving into step (2) since you already have Pythia8 and an upgraded GENIE built there.

Footnotes

  1. I get this from the release notes of 6.30 where the pythia6 build module is deprecated and they state that it will be removed in 6.32. https://root.cern/doc/v630/release-notes.html#deprecated-and-removed-root-modules

  2. https://root.cern/releases/release-63400/

  3. Searching for TPythia6 in the GENIE source: https://github.com/search?q=repo%3AGENIE-MC%2FGenerator%20TPythia6&type=code

drop GENIE and its dep LHAPDF (for now), trying to get a build on
DockerHub so its easier to test run the new OS and updated ROOT on other
computers
@tomeichlersmith tomeichlersmith linked an issue Jan 30, 2025 that may be closed by this pull request
@tvami tvami changed the title Update OS to Ubuntu 24.04 and ROOT to 6.32.04 Update OS to Ubuntu 24.04 and ROOT to 6.34.02 Jan 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Jupyter interaction with ROOT Update OS to Ubuntu 24.04 Boost package does not contain Boost JSON
2 participants