-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spurious build error "missing preprocessor_successor.h" #242
Comments
So this is what I found out so far: So the file ( but, all of hmm |
IIRC this is not exclusive to riotdocker, but just only happens when building on GitHub infrastructure (of which riotdocker is one major user as regular CI is on murdock); could be wrong though, failed to find the other build failures so far. |
Agreed. So, what's special about github workers? What comes to mind is that they're comparatively slow, and that they build from a clean slate every time... |
I still hope to find a "rational" explanation, but last time I was looking into that I had the suspicion that they might be running on some file system that isn't as strict about operations sequencing as Make would need it. Right now my next attempt would be to make riot- |
hm, TBH, github workers being used as CI for half the world, I wouldn't count on the FS being incompatible with make. |
What I find odd is that a |
Attempting things at #243 -- it adds up to 20s of sleep (finally failing) before accepting that preprocessor_successor.h is really not there. |
Having a reproduced setup would be awesome. I've been burning through 2x20 minutes of CI time because the Docker image gets rebuilt every time even though the rules have not changed (and I'd need to try out a few things more to see why the upgrade to the debug branch doesn't take). |
... but I can't catch it locally :( |
Waaah, all the time I've been testing and 01750a5e [edit: link fixed; branch is https://github.com/RIOT-OS/rust-riot-sys/tree/debug-preprocessor_successor-race] was not committed yet -- still, the last test wen through. I don't see the old runs, but this must have been the first in ~8 builds that went through. Actually pushed the race resolving sleep loop now, retrying ... if it passes on first try, it's either externally stateful (load of ... something?) or really fixed. |
That did pass on first try. Re-running #241 now. If that does pass, I'll assume there is some time-of-day-ish influence and let it be for now until the problem resurfaces. If that does not pass, I'll give this here a re-run (possibly with some extra debug output), and consider it preliminarily successful (in which case the next step is cleaning up the race workaround so that it is less brittle). |
The #241 rerun did fail, so let's try with a more verbose #243... [edit: keeping a tally] So far counted 3 successes on 243 and 2 failures on 241 up to 19:54CET. [edit] I'm a bit discouraged in the approach because I don't see the "preprocessor_successor.h not found,sleeping suspecting a race condition" warnings, but those may be silenced because it's not a local package. [edit] Did more tests: Indeed build system warnings from git-included modules are only shown when the whole thing fails. |
So far, this looks good. I'm running one more test with diff --git a/build.rs b/build.rs
index 05c216f..eead5d6 100644
--- a/build.rs
+++ b/build.rs
@@ -185,6 +185,32 @@ fn main() {
})
.collect();
+ // Debug helper for https://github.com/RIOT-OS/riotdocker/issues/242
+ let preprocessor_include = cflags
+ .iter()
+ .filter_map(|f| f.strip_prefix("-I"))
+ .filter(|f| f.ends_with("/preprocessor"))
+ .next();
+ if let Some(preprocessor_include) = preprocessor_include {
+ let filename = format!("{}/preprocessor_successor.h", preprocessor_include);
+ let mut countdown: usize = 10;
+ loop {
+ if std::path::PathBuf::from(&filename).exists() {
+ if countdown != 10 {
+ panic!("File reappeared. I would have caught a race condition, but chose not to for visibility");
+ }
+ break;
+ }
+ countdown = countdown.checked_sub(1).unwrap_or_else(|| {
+ panic!("Preprocessor include path {} was present but preprocessor_successor.h did not show up there after 20 seconds", filename);
+ });
+
+ // See also https://github.com/RIOT-OS/riotdocker/issues/242
+ println!("cargo:warning=preprocessor_successor.h not found,sleeping suspecting a race condition");
+ std::thread::sleep(std::time::Duration::from_secs(2));
+ }
+ }
+
let bindings = builder()
.header("riot-bindgen.h")
.clang_args(&cflags) which, while failing, will clearly state that there was a race condition. If that happens, I'll remove the panic again, and leave this workaround in until we manage to reproduce this on a system that we can debug. [edit: Great, now this passed, ie. the race condition did not occur. Retrying the build hoping for a failure -- both "yes I avoided the race condition" and "no race condition, the file never showed" would be valuable, but passing at this point is just annoying...] |
uff... I tried reproducing this locally, did probably 500 "buildtest" runs, trying to inch closer to how the actions start the build. Tried "make --shuffle". "make -j". "make -j1". delaying both creating the preprocessor.h and using it, surrounding it with printouts. ... It didn't fail a single time. (I'm open to the "file system buggy" idea now. 😄 ) |
Hah, found it! It's about the sequence of files in a directory listing. This was fixed in RIOT-OS/rust-riot-sys#36, which comes back to haunt us because riotdocker builds with the last stable RIOT version (currently 2023.10-branch). Shall we backport the riot-sys update, or just bump the branch we're using to test the images, given the 2024.01 release is imminent? |
I would say, bump branch! |
So far these have been "increment the branch and declare that we're building the next tag of CI systems". Can we do that already, or do we first set the RIOT branch to 2024.01 but still keep building for the 2024.01 release? |
Wow, I'm confused about this every time. 😏 |
I've opened a PR that bumps the branch: #246 |
The text was updated successfully, but these errors were encountered: