Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix libc_nonshared.a #17702

Merged
merged 10 commits into from
Jan 5, 2024
Merged

Fix libc_nonshared.a #17702

merged 10 commits into from
Jan 5, 2024

Conversation

rootbeer
Copy link
Contributor

Build on #16970 with additional CLs for "weak_hidden_alias" and re-enabling disabled tests.

@rootbeer
Copy link
Contributor Author

CI tests are passing. The two CI failures seem to be native-native-musl failures:

run test std-x86_64-linux-none-Debug-selfhosted-no-lld: error: Illegal instruction at address 0x81febbe
run test std-x86_64-linux-none-Debug-selfhosted-no-lld: error: while executing test 'test.Condition - signal', the following command terminated with signal 4 (expected exited with code 0):

(I'm not clear if there are any subsequent test cases that were skipped because of this failure, or if its isolated to a single test or test file.)

This failure seem unlikely to be due to the changes here because of the "-musl" target .... but they don't look familiar either. I can rebase or wait for other fixes. (Or file a new bug.)

Running this stack locally, I can get my "fstat" repro case to pass with a variety of glibc versions (2.19, 2.25, 2.38, etc). That is an improvement, too.

Oddly, really old glibc versions don't compile (e.g. -target native-native-gnu.2.10) --- lots of missing symbols. I haven't tried them before, so I don't think that's related to this CL. (I'm not sure what the oldest supported glibc should be ... v2.16? v2.0.0?)

@squeek502
Copy link
Collaborator

The CI failures are intermittent failures affecting current master as well, so likely unrelated to this PR: https://github.com/ziglang/zig/commits/master

@rootbeer rootbeer marked this pull request as ready for review October 25, 2023 04:14
@rootbeer rootbeer force-pushed the correct-libc-nonshared branch from 67c7237 to c565672 Compare October 25, 2023 04:15
@rootbeer
Copy link
Contributor Author

I think this is good to go then. I'll see if I can add some tests of some sort that cover different glibc versions.

@rootbeer
Copy link
Contributor Author

(I don't mean to steal @lifthrasiir's PRs, which are the bulk of this stack, so I'm happy to donate my patches to his stack if that keeps the attributions correct.)

@lifthrasiir
Copy link
Contributor

lifthrasiir commented Oct 25, 2023

(I don't mean to steal @lifthrasiir's PRs, which are the bulk of this stack, so I'm happy to donate my patches to his stack if that keeps the attributions correct.)

No problem! It seems that my PR is drifted away (again) from the master anyway, so I'll pull this PR after CI passes. Andy closed the original PR in favor of this :-)

@squeek502
Copy link
Collaborator

squeek502 commented Oct 25, 2023

Seems like it'd probably be good to add a standalone test for this. My initial thought was something very simple to test for #17034 like:

const std = @import("std");

test {
    try std.testing.expectError(error.FileNotFound, std.fs.cwd().statFile("test_file_that_doesnt_exist"));
}

with different glibc_versions in build.zig. Unsure how the host's glibc version would affect the usefulness of the test.

Unsure what a test for #16152 would look like.

@rootbeer
Copy link
Contributor Author

I found test/link/glibc_compat/build.zig, which I expanded with some more tests. I compile some simple code (calls fstat, reallocarray, getauxval, and atexit) against a bunch of different glibc versions and makes sure it runs. I was going to submit it as a separate follow-on PR, but I can add it here if you'd like. See dfde1ae.

@rootbeer
Copy link
Contributor Author

Oh, and my suggested tests are just covering the "positive" cases of glibc symbols being linked correctly. I don't test things like older versions not having newer symbols (e.g., reallocarray should not be found in earlier versions). I could probably figure out how to make a test for this work, if there are some existing tests of linker errors I can follow as an example?

And, I don't test that symbols are left unresolved in the resulting binary (e.g., testing that Zig does not erroneously link in static implementations). I could probably hack a shell script together using objdump or readelf, but I'm not sure that's portable enough to be useful?

@andrewrk
Copy link
Member

I could probably hack a shell script together using objdump or readelf, but I'm not sure that's portable enough to be useful?

@kubkon might have some build.zig API tips to help with checking objects. For example check out some of the linker tests in test/link/*

@rootbeer rootbeer force-pushed the correct-libc-nonshared branch from c565672 to 3550e35 Compare October 28, 2023 05:31
@rootbeer
Copy link
Contributor Author

I've rebased to today's changes, and I've added two commits for a glibc test case.

One commit is the basic test case I linked before. It should help ensure glibc exposes symbols at the correct version during compilation. The test is executed too, to see that things work at run-time.

I think I figured out how to make the Elf "checkObject()" tests work against the glibc test case, too. I'm not sure I'm holding the API correctly, but the tests do seem to pass. And they do seem to fail when I break things. (I couldn't get it to check that a "FUNC LOCAL HIDDEN" symbol showed up correctly, see the commented out checkExacts.)

@rootbeer rootbeer force-pushed the correct-libc-nonshared branch 3 times, most recently from 8b10733 to 17fe336 Compare October 28, 2023 19:20
@rootbeer
Copy link
Contributor Author

Heh, tests fail when running executables linked against the newer glibc versions, as the host can't execute them. Will fix that.

@rootbeer rootbeer force-pushed the correct-libc-nonshared branch 2 times, most recently from beafee3 to 313d1cc Compare October 30, 2023 18:39
@rootbeer
Copy link
Contributor Author

This stack of commits could use some reviews, I think. The underlying commits (from @lifthrasiir) to fix libc_nonshared are the bulk of the change, and they're pretty solid and unchanged by me. In addition to fixing the "fstatat errno problem", this stack removes support for linking glibc versions before v2.17, generally (#17769). And adds some tests that build against different versions of glibc, and check that dynamic symbols are present as expected.

The tests I added to test/link/glibc_compat seem to work, but I'm not sure I'm using the exe.checkObject() APIs to check the work of the linker as well I could.

I added a README.md to lib/libc/glibc/. Anyone could review that (especially if you don't know much about Zig's support for glibc).

The commits that add "bounds checking" to glibc.zig, target.zig, and print_targets.zig seem to work, but I'm not sure if I should use a different approach (e.g., move more of the checks into target.zig), or if the architecture-specific bounds checks are worth the effort.

exe.linkLibC();

// Only try running the test if the host glibc is known to be good enough
if (minor_ver <= running_glibc_minor) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this could be incorporated in our runner much like foreign test runner. What do you think @andrewrk ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems complicated to determine the runtime linker's support if there is an emulator in play? (I guess the test infra could probe for that?) And I also think these does-runtime-support-newer-glibc checks are probably only needed in this test? Other tests should just be using the "default" glibc if they're explicitly linking?

That said, I'm willing to try merging this check into the test infra if folks think that is worthwhile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the SemanticVersion methods for comparison please. Also, I think you should be able to remove this condition since you have skip_foreign_checks = true which is defined to skip the check if the host cannot execute the binary. If the build system needs an improvement to make this work correctly with respect to glibc versions, I'm happy to help with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will work on making the skip_foreign_checks = true check for the glibc supported versions. Without this hack here, the CI machines would build binaries that failed to execute. I can do this as a follow-on to this PR, or will update this PR with the changes if you want to wait for that.

(And I've switched to SemanticVersion.order for the version checking.)

Copy link
Member

@kubkon kubkon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests look good! I just have a few nits I would like you to try.

test/link/glibc_compat/build.zig Outdated Show resolved Hide resolved
test/link/glibc_compat/build.zig Outdated Show resolved Hide resolved
test/link/glibc_compat/glibc_runtime_check.zig Outdated Show resolved Hide resolved
@rootbeer rootbeer force-pushed the correct-libc-nonshared branch from 313d1cc to 1f2f1c8 Compare October 31, 2023 02:46
@rootbeer
Copy link
Contributor Author

Rebased and updated with fixes to the test case. (I marked the review feedback that I completed as "resolved" in the UI here, hope that's the right protocol.)

@rootbeer rootbeer force-pushed the correct-libc-nonshared branch 2 times, most recently from 3482682 to 4a73e20 Compare November 8, 2023 23:43
@rootbeer rootbeer force-pushed the correct-libc-nonshared branch from 4a73e20 to 112d326 Compare November 15, 2023 04:39
@rootbeer
Copy link
Contributor Author

I believe this PR is ready for more review feedback or for merging. (The open review discussion item is asking about merging the runtime-glibc-version-check into the core test runner or not.)

@rootbeer
Copy link
Contributor Author

rootbeer commented Nov 29, 2023

I've updated this PR to fix a bug in my glibc bounds checking (to use target.abi.isGnu()).

Also, I discovered that strlcpy and strlcat were new in glibc 2.38, but Zig's glibc header files were not reflecting that. Added CLs to fix that and test them. (These are great for testing the "skip_foreign_checks" path because my locally installed v2.36 glibc doesn't have these symbols.)

I spent a bit of time investigating how to extend Run.zig to gracefully hide/handle the case where a test fails because the host's glibc version isn't sufficient for the generated binary. These dynamic linker failures are fundamentally different from the existing skip_foreign_checks handling in Run.zig because the binary actually executes (the dynamic linker returns 1 and prints a message to stderr) where all the existing failures hidden by skip_foreign_checks are from errors during exec (the test binary doesn't even start from the parent's point of view).

Decoding the dynamic linker failures after the fact seems sketchy. I could add preventative code to runCommand that preemptively skips these tests if they look like they'll fail, but it seems simpler to keep this check in the glibc_compat/build.zig which is presumably where all the glibc multi-version tests will live anyway?

@rootbeer rootbeer force-pushed the correct-libc-nonshared branch 2 times, most recently from 6b3d80c to dd18830 Compare December 4, 2023 18:54
@rootbeer rootbeer force-pushed the correct-libc-nonshared branch 2 times, most recently from db8487a to d23b2bf Compare December 14, 2023 07:04
@rootbeer rootbeer force-pushed the correct-libc-nonshared branch from d23b2bf to fcd78ba Compare December 26, 2023 18:05
@rootbeer rootbeer force-pushed the correct-libc-nonshared branch from fcd78ba to 8ca5d8a Compare January 3, 2024 04:04
lifthrasiir and others added 10 commits January 4, 2024 17:12
The scope of libc_nonshared.a was greatly changed in glibc 2.33 and
2.34, but only the change from 2.34 was reflected so far. Glibc 2.33
finally switched to versioned symbols for stat functions, meaning that
libc_nonshared.a no longer contains them since 2.33. Relevant files were
therefore reverted to 2.32 versions and renamed accordingly.

This commit also removes errno.c, which was probably added to
libc_nonshared.a based on a wrong assumption that glibc/include/errno.h
requires glibc/csu/errno.c. In reality errno.h should refer to
__libc_errno (not to be confused with the public __errno_location),
which should be imported from libc.so. The inclusion of errno.c resulted
in wrong compile options as well; this commit fixes them as well.

Fixes ziglang#16152
The fstat,lstat,stat,mknod stubs used to build older (before v2.33)
glibc versions depend on the weak_hidden_alias macro.  It was removed
from the glibc libc-symbols header, so patch it back in for the older
builds.
glibc variants now support the stat-family of calls correctly, so
this test is safe to include.
Compile, link and run a test case against various glibc versions.
Exercise symbols that have been probelmatic in the past.
Add a README with an overview of how Zig's glibc support is implemented.
At a minimum required glibc is v2.17, as earlier versions do not define
some symbols (e.g., getauxval()) used by the Zig standard library.

Additionally, glibc only supports some architectures at more recent
versions (e.g., riscv64 support came in glibc v2.27).  So add a
`glibc_min` field to `available_libcs` for architectures with stronger
version requirements.

Extend the existing `canBuildLibC` function to check the target against
the Zig minimum, and the architecture/os minimum.

Also filter the list shown by `zig targets`, too:

  $ zig targets | jq -c '.glibc'
  ["2.17.0","2.18.0","2.19.0","2.20.0","2.21.0","2.22.0","2.23.0","2.24.0","2.25.0","2.26.0","2.27.0","2.28.0","2.29.0","2.30.0","2.31.0","2.32.0","2.33.0","2.34.0","2.35.0","2.36.0","2.37.0","2.38.0"]

Fixes ziglang#17034
Fixes ziglang#17769
So only expose these in generic-glibc/string.h if Zig is building
a v2.38 (or later) glibc stub.

Announcement of 2.38 that notes strlcpy and strlcat:
https://lists.gnu.org/archive/html/info-gnu/2023-07/msg00010.html
The strlcpy symbol was added in v2.38, so this is a handy symbol for
creating binaries that won't run on relatively modern systems (e.g., mine,
that has glibc 2.36 installed).
* fix typos and redundancies in docs
* use Target.isGnuLibc
@andrewrk andrewrk force-pushed the correct-libc-nonshared branch from 8ca5d8a to 362460e Compare January 5, 2024 00:28
@andrewrk andrewrk merged commit 4583667 into ziglang:master Jan 5, 2024
0 of 10 checks passed
@andrewrk
Copy link
Member

andrewrk commented Jan 5, 2024

Fantastic work, @rootbeer. Thank you also for keeping this up to date - you did quite a few rebases between when you opened it and when it was merged.

ghost pushed a commit to uber/hermetic_cc_toolchain that referenced this pull request Jan 5, 2024
Notable changes:
- bump glibc support from 2.34 to 2.38
- fix integration with some glibc symbols:
  ziglang/zig#17702
- mitigat `error: AccessDenied`, which, according to Noah
  Santschi-Cooney, also mitigates/fixes the dreaded `error:
  FileNotFound` which we sometimes see when setting up Bazel worksace.
  Relevant commit in Zig:
  ziglang/zig@6b27096

Fixes #133
ghost pushed a commit to uber/hermetic_cc_toolchain that referenced this pull request Jan 8, 2024
Notable changes:
- bump glibc support from 2.34 to 2.38
- fix integration with some glibc symbols:
  ziglang/zig#17702
- mitigat `error: AccessDenied`, which, according to Noah
  Santschi-Cooney, also mitigates/fixes the dreaded `error:
  FileNotFound` which we sometimes see when setting up Bazel worksace.
  Relevant commit in Zig:
  ziglang/zig@6b27096

Fixes #133
ghost pushed a commit to uber/hermetic_cc_toolchain that referenced this pull request Jan 8, 2024
Notable changes:
- bump glibc support from 2.34 to 2.38
- fix integration with some glibc symbols:
  ziglang/zig#17702
- mitigat `error: AccessDenied`, which, according to Noah
  Santschi-Cooney, also mitigates/fixes the dreaded `error:
  FileNotFound` which we sometimes see when setting up Bazel worksace.
  Relevant commit in Zig:
  ziglang/zig@6b27096

Fixes #133
motiejus pushed a commit to uber/hermetic_cc_toolchain that referenced this pull request Jan 8, 2024
Notable changes:
- bump glibc support from 2.34 to 2.38
- fix integration with some glibc symbols:
  ziglang/zig#17702
- mitigat `error: AccessDenied`, which, according to Noah
  Santschi-Cooney, also mitigates/fixes the dreaded `error:
  FileNotFound` which we sometimes see when setting up Bazel worksace.
  Relevant commit in Zig:
  ziglang/zig@6b27096

Fixes #133
@rootbeer rootbeer deleted the correct-libc-nonshared branch April 29, 2024 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants