Tests failing due to linker issues #23

Open
difcsi opened this issue Nov 6, 2024 · 6 comments

Comments

@difcsi

difcsi commented Nov 6, 2024

Tests such as generic/time are failing because of problems with eagerly bound, dynamically linked symbols, usually calloc.

Given the library is mature enough, the convoluted way of linking the tests may be unnecessary.

@stephenrkell
Owner

Thanks Zoltan. For my own memory's sake, I am going to elaborate a bit on the problem here, although I haven't fixed it yet.

In test/generic we see failures like the following:

$ ./time
./time: symbol lookup error: ./time: undefined symbol: calloc, version GLIBC_2.2.5

... even though:

$ objdump -T ./time | grep calloc

i.e. the command prints nothing: calloc is not a symbol that the binary links against.

My suspicion is that the dynamic linker internally depends on a calloc during its symbol binding pathway, and the message is reflecting that somehow. If so, perhaps the change (which broke these tests) was something like a switch from eager to lazy binding or vice-versa, changing when this pathway is needed and/or what it does.

@stephenrkell
Owner

Here is the backtrace from a failing request for calloc within the dynamic linker on my system (glibc 2.36).

#0  _dl_debug_vdprintf (fd=<optimized out>, tag_p=<optimized out>, tag_p@entry=1, 
    fmt=<optimized out>, fmt@entry=0x7ffff7ff360a "%s: error: %s: %s (%s)\n", 
    arg=arg@entry=0x7fffffffc7c8) at ../sysdeps/unix/sysv/linux/dl-writev.h:36
#1  0x00007ffff7fd7aea in _dl_debug_printf (
    fmt=fmt@entry=0x7ffff7ff360a "%s: error: %s: %s (%s)\n") at dl-printf.c:234
#2  0x00007ffff7fe308c in _dl_signal_cexception (errcode=0, exception=0x7fffffffc930, 
    occasion=<error reading variable: Cannot access memory at address 0xffffc8a8>)
    at dl-error-skeleton.c:136
#3  0x00007ffff7fd4b09 in _dl_lookup_symbol_x (
    undef_name=undef_name@entry=0x7ffff7ff3718 "calloc", 
    undef_map=undef_map@entry=0x7ffff7ffe300, ref=ref@entry=0x7fffffffc9a0, 
    symbol_scope=<optimized out>, version=version@entry=0x7fffffffc9d0, )
    at dl-lookup.c:797
#4  0x00007ffff7fe4102 in lookup_malloc_symbol (main_map=main_map@entry=0x7ffff7ffe300, 
    name=name@entry=0x7ffff7ff3718 "calloc", version=version@entry=0x7fffffffc9d0)
    at dl-minimal.c:64
#5  0x00007ffff7fe4219 in __rtld_malloc_init_real (main_map=main_map@entry=0x7ffff7ffe300)
    at dl-minimal.c:91
#6  0x00007ffff7fe95ff in dl_main (phdr=<optimized out>, phnum=<optimized out>, 
    user_entry=<optimized out>, auxv=<optimized out>) at rtld.c:2373
#7  0x00007ffff7fe50bf in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0x7fffffffcdd0, 
    dl_main=dl_main@entry=0x7ffff7fe6d20 <dl_main>)
    at ../sysdeps/unix/sysv/linux/dl-sysdep.c:140
#8  0x00007ffff7fe6a2e in _dl_start_final (
    arg=<error reading variable: Cannot access memory at address 0xffffcd38>) at rtld.c:496
#9  _dl_start (arg=<optimized out>) at rtld.c:583
#10 0x00007ffff7fe58e8 in _start () from /lib64/ld-linux-x86-64.so.2

@stephenrkell
Owner

And in dl-minimal.c we find:

85        struct r_found_version version;
86        version.name = symbol_version_string (libc, GLIBC_2_0);
87        version.hidden = 0;
88        version.hash = _dl_elf_hash (version.name);
89        version.filename = NULL;
90
91        void *new_calloc = lookup_malloc_symbol (main_map, "calloc", &version);
92        void *new_free = lookup_malloc_symbol (main_map, "free", &version);
93        void *new_malloc = lookup_malloc_symbol (main_map, "malloc", &version);
94        void *new_realloc = lookup_malloc_symbol (main_map, "realloc", &version);

i.e. the code assumes that calloc (and the others) are available, even though the binary I have does not link them in.

We could work around this by bundling a malloc implementation, like dlmalloc or even just a dummy no-free one, into our test cases. To me it seems like a bug that the ld.so is assuming it has a libc... when I get a moment I'll file this on the Bugzilla and we can see what they say.
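
As a rough illustration of the "dummy no-free" option, here is a minimal sketch of a bump allocator of the kind that could be bundled into a test case. It is not code from this repository. The `bump_` prefix, the arena size and the 16-byte alignment are all assumptions made for illustration; the prefix lets the sketch be compiled and exercised next to a normal libc. In an actual test case the four functions would presumably be named malloc, calloc, realloc and free, and would need to be visible in the executable's dynamic symbol table for the ld.so lookup shown above to find them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values: a real test may need a different arena size. */
#define ARENA_SIZE ((size_t)1 << 20)
#define ALIGNMENT  16u

/* All memory is carved out of this static arena and never reused. */
static _Alignas(ALIGNMENT) unsigned char arena[ARENA_SIZE];
static size_t arena_used;

static size_t round_up(size_t n)
{
    return (n + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
}

/* Each block is preceded by an ALIGNMENT-sized slot that records the
   requested size, so that bump_realloc knows how much to copy. */
void *bump_malloc(size_t size)
{
    if (size > ARENA_SIZE)
        return NULL;                    /* also keeps round_up from overflowing */
    size_t need = ALIGNMENT + round_up(size);
    if (need > ARENA_SIZE - arena_used)
        return NULL;                    /* arena exhausted */
    unsigned char *block = arena + arena_used;
    memcpy(block, &size, sizeof size);  /* write the size header */
    arena_used += need;
    return block + ALIGNMENT;
}

void *bump_calloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;                    /* nmemb * size would overflow */
    size_t total = nmemb * size;
    void *p = bump_malloc(total);
    if (p != NULL)
        memset(p, 0, total);
    return p;
}

/* Deliberately a no-op: memory is never returned to the arena. */
void bump_free(void *ptr)
{
    (void)ptr;
}

void *bump_realloc(void *ptr, size_t size)
{
    if (ptr == NULL)
        return bump_malloc(size);
    size_t old;
    memcpy(&old, (unsigned char *)ptr - ALIGNMENT, sizeof old);
    assert(old <= ARENA_SIZE);          /* sanity check on the header */
    if (size <= old)
        return ptr;                     /* shrinking: keep the block in place */
    void *p = bump_malloc(size);
    if (p != NULL)
        memcpy(p, ptr, old);
    return p;
}
```

Because nothing is ever freed, this only suits short-lived test programs with modest allocation needs. Whether the renamed symbols end up exported depends on how the tests are linked (for example, GCC's `-rdynamic` option is one way to export them); the thread does not confirm which approach fits this build.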

@stephenrkell
Owner

Looking at an old version of the glibc ld.so, I find that it did have calloc and friends as weak definitions. This holds up to at least version 2.31.

This commit might be relevant.

@stephenrkell
Owner

Actually, this one is the culprit.

@stephenrkell
Owner

I've now reported this here.
