Fix version symbol problem #10

Open
akawashiro opened this issue Jul 6, 2023 · 5 comments

@akawashiro
Member

akawashiro commented Jul 6, 2023

We currently seem to fail to handle the default version symbol, though I'm not sure.

https://github.com/pfnet/sold/pull/6/files#diff-ea346e7232da277fcbfd75eb0102ceed1fca1e6c180b2bd4f64d40773c83717bR527

@xuzijian629
Contributor

I'm debugging this and leaving some notes here.

I temporarily revived the CHECKs disabled in #6 and ran ninja tests.
I got some errors and looked at which (name, soname, version, versym) tuples the checks actually caught.

In all failures, the symbol was __cxa_finalize. The corresponding soname and version were empty, but versym was neither 0 nor 1 (I observed 3, 6, and 9 across the failures).

In the check

CHECK(!soname.empty() && !version.empty()) << " versym=" << special_ver_ndx_to_str(versym);

soname and version are expected to be nonempty whenever versym is not special (0 or 1), so the check fails for these symbols.
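
The invariant this CHECK enforces can be sketched in isolation. The following is a minimal illustration, not sold's actual code: `kVerNdxLocal`, `kVerNdxGlobal`, `IsSpecialVerNdx`, and `VersionInfoConsistent` are hypothetical names, with 0 and 1 standing in for the special versym values mentioned above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the special versym values 0 and 1.
constexpr std::uint16_t kVerNdxLocal = 0;
constexpr std::uint16_t kVerNdxGlobal = 1;

// True when versym is one of the special indices that carry no version name.
bool IsSpecialVerNdx(std::uint16_t versym) {
    return versym == kVerNdxLocal || versym == kVerNdxGlobal;
}

// Mirrors the condition described above: a non-special versym must come
// with both a soname and a version string.
bool VersionInfoConsistent(const std::string& soname,
                           const std::string& version,
                           std::uint16_t versym) {
    if (IsSpecialVerNdx(versym)) return true;
    return !soname.empty() && !version.empty();
}
```

Under this sketch, the failing tuples reported above (empty soname and version with versym 3, 6, or 9) make `VersionInfoConsistent` return false, while any tuple with versym 0 or 1 passes regardless of the strings.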

I tracked __cxa_finalize through AddSym:

sym.index = AddSym(s);

At first, SymtabBuilder::Resolve was called as Resolve(__cxa_finalize, libc.so.6, GLIBC_2.2.5), adding the symbol (name,soname,version,versym)=(__cxa_finalize,libc.so.6,GLIBC_2.2.5,6). This seems correct.
However, it was later also called as Resolve(__cxa_finalize, "", ""), generating the symbol (__cxa_finalize,,,6).

The second versym 6 is copied from the first versym, but where did the first versym come from?
-> It was set by the fallback.

Other symbols processed by this line all had a nonempty soname and version.
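
One way to picture the sequence above is a resolver that remembers the versym by symbol name and reuses it when a later call arrives without version information. The sketch below is a toy model under that assumption, not sold's `SymtabBuilder`; `ToyResolver`, `ResolvedSym`, and the `fresh_versym` parameter are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>

struct ResolvedSym {
    std::string name;
    std::string soname;
    std::string version;
    std::uint16_t versym;
};

class ToyResolver {
public:
    // Assumption: the first call for a name records its versym; a later
    // call for the same name copies that versym even when the soname and
    // version passed in are empty.
    ResolvedSym Resolve(const std::string& name, const std::string& soname,
                        const std::string& version,
                        std::uint16_t fresh_versym) {
        auto it = versym_by_name_.find(name);
        if (it == versym_by_name_.end()) {
            it = versym_by_name_.emplace(name, fresh_versym).first;
        }
        return ResolvedSym{name, soname, version, it->second};
    }

private:
    std::map<std::string, std::uint16_t> versym_by_name_;
};
```

In this model, resolving `__cxa_finalize` first with `libc.so.6` and `GLIBC_2.2.5` and then with empty strings yields a second result that still carries versym 6 but has no soname or version, which is the shape of tuple the CHECK rejects.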

So I changed

} else {

to

} else if (!soname.empty() && !version.empty()) {

and ninja succeeded, though I don't know why.

@xuzijian629
Contributor

ctest --output-on-failure still fails, so this is probably not the correct fix.

@akawashiro
Member Author

./tools/search-symbol.sh __cxa_finalize may work for you.

@akawashiro
Member Author

In my case, with the fix in 5ac192f:

$ ./tools/search-symbol.sh __cxa_finalize
   711: 0003a4c0   540 FUNC    GLOBAL DEFAULT   15 __cxa_finalize@@GLIBC_2.1.3 in /lib32/libc.so.6
   223: 00000000000459a0   662 FUNC    GLOBAL DEFAULT   15 __cxa_finalize@@GLIBC_2.2.5 in /lib/x86_64-linux-gnu/libc.so.6
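
In this output, `@@` between the name and the version marks the default version of a symbol (a single `@` would mark a non-default one). As a small illustration, the hypothetical helper `ParseVersionedName` below splits such a string; it is only a sketch for reading the output above and is not part of sold or of `search-symbol.sh`.

```cpp
#include <cstddef>
#include <string>

struct VersionedName {
    std::string name;
    std::string version;      // empty when the symbol is unversioned
    bool is_default = false;  // true for the "@@" form
};

// Splits "name@@version" or "name@version" into its parts.
VersionedName ParseVersionedName(const std::string& s) {
    VersionedName out;
    const std::size_t at = s.find('@');
    if (at == std::string::npos) {
        out.name = s;  // no version suffix at all
        return out;
    }
    out.name = s.substr(0, at);
    out.is_default = (at + 1 < s.size() && s[at + 1] == '@');
    out.version = s.substr(at + (out.is_default ? 2 : 1));
    return out;
}
```

Applied to `__cxa_finalize@@GLIBC_2.2.5`, the helper yields the name `__cxa_finalize`, the version `GLIBC_2.2.5`, and `is_default` set to true.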

@xuzijian629
Contributor

Thanks!
I confirmed that the above change correctly resolves the __cxa_finalize symbol.

(The notes below are for myself.)
I still got an error in CHECK(!soname.empty() && !version.empty()) when versym is not special (0 or 1).

The versym is 65535, which is NO_VERSION_INFO, set at

Elf_Versym v = versym_ ? versym_[idx] : NO_VERSION_INFO;

I updated is_special_ver_ndx to

bool is_special_ver_ndx(Elf64_Versym versym) {
    return (versym == VER_NDX_LOCAL || versym == VER_NDX_GLOBAL || versym == NO_VERSION_INFO);
}

and now feature_unit_tests passes in python3.8/ubuntu20.04.
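
The role of the sentinel can be sketched as follows. This is a simplified illustration, assuming a nullable versym table as in the line quoted above; `LookupVersym`, `kNoVersionInfo`, and the other constants are stand-ins written for this example (65535 is the value observed above), not sold's definitions.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint16_t kVerNdxLocal = 0;
constexpr std::uint16_t kVerNdxGlobal = 1;
constexpr std::uint16_t kNoVersionInfo = 0xffff;  // 65535

// Returns the versym at idx, or the sentinel when there is no table.
std::uint16_t LookupVersym(const std::uint16_t* versym_table, std::size_t idx) {
    return versym_table ? versym_table[idx] : kNoVersionInfo;
}

// Treats the sentinel like the special indices: none of them names a version.
bool IsSpecialVerNdx(std::uint16_t versym) {
    return versym == kVerNdxLocal || versym == kVerNdxGlobal ||
           versym == kNoVersionInfo;
}
```

With this shape, a symbol from an object that has no versym table gets 65535 and is then classified as special, so a check that requires a soname and version only for non-special values no longer fires for it.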

Still, the cuBLASLt and cuBLAS-related tests are failing on python3.8/ubuntu20.04, so I will continue investigating.
