Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

android: use proper types for dirent.d_ino, dirent.d_off, stat.st_mode and stat64.st_mode #4216

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

thunder-coding
Copy link

@thunder-coding thunder-coding commented Dec 21, 2024

This bug was noticed in termux/termux-packages#22609

Description

Fixes types for dirent.d_ino, dirent.d_off, stat.st_mode and stat64.st_mode on Android.

The hardcoded types are incorrect and this PR changes them to how they are declared for Linux. Also fixes the incorrect type errors in termux/termux-packages#22609, where I found the bug (The builds are still failing due to missing posix_spawn* support here, but that's a different issue.)

This patch should likely also be backported to 0.2.x release, I have the patchset already applied on my other branch. Will link another PR soon.
It seems that this patch applies cleanly to libc-0.2 branch, I was backporting to an older release earlier. @rustbot label stable-nominated should be it according to CONTRIBUTING.md

Sources

mode_t:
mode_t is defined as unsigned int on 64-bit android platforms:
https://android.googlesource.com/platform/prebuilts/ndk/+/dev/platform/sysroot/usr/include/sys/types.h#64
https://android.googlesource.com/platform/prebuilts/ndk/+/dev/platform/sysroot/usr/include/asm-generic/posix_types.h#18
and is defined as unsigned short on 32-bit platforms:
https://android.googlesource.com/platform/prebuilts/ndk/+/dev/platform/sysroot/usr/include/i686-linux-android/asm/posix_types_32.h#9
https://android.googlesource.com/platform/prebuilts/ndk/+/dev/platform/sysroot/usr/include/arm-linux-androideabi/asm/posix_types.h#9

off_t:
https://android.googlesource.com/platform/bionic/+/main/docs/32-bit-abi.md mentions that off_t is i32 on 32-bit and i64 on 64-bit

ino_t:
https://android.googlesource.com/platform/prebuilts/ndk/+/dev/platform/sysroot/usr/include/dirent.h#64
https://android.googlesource.com/platform/prebuilts/ndk/+/dev/platform/sysroot/usr/include/sys/types.h#70
https://android.googlesource.com/platform/prebuilts/ndk/+/dev/platform/sysroot/usr/include/asm-generic/posix_types.h#14
Due to "historic accident", although ino_t is unsigned long on all platforms, the actual type used is unsigned long on 64-bit and uint64_t on 32-bit. I think we should just declare ino_t to be of type unsigned long on 64-bit and uint64_t on 32-bit as that's how code using these types is supposed to be used.

Checklist

  • Relevant tests in libc-test/semver have been updated
  • No placeholder or unstable values like *LAST or *MAX are
    included (see #3131)
  • Tested locally (cd libc-test && cargo test --target mytarget);
    especially relevant for platforms that may not be checked in CI
    I tried building the tests locally, it seems like the tests are already broken for Android. But I was able to fix some of the issues in [DO NOT MERGE] bump(main/fish): 4.0b1 termux/termux-packages#22609

@rustbot
Copy link
Collaborator

rustbot commented Dec 21, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @tgross35 (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@tgross35
Copy link
Contributor

Thanks for the detailed sources. Is the following delta accurate?

  • unsigned int -> unsigned short (32 -> 16) for mode_t on 32 bit
  • unsigned long -> uint64_t (32 -> 64) for ino_t on 32 bit
  • long -> int32_t (32 -> 32, nop) for off_t on 32 bit

These issues seem reasonably severe so I am surprised it has not been noticed before (@maurer do you know of anything?). I guess the other fields at least keep the relevant fields here correctly aligned?

This patch should likely also be backported to 0.2.x release, I have the patchset already applied on my other branch. Will link another PR soon. It seems that this patch applies cleanly to libc-0.2 branch, I was backporting to an older release earlier. @rustbot label stable-nominated should be it according to CONTRIBUTING.md

For reference, the command can't be in backticks :)

@rustbot label stable-nominated

@rustbot rustbot added the stable-nominated This PR should be considered for cherry-pick to libc's stable release branch label Dec 31, 2024
Copy link

@maurer maurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing up mode_t/ino_t might be worth doing, but the changes you're making to these structs will actively break things on certain platforms, as I've enumerated.

You can't just assume that dirent/stat & co will have the same layout they have in other environments.

You can't reset ino_t to be u64 because it got represented as u64 in dirent - it's used in other places, e.g. FTSENT where it must have its original value. While libc doesn't bind this struct, other crates do, and they use libc's ino_t type assuming that it matches the C variant.

@@ -30,7 +32,7 @@ s! {
pub st_dev: c_ulonglong,
__pad0: [c_uchar; 4],
__st_ino: crate::ino_t,
pub st_mode: c_uint,
pub st_mode: crate::mode_t,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in fact an unsigned int and not a mode_t.

Note that stat and stat64 are actually the same on Android.

@@ -52,7 +54,7 @@ s! {
pub st_dev: c_ulonglong,
__pad0: [c_uchar; 4],
__st_ino: crate::ino_t,
pub st_mode: c_uint,
pub st_mode: crate::mode_t,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is supposed to be unsigned int, not mode_t.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this mode_t to allow code using rust libc to just use the time mode_t to store the type like on other *nix platforms instead of having to deal with Android in other way.

@@ -528,16 +526,16 @@ s_no_extra_traits! {
}

pub struct dirent {
pub d_ino: u64,
pub d_off: i64,
pub d_ino: crate::ino_t,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct, ino_t is only used when LP64 is enabled (int is 32-bit, long int is 64-bit, pointers are 64-bit).

In all other cases (mostly 32-bit CPUs), this is an explicit u64, not an ino_t.

We could split dirent by architecture if we really wanted to use ino_t here, but on LP64 platforms, ino_t happens to be u64 which is why this was working.

pub d_ino: u64,
pub d_off: i64,
pub d_ino: crate::ino_t,
pub d_off: crate::off_t,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This either needs to remain unmodified or become off64_t - it is unconditionally 64-bits on android

pub d_reclen: c_ushort,
pub d_type: c_uchar,
pub d_name: [c_char; 256],
}

pub struct dirent64 {
pub d_ino: u64,
pub d_off: i64,
pub d_ino: crate::ino64_t,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment applies here - dirent64 and dirent are the same struct on Android.

@thunder-coding
Copy link
Author

You can't reset ino_t to be u64 because it got represented as u64 in dirent - it's used in other places, e.g. FTSENT where it must have its original value. While libc doesn't bind this struct, other crates do, and they use libc's ino_t type assuming that it matches the C variant.

Oh yeah, I didn't look into other uses of ino_t in FTSENT. I don't think I am capable of taking this PR further. This is just way too complicated for me. Feel free to take over this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-android O-riscv O-unix O-x86 S-waiting-on-review stable-nominated This PR should be considered for cherry-pick to libc's stable release branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants