-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emscripten metadata file size is wrong #131467
Emscripten metadata file size is wrong #131467
Comments
uhh does this reproduce on the latest nightly? |
Yes, unfortunately it still reproduces on My current workaround is the following: - let bytes = std::fs::read(path)?;
+ struct FileNoSize(std::fs::File);
+
+ impl std::io::Read for FileNoSize {
+ fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
+ self.0.read(buf)
+ }
+ }
+
+ let mut file = FileNoSize(std::fs::File::open(path)?);
+
+ let mut bytes = Vec::new();
+ std::io::Read::read_to_end(&mut file, &mut bytes)?; |
It seems we just call fstat64? It's not clear how that's not working out for us. |
We could have an incorrect definition of the Emscripten stat64 structure? |
emscripten's definition of |
Pyodide v0.25 uses Emscripten 3.1.46, so that would be after the change. The fix to libc was in rust-lang/libc@63b0d67 and released in v0.2.148, which Rust is already using. Huh. |
When building with a fixed version of |
Do you think that using |
Sure, give it a try! |
Yes, How could this footgun be avoided? |
Not sure; maybe the emscripten version used in |
Perhaps we could add some warning to the target list about this issue? Could someone point me in the direction of which file to edit? Cc @hoodmane: Pyodide packages using Rust should be built with build-std to ensure that they use the correct Emscripten ABI, otherwise e.g. file size metadata is garbage |
@juntyr Here, you would need to add an entry because the emscripten target doesn't have a doc for it yet, but you can add appropriate caveats to this page and in this folder: |
@kleisauke It seems you needed to also update https://github.com/rust-lang/rust/blob/master/src/ci/docker/scripts/emscripten.sh to finish the patch |
Ah, I think this is also the reason why #116655 failed without I just opened (draft) PR rust-lang/libc#3962 for this in libc to bump emsdk and make the changes in rust-lang/libc@63b0d67 unconditional. |
Thanks @juntyr for reporting this! I will add a test for this to Pyodide.
Yeah does sound like we should add Seems like I got rid of It would be really nice if Emscripten could adopt an abi versioning scheme so that other projects could know what is going on. |
Otherwise we are linking a copy of the standard library built with some particular version of emscripten selected by the rust compiler that does not match the version we are using. This can lead to an ABI mismatch. See rust-lang/rust#131467
By the way, is it possible for us to build the rust standard library once against our emscripten compiler and point all packages we build to this one copy of the rust stdlib? That way we would only have to rebuild when we change the version of rust or emscripten as opposed to every time we build any rust package. It would save a lot of CI time. |
When building the Rust crates, you could pass I'm not sure how to prepare the sysroot itself, but there is a crate https://crates.io/crates/cargo-sysroot that supports building with custom sysroots, so it might be useful to look at. |
…ersion, r=Kobzol emscripten: Use the latest emsdk 3.1.68 This should fix our precompiled std being unsound in `std::fs` code. Should resolve rust-lang#131467 I hope.
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-high |
Rollup merge of rust-lang#131533 - workingjubilee:update-emscripten-version, r=Kobzol emscripten: Use the latest emsdk 3.1.68 This should fix our precompiled std being unsound in `std::fs` code. Should resolve rust-lang#131467 I hope.
…, r=jieyouxu Add wasm32-unknown-emscripten platform support document This PR adds the platform support document for wasm32-unknown-emscripten, and adds a warning about breaks in Emscripten ABI compatibility (see rust-lang#131467). I mostly based the document off the wasm32-unknown-unknown docs and some of the information may still be missing (e.g. who's the target maintainer) or outdated (e.g. the build requirements). I still hope that it provides a good starting point. r? `@workingjubilee`
Rollup merge of rust-lang#131582 - juntyr:emscripten-platform-support, r=jieyouxu Add wasm32-unknown-emscripten platform support document This PR adds the platform support document for wasm32-unknown-emscripten, and adds a warning about breaks in Emscripten ABI compatibility (see rust-lang#131467). I mostly based the document off the wasm32-unknown-unknown docs and some of the information may still be missing (e.g. who's the target maintainer) or outdated (e.g. the build requirements). I still hope that it provides a good starting point. r? `@workingjubilee`
PR #131533 unfortunately didn't resolve the issue. https://github.com/rust-lang/rust/blob/master/src/ci/docker/scripts/emscripten.sh looks like a remnant of the now-removed We may need to either install emsdk in this Dockerfile: Or alternatively, always enforce the Standalone reproducer: $ curl https://sh.rustup.rs -sSf | sh -s -- -y --profile minimal --target wasm32-unknown-emscripten --default-toolchain nightly-2024-12-09 --component rust-src
$ source "$HOME/.cargo/env"
$ cargo new foo
$ cd foo
$ cat <<EOT > src/main.rs
use std::io;
use std::fs;
fn main() -> io::Result<()> {
let metadata = fs::metadata("hello.txt")?;
println!("{}", metadata.len());
Ok(())
}
EOT
$ RUSTFLAGS="-Clink-arg=-sNODERAWFS" cargo build --release -Zbuild-std=panic_abort,std --target wasm32-unknown-emscripten
$ echo "Hello, world!" > hello.txt
$ node target/wasm32-unknown-emscripten/release/foo.js
14
$ RUSTFLAGS="-Clink-arg=-sNODERAWFS" cargo build --release --target wasm32-unknown-emscripten
$ node target/wasm32-unknown-emscripten/release/foo.js
1733771096 |
does this still repro on nightly? I am getting a report this still repros on 1.84.0 and I checked and it seems to, but I had to do so many environment tweaks to get emcc and node.js in my environment I'm not sure. I thought this should have been at least included by #134502 landing? Is it fixed on beta? |
@workingjubilee, I've triggered a run against the 2025-01-11 nightly build in pyodide/pyodide#5249 – let's see how it goes |
…mulacrum Remove emsdk version update from 1.84.0 relnotes See [this comment](rust-lang#131467 (comment)). The reproducer in that comment does indeed show that rustup's `rust-std` component is still compiled with the old emscripten ABI because libc's config flag `emscripten_new_stat_abi` is not set. rust-lang#131533 presumably had no effect because the wrong CI file was modified. So nothing has changed since 1.83.0. The PR author (workingjubilee) is currently on vacation. Also the issue rust-lang#131467 should be reopened.
…mulacrum Remove emsdk version update from 1.84.0 relnotes See [this comment](rust-lang#131467 (comment)). The reproducer in that comment does indeed show that rustup's `rust-std` component is still compiled with the old emscripten ABI because libc's config flag `emscripten_new_stat_abi` is not set. rust-lang#131533 presumably had no effect because the wrong CI file was modified. So nothing has changed since 1.83.0. The PR author (workingjubilee) is currently on vacation. Also the issue rust-lang#131467 should be reopened.
Rollup merge of rust-lang#135266 - kadiwa4:no_emsdk_update, r=Mark-Simulacrum Remove emsdk version update from 1.84.0 relnotes See [this comment](rust-lang#131467 (comment)). The reproducer in that comment does indeed show that rustup's `rust-std` component is still compiled with the old emscripten ABI because libc's config flag `emscripten_new_stat_abi` is not set. rust-lang#131533 presumably had no effect because the wrong CI file was modified. So nothing has changed since 1.83.0. The PR author (workingjubilee) is currently on vacation. Also the issue rust-lang#131467 should be reopened.
This revises commit 63b0d67 to assume that Emscripten 3.1.42 or later is being used whenever `emcc` is not available. Since Emscripten 3.1.42 was released on June 23, 2023, the majority of users are expected to have upgraded to a more recent version. Resolves: rust-lang/rust#131467.
I just opened PR rust-lang/libc#4243 for this. The compatibility notes removed in #135266 could be reused in a future Rust version that includes this, potentially 1.86.0. |
…Kobzol Remove remnant of asmjs See: rust-lang#131467 (comment).
…Kobzol Remove remnant of asmjs See: rust-lang#131467 (comment).
Rollup merge of rust-lang#135476 - kleisauke:remove-asmjs-remnant, r=Kobzol Remove remnant of asmjs See: rust-lang#131467 (comment).
I tried this code:
for a file that is <1MB large.
I expected to see this happen: The read should allocate a vec that is as large as the file content. The heap size in Emscripten should not meaningfully change.
Instead, this happened: The Emscripten heap grows by 1.8GB
I looked into the implementation of fs::read. Since it uses the file metadata size to pre-allocate the output vector, I debug printed the file size, which is around 1.8GB.
Since my Rust code is running inside Pyodide 0.25, I also checked Python's os.stat of the file, which reports the correct file size. Reading the file in Python also succeeds without the enormous over-allocation. It seems that this is a Rust-specific bug in getting the correct file size information from Emscripten.
Meta
My Rust toolchain is pinned to nightly-2024-07-21, the last nightly for 1.81.
The text was updated successfully, but these errors were encountered: