Compilation of curve25519-dalek is extremely slow with certain flags used by cargo-fuzz #95240

Open
ruuda opened this issue Mar 23, 2022 · 8 comments
Labels
A-incr-comp Area: Incremental compilation
C-bug Category: This is a bug.
E-needs-investigation Call for participation: This issue needs some investigation to determine current status
E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example
I-compiletime Issue: Problems and improvements with respect to compile times.
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@ruuda
Contributor

ruuda commented Mar 23, 2022

Compare these two, a normal --release compile of curve25519-dalek 3.2.0:

$ cargo clean
$ perf stat cargo +nightly-2022-03-22 build --release
(...)
 Performance counter stats for 'cargo +nightly-2022-03-22 build --release':

         13,532.26 msec task-clock:u              #    2.244 CPUs utilized          
                 0      context-switches:u        #    0.000 /sec                   
                 0      cpu-migrations:u          #    0.000 /sec                   
           396,009      page-faults:u             #   29.264 K/sec                  
    44,391,387,755      cycles:u                  #    3.280 GHz                    
    73,252,531,349      instructions:u            #    1.65  insn per cycle         
    15,267,527,160      branches:u                #    1.128 G/sec                  
       252,263,876      branch-misses:u           #    1.65% of all branches        

       6.030274305 seconds time elapsed

      12.564481000 seconds user
       0.965248000 seconds sys

And now with some particular flags (these are some of the flags that cargo-fuzz adds):

$ cargo clean
$ perf stat cargo +nightly-2022-03-22 rustc -- --cfg fuzzing -Cpasses=sancov-module -Clink-dead-code -Zsanitizer=address -Cllvm-args=-sanitizer-coverage-trace-compares -C codegen-units=1 -C opt-level=3
(...)
 Performance counter stats for 'cargo +nightly-2022-03-22 rustc -- --cfg fuzzing -Cpasses=sancov-module -Clink-dead-code -Zsanitizer=address -Cllvm-args=-sanitizer-coverage-trace-compares -C codegen-units=1 -C opt-level=3':

        637,839.23 msec task-clock:u              #    1.007 CPUs utilized          
                 0      context-switches:u        #    0.000 /sec                   
                 0      cpu-migrations:u          #    0.000 /sec                   
           644,865      page-faults:u             #    1.011 K/sec                  
 2,219,950,777,451      cycles:u                  #    3.480 GHz                    
 3,692,455,090,375      instructions:u            #    1.66  insn per cycle         
   789,078,921,069      branches:u                #    1.237 G/sec                  
     2,231,900,259      branch-misses:u           #    0.28% of all branches        

     633.489309090 seconds time elapsed

     635.494820000 seconds user
       1.518585000 seconds sys

It takes more than 100× as long as a regular release build. This time is spent compiling curve25519-dalek itself; the dependencies compile within seconds. I initially discovered this when trying to fuzz a package that transitively depends on curve25519-dalek. I haven’t yet tried to minimize the set of flags to see whether one in particular is the culprit.
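
As a quick check of these figures, the ratios can be computed directly from the two perf stat summaries. This is only a small sketch: the `ratio` helper is a name made up for illustration, and the numbers are copied from the output above.

```shell
# Hypothetical helper: print a / b rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Wall-clock time: instrumented build vs. plain --release build.
echo "elapsed: $(ratio 633.489309090 6.030274305)x"
# User CPU time, which does not depend on how well the build parallelizes.
echo "user:    $(ratio 635.494820000 12.564481000)x"
```

The elapsed ratio (roughly 105×) is larger than the user-time ratio (roughly 51×) because the release build kept about 2.2 CPUs busy while the instrumented build used about one.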

To reproduce, clone https://github.com/dalek-cryptography/curve25519-dalek and check out tag 3.2.0. I’m running this on x86_64 Linux.

Meta

rustc +nightly-2022-03-22 --version --verbose:

rustc 1.61.0-nightly (3c17c84a3 2022-03-21)
binary: rustc
commit-hash: 3c17c84a386e7badf6b2c6018d172496b3a28a04
commit-date: 2022-03-21
host: x86_64-unknown-linux-gnu
release: 1.61.0-nightly
LLVM version: 14.0.0

I can also reproduce with nightly 2022-03-15. I haven’t tried any others so far.

@ruuda ruuda added the C-bug Category: This is a bug. label Mar 23, 2022
@ruuda
Contributor Author

ruuda commented Mar 23, 2022

After some experimentation, the following combination of flags alone results in about a 10× slowdown (60 seconds elapsed), but not the 100× seen with the full set of flags. I haven’t tried all combinations because each attempt takes ~10 minutes.

-C codegen-units=1 -C opt-level=3
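
One way to narrow this down without trying every combination is to drop one flag at a time from the full set and time each build, which takes six builds instead of 64. The following is a sketch that has not been run for this report: `leave_one_out` and `time_builds` are made-up helper names, and the flag list is copied from the slow invocation in the original report (with `-C name=value` written as `-Cname=value`, which rustc accepts as well).

```shell
# Flags from the slow invocation in the original report.
FLAGS="-Cpasses=sancov-module -Clink-dead-code -Zsanitizer=address -Cllvm-args=-sanitizer-coverage-trace-compares -Ccodegen-units=1 -Copt-level=3"

# Print one line per experiment: the full flag set with one flag left out.
leave_one_out() {
    for skip in $FLAGS; do
        kept=""
        for flag in $FLAGS; do
            [ "$flag" = "$skip" ] || kept="$kept $flag"
        done
        printf '%s\n' "${kept# }"
    done
}

# Do a clean build for each reduced flag set and report the elapsed seconds.
time_builds() {
    leave_one_out | while IFS= read -r flags; do
        cargo clean
        start=$(date +%s)
        # $flags is deliberately unquoted so that it splits into separate arguments.
        cargo +nightly-2022-03-22 rustc -- --cfg fuzzing $flags < /dev/null
        end=$(date +%s)
        echo "$((end - start))s with: $flags"
    done
}
```

Calling `time_builds` in a checkout of curve25519-dalek 3.2.0 would run the six builds. A flag whose removal makes the build fast again is necessary for the slowdown, though it may only matter in combination with others.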

@the8472
Member

the8472 commented Mar 23, 2022

Setting CGUs to 1 reduces parallelization opportunities, so a 10x slowdown in wall-time wouldn't be surprising, but the user-time shouldn't go up significantly. The 50x increase in user-time probably comes from the other parameters.

@saethlin
Member

Some time ago I mentioned the same thing on Zulip, and the response was basically crickets and "that's expected" 🤷: https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/Compile.20time.20blowup.20associated.20with.20asan.3F/near/262968197

For what it's worth, I don't think this is okay, but I suspect that the real problem here is an interaction between address sanitizer and optimization passes.

@lqd
Member

lqd commented Mar 25, 2022

> For what it's worth, I don't think this is okay, but I suspect that the real problem here is an interaction between address sanitizer and optimization passes.

Any of these changes makes the slowdown smaller:

  • removing asan
  • having more than 1 CGU
  • having a lower opt-level

Here's a breakdown of the slowest combination from the OP. The time seems to be spent in LLVM, so it may be worth opening an issue there.

@ruuda
Contributor Author

ruuda commented Mar 25, 2022

The project in which I discovered this depends on ~700 crates, and curve25519-dalek is the one that stands out as taking ages to build. Of course some compilation slowdown is expected with these flags, but curve25519-dalek is affected disproportionately, by a large margin. I suspect it triggers some pathological case, maybe something accidentally quadratic somewhere?

> it seems to be coming from LLVM, so maybe opening an issue there could be interesting

Thanks for timing this. Yeah, that makes sense. Is there anything we can do to write a better bug report for LLVM? Dump the IR emitted by rustc and reproduce the slowdown with just opt or something like that?
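
One possible approach, sketched here under several assumptions, is to have rustc write out unoptimized LLVM IR and then time LLVM's `opt` on it. `--emit=llvm-ir` and `-C no-prepopulate-passes` are existing rustc options and `-time-passes` is an existing `opt` option, but the helper names and the choice of `-O3` are assumptions. The `opt` binary would need to match rustc's LLVM version (14 here), and this pipeline does not necessarily include the sanitizer instrumentation that rustc adds, so it may not reproduce the full slowdown.

```shell
# Hypothetical helper: build with the flags from the report, but emit LLVM IR
# before the optimization pipeline has run.
dump_ir() {
    cargo +nightly-2022-03-22 rustc -- --cfg fuzzing \
        -Cpasses=sancov-module -Clink-dead-code -Zsanitizer=address \
        -Cllvm-args=-sanitizer-coverage-trace-compares \
        -C codegen-units=1 -C opt-level=3 \
        --emit=llvm-ir -C no-prepopulate-passes
}

# Hypothetical helper: run the O3 pipeline on one .ll file, discard the
# result, and print per-pass timings.
time_opt() {
    opt -O3 -time-passes "$1" -o /dev/null
}
```

After `dump_ir`, the `.ll` file typically ends up under `target/debug/deps/`, and its path can be passed to `time_opt`. If one pass dominates the timing report, the IR file plus that `opt` command line would make a self-contained reproduction for an LLVM bug report.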

@lqd
Member

lqd commented Mar 25, 2022

I’m not sure myself, but maybe our LLVM expert @nikic could help answer such a question?

@sgued
Contributor

sgued commented Aug 17, 2022

It seems that -C incremental=... is the cause of the slowness. See this repo, which contains the example file found here and two scripts, slow.sh and fast.sh, whose only difference is the -C incremental=... flag.

On my machine, with rustc 1.65.0-nightly (86c6ebee8 2022-08-16), the compilation time goes from 0.5s to more than 50s.
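
The same comparison can be tried on the curve25519-dalek reproduction from the top of this issue by toggling incremental compilation through Cargo's `CARGO_INCREMENTAL` environment variable. This is a sketch under assumptions: `timed_build` is a made-up helper, the flags are copied from the original report, and it has not been verified that the slowdown reproduces this way. Cargo's dev profile enables incremental compilation by default and the release profile does not, so the `cargo rustc --` invocation in the original report may have been building incrementally while the `--release` baseline was not.

```shell
# Flags from the slow invocation in the original report.
FLAGS="-Cpasses=sancov-module -Clink-dead-code -Zsanitizer=address -Cllvm-args=-sanitizer-coverage-trace-compares -Ccodegen-units=1 -Copt-level=3"

# Hypothetical helper: clean build with CARGO_INCREMENTAL set to $1 (0 or 1),
# printing the elapsed wall-clock seconds.
timed_build() {
    cargo clean
    start=$(date +%s)
    # $FLAGS is deliberately unquoted so that it splits into separate arguments.
    CARGO_INCREMENTAL="$1" cargo +nightly-2022-03-22 rustc -- --cfg fuzzing $FLAGS
    end=$(date +%s)
    echo "CARGO_INCREMENTAL=$1: $((end - start))s"
}
```

Running `timed_build 1` followed by `timed_build 0` in the same checkout would show whether incremental compilation accounts for the difference on a given machine.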

@the8472 the8472 added the I-compiletime Issue: Problems and improvements with respect to compile times. label Jan 29, 2024
@oherrala

oherrala commented Nov 6, 2024

I hit this same issue, and thanks to #95240 (comment) I was able to speed up fuzzer compilation drastically with:

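# Run a cargo-fuzz target with LTO and incremental compilation disabled.
# $1: fuzz target name, $2: libFuzzer time budget in seconds.
run_fuzzer() {
    CARGO_PROFILE_RELEASE_LTO=false \
    CARGO_INCREMENTAL=0 \
    cargo +nightly fuzz run "$1" -j4 -- -max_total_time="${2}" -print_final_stats=1
}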

LTO is disabled because of rust-fuzz/cargo-fuzz#384.

@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example E-needs-investigation Call for participation: This issue needs some investigation to determine current status A-incr-comp Area: Incremental compilation and removed needs-triage-legacy labels Nov 12, 2024