Intended audience
  • Prospective-to-intermediate Rust developers who worry about slow compilation times, the scale thereof, and preventative practices.
  • Advanced Rust developers who can help me improve my own compilation times.
Origin My experience with Rust.
Mood Somewhat successful; somewhat disappointed.

Rust is known for slow compilation times. I spent a long time trying to improve incremental test build times for my project git-branchless in https://github.com/arxanas/git-branchless/pull/650. This is a discussion of the results.

Executive summary

  • Incremental testing refers only to changing the integration test code and rebuilding. The source code remains unchanged.
  • In the end, I was able to reduce incremental test time from ~6.9sec to ~1.7sec (~4x). Other techniques to improve compilation time produced marginal or no improvements.

For your reference, hereā€™s the best articles for conceptual understanding of the Rust build model and improving compilation time:

Project details

Hereā€™s how big my project git-branchless was before the pull request:

  • git-branchless/src: 12060 lines.
  • git-branchless/tests: 12897 lines.
    • Note that it relies heavily on snapshot testing, so most of these lines of code are multiline string literals.
  • git-branchless-lib/src: 12406 lines.

Itā€™s permissible that compilation and linking time should be slow, but in order to optimize my development feedback loop, I need test times to be fast. Specifically, I want to be able to iterate on the development of a single certain test. (IntelliJ has a nice feature to automatically re-run a given test when there are source changes, but the utility is diminished when it takes too long to recompile the test.)

To start, building the test_amend binary (which tests the amend subcommand) after a single comment addition to the test file itself (no library changes!) takes ~6.9sec:

$ hyperfine --warmup 3 --prepare 'echo "// @nocommit test" >>git-branchless/tests/command/test_amend.rs' 'cargo test --test mod --no-run'   
Benchmark 1: cargo test --test mod --no-run
  Time (mean Ā± Ļƒ):      6.927 s Ā±  0.123 s    [User: 7.652 s, System: 1.738 s]
  Range (min ā€¦ max):    6.754 s ā€¦  7.161 s    10 runs

Staggering! This is not a large project, and weā€™re only making changes to the tests, so it shouldnā€™t require so much iteration time.

Splitting into more crates

In the pull request, I extract code into an additional nine crates, resulting in an incremental test build time of ~1.7sec (~4x improvement):

$ hyperfine --warmup 3 --prepare 'echo "// @nocommit test" >>git-branchless/tests/test_amend.rs' 'cargo test --test test_amend --no-run'   
Benchmark 1: cargo test --test test_amend --no-run
  Time (mean Ā± Ļƒ):      1.771 s Ā±  0.012 s    [User: 1.471 s, System: 0.330 s]
  Range (min ā€¦ max):    1.750 s ā€¦  1.793 s    10 runs

In relative terms, this is a significant improvement, but in absolute terms, this is rather poor, in my opinion. I would expect 100-200ms to parse, expand macros, typecheck, and generate code (with no optimizations) for a file of this size (~1k lines, mostly long string values).

Furthermore, splitting into multiple crates makes it harder to distribute my project via crates.io:

  • Each crate must individually be published to crates.io (you cannot publish a crate with git dependencies to crates.io)
    • So I have to manage versioning and licensing of each crate.
  • According to my user survey, the majority of my users install via cargo. For the question ā€œHow did you install git-branchless?ā€, the responses are as follows (as of 2023-02-07):
    • 7/18 (38.9%): via cargo install git-branchless
    • 4/18 (22.2%): via traditional system package manager
    • 4/18 (22.2%): via Nix or NixOS
    • 2/18 (11.1%): via cloning the repository and manually building and installing
    • 1/18 (5.6%): via GitHub Actions build artifact

Itā€™s unfortunate that I have to expose internal modules publicly on crates.io just to get reasonable compilation times.

No-op timing

At this point, I measure the no-op time to be ~350ms for a smaller test crate with few dependencies:

$ hyperfine --warmup 3 'cargo test -p git-branchless-test --no-run'      	 
Benchmark 1: cargo test -p git-branchless-test --no-run
  Time (mean Ā± Ļƒ): 	344.2 ms Ā±   3.5 ms	[User: 246.2 ms, System: 91.9 ms]
  Range (min ā€¦ max):   340.4 ms ā€¦ 351.0 ms	10 runs

This is surprising. I would expect the overhead for a no-op build to be similar to git status, maybe 15ms:

$ hyperfine --warmup 3 'git status'
Benchmark 1: git status
  Time (mean Ā± Ļƒ):  	13.5 ms Ā±   2.5 ms	[User: 4.9 ms, System: 6.1 ms]
  Range (min ā€¦ max):	11.1 ms ā€¦  24.7 ms	197 runs

There might be some kind of deeper issue here. The cargo nextest documentation warns that some anti-malware systems can introduce artificial startup latency when checking executables:

A typical sign of this happening is even the simplest of tests in cargo nextest run taking more than 0.2 seconds.

As per the documentation, I marked my terminal software as ā€œDeveloper Toolsā€ under macOS, but couldnā€™t reduce the no-op compilation time.

No more profiling?

I tried with a subcommand crate that I made recently which should still have few dependencies:

$ hyperfine --warmup 3 --prepare 'echo "// @nocommit test" >>git-branchless-test/tests/test_test.rs' 'cargo test -p git-branchless-test --no-run'
Benchmark 1: cargo test -p git-branchless-test --no-run
  Time (mean Ā± Ļƒ):  	1.855 s Ā±  0.034 s	[User: 1.476 s, System: 0.335 s]
  Range (min ā€¦ max):	1.831 s ā€¦  1.939 s	10 runs

The cargo build timings for the incremental build doesnā€™t help. It just shows that the test Iā€™m building takes 100% of the spent time.

The timing graph for building the `git-branchless-test` crate.

The timing graph for building the git-branchless-test crate.

No more ideas?

Some ideas that didnā€™t work:

  • Combining integration tests into a single binary. (Iā€™m happy to run individual binaries as necessary, anyways.)
  • Reducing the top monomorphization sites them (AsRef calls, etc. appeared in cargo llvm-lines, but reducing them didnā€™t seem to improve compilation times).
  • Using sccache.
  • Using cargo-nextest.
  • Using zld/mold/sold as the linker.
  • Setting profile.dev.split-debuginfo = ā€œunpackedā€ (for macOS).
  • Setting profile.dev.build-override.opt-level = 3.
    • Unfortunately, there are lots of procedural macros, particularly tracingā€™s #[instrument].
  • Setting profile.dev.debug = 0. This actually does save ~20ms, but itā€™s not enough of an improvement by itself.

So now Iā€™m stuck. Thatā€™s the most I can improve Rust incremental test times. Let me know if you have any other ideas.

The following are hand-curated posts which you might find interesting.

Want to see more of my posts? Follow me on Twitter or subscribe via RSS.

Comments