Add BLAKE3 hashing algorithm via Rust interop #12416

Open
wants to merge 1 commit into
base: master

silvanshade (Contributor) commented Feb 4, 2025

Motivation

This uses the high-performance multi-threaded Rust-based routines from the blake3 crate.

I believe this is a better way to implement BLAKE3 support in Nix than the approach used in #12379.

The downside to implementing BLAKE3 via Rust interop is that some additional supporting framework needs to be defined in order to integrate Rust into the build system.

This turned out to be less difficult than I anticipated (see #11999 (comment)), and it was actually simpler to re-implement with the newer Meson build system than in my original (unreleased) version, which used the GNU tooling.

One advantage to this approach is it illustrates a pattern that can be used to integrate further Rust code into the project.

Indeed, in my earlier comments I hinted at working on a bidirectional binding interface which also exposed the Nix C++ API to Rust, but I have purposely kept that out of this PR for simplicity's sake; I'm just mentioning it as a future possibility.

@Ericson2314

Context

See #10600 #11999 #12379

Design Considerations

The main goal when exposing the BLAKE3 interface from Rust to C++ was to maintain a safe interface across the board. This means avoiding raw pointers and instead working with references and smart pointers and the definitions exposed under the ::rust namespace (like boxes and slices) via cxx and rust/cxx.h.

This led to a complication with how to integrate the BLAKE3 hasher structure into Ctx as a union. The problem with using a union is that we would have to delay initialization of the BLAKE3 hasher context. This means we would need to wrap the context in something like a std::unique_ptr<std::optional<BLAKE3Ctx>>. This isn't exactly the worst thing ever, but it would mean unnecessary checks for every useful operation.
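To make that cost concrete, here is a minimal sketch of the shape delayed initialization would force. `Blake3State` and `OptionalCtx` are hypothetical names invented for illustration (the real hasher would come from the Rust side via cxx), and the mixing function is a toy, not BLAKE3.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <stdexcept>
#include <string_view>

// Hypothetical stand-in for the Rust-side hasher. It has no default
// constructor, mimicking a type that must be explicitly initialized.
struct Blake3State {
    explicit Blake3State(std::uint64_t seed) : acc(seed) {}

    void update(std::string_view data) {
        // Toy mixing step for the demo only; this is NOT BLAKE3.
        for (unsigned char c : data) acc = (acc * 1099511628211ULL) ^ c;
    }

    std::uint64_t acc;
};

// Delayed initialization: the state may or may not exist yet, so every
// operation has to check before touching it.
struct OptionalCtx {
    std::unique_ptr<std::optional<Blake3State>> state =
        std::make_unique<std::optional<Blake3State>>();

    void init(std::uint64_t seed) { state->emplace(seed); }

    void update(std::string_view data) {
        if (!state->has_value()) // the check paid on every call
            throw std::logic_error("hasher used before init");
        (*state)->update(data);
    }
};
```

Calling `update` before `init` throws here; the point is only that the branch exists on every call, not that this is how the check would be written in Nix.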

Instead, I've opted to redesign Ctx as a class HashCtx. This also allowed me to introduce an extra method on the BLAKE3Ctx subclass: BLAKE3Ctx::update_mmap.
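A rough sketch of the class-based layout, under stated assumptions: the method names `update` and `finish`, the `ToyBlake3Ctx` class, and the `update_mmap` signature below are all illustrative guesses, and a toy FNV-1a hash stands in for BLAKE3 so the block stays self-contained.

```cpp
#include <cstdint>
#include <memory>
#include <string_view>

// Illustrative base class: each algorithm is fully initialized by its
// own constructor, so no "is it initialized yet?" check is needed.
class HashCtx {
public:
    virtual ~HashCtx() = default;
    virtual void update(std::string_view data) = 0;
    virtual std::uint64_t finish() = 0;
};

// Toy FNV-1a hasher standing in for the BLAKE3 subclass.
class ToyBlake3Ctx : public HashCtx {
    std::uint64_t acc = 14695981039346656037ULL; // FNV-1a offset basis

public:
    void update(std::string_view data) override {
        for (unsigned char c : data) {
            acc ^= c;
            acc *= 1099511628211ULL; // FNV-1a prime
        }
    }

    std::uint64_t finish() override { return acc; }

    // Extra entry point that exists only on this subclass. In this
    // sketch `mapped` pretends to be an already-mapped file region and
    // is simply forwarded to update(); the hypothetical real version
    // would hand the file to the blake3 crate's mmap routines instead.
    void update_mmap(std::string_view mapped) { update(mapped); }
};
```

Callers that only hold a `std::unique_ptr<HashCtx>` see the common interface, while code that knows it has the BLAKE3 subclass can reach the extra method.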

The mmap method is used specifically when reading from files to access the highest performance backend in the BLAKE3 crate.

This also ties into another design issue: the chunking used by the Nix IO routines for hashing prevents the BLAKE3 routines from reaching full performance, because the chunk sizes are too small to take advantage of as much parallelism as possible.

To work around this, I refactored some of the code around the various readFile calls that operate on sinks so that control is inverted: instead of passing the sink into the function as an argument, readFile is defined as a method on Sink. HashSink can then override it to bypass the normal Sink chunking and let the blake3 crate handle the IO directly via the mmap routines.
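The inversion can be sketched roughly as follows. This is an illustration only: the real Nix `Sink` and `HashSink` have different members, `readFile` here takes the file contents as a string rather than a path, the chunk size is deliberately tiny, and a toy FNV-1a hash stands in for BLAKE3.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Illustrative Sink: readFile is a virtual method instead of a free
// function taking a Sink&. The default path feeds data in small chunks.
struct Sink {
    static constexpr std::size_t chunkSize = 4; // tiny on purpose, for the demo
    std::size_t calls = 0;                      // how many times operator() ran

    virtual ~Sink() = default;
    virtual void operator()(std::string_view data) = 0;

    // `contents` stands in for the bytes of a file on disk.
    virtual void readFile(std::string_view contents) {
        for (std::size_t off = 0; off < contents.size(); off += chunkSize)
            (*this)(contents.substr(off, chunkSize));
    }
};

// An ordinary sink keeps the default chunked behaviour.
struct StringSink : Sink {
    std::string s;
    void operator()(std::string_view data) override {
        ++calls;
        s += data;
    }
};

// The hashing sink overrides readFile to skip the chunking loop.
struct HashSink : Sink {
    std::uint64_t acc = 14695981039346656037ULL; // toy FNV-1a state

    void operator()(std::string_view data) override {
        ++calls;
        for (unsigned char c : data) {
            acc ^= c;
            acc *= 1099511628211ULL;
        }
    }

    // Fast path: hand the whole input to the hasher in one call.
    void readFile(std::string_view contents) override { (*this)(contents); }
};
```

With a 10-byte input, `StringSink::readFile` makes three chunked calls while `HashSink::readFile` makes one, and both hashing paths produce the same digest. The design choice is that the sink, not the caller, decides how file data reaches it.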

Due to the complexity of all the different IO calls, I did not fully replace all of the readFile variants this way (e.g., SourceAccessor, PosixSourceAccessor, LocalStoreAccessor), so some areas of the code that potentially deal with hashing will still not use this fast path without additional work.

There may be a better way to do this and I'd be happy to refactor the code if anyone has suggestions in that regard.

Benchmarks

Built with the following:
nix develop .#native-clangStdenv
configurePhase
buildPhase
checkPhase
installPhase

Config

CPU: AMD Ryzen 9 7950X 16-Core overclocked to 5.88 GHz
RAM: 96GB @ 6400 MT/s (tCL: 28)
OS: CachyOS February 2025 release w/ bpfland scx

Benchmarks all used the following:

hyperfine --warmup 3 './outputs/out/bin/nix hash file --type <algo> <file>'

100K file

BLAKE3 (C)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/100K.bin
  Time (mean ± σ):       9.5 ms ±   0.1 ms    [User: 5.6 ms, System: 3.6 ms]
  Range (min … max):     9.1 ms …  10.0 ms    299 runs

BLAKE3 (Rust)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/100K.bin
  Time (mean ± σ):       9.6 ms ±   0.9 ms    [User: 5.9 ms, System: 3.3 ms]
  Range (min … max):     9.2 ms …  24.7 ms    285 runs

SHA256

Benchmark 1: ./outputs/out/bin/nix hash file --type sha256 ~/100K.bin
  Time (mean ± σ):       9.6 ms ±   0.9 ms    [User: 5.7 ms, System: 3.5 ms]
  Range (min … max):     9.0 ms …  23.1 ms    290 runs

SHA512

Benchmark 1: ./outputs/out/bin/nix hash file --type sha512 ~/100K.bin
  Time (mean ± σ):       9.6 ms ±   0.9 ms    [User: 5.8 ms, System: 3.3 ms]
  Range (min … max):     9.0 ms …  23.3 ms    288 runs

10M file

BLAKE3 (C)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/10M.bin
  Time (mean ± σ):      11.9 ms ±   2.6 ms    [User: 6.7 ms, System: 4.5 ms]
  Range (min … max):    11.0 ms …  32.8 ms    240 runs

BLAKE3 (Rust)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/10M.bin
  Time (mean ± σ):      12.9 ms ±   1.1 ms    [User: 9.9 ms, System: 22.7 ms]
  Range (min … max):    11.3 ms …  16.6 ms    215 runs

SHA256

Benchmark 1: ./outputs/out/bin/nix hash file --type sha256 ~/10M.bin
  Time (mean ± σ):      13.9 ms ±   0.4 ms    [User: 9.4 ms, System: 4.1 ms]
  Range (min … max):    13.3 ms …  16.3 ms    201 runs

SHA512

Benchmark 1: ./outputs/out/bin/nix hash file --type sha512 ~/10M.bin
  Time (mean ± σ):      18.2 ms ±   0.5 ms    [User: 13.3 ms, System: 4.4 ms]
  Range (min … max):    17.4 ms …  21.6 ms    162 runs

100M file

BLAKE3 (C)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/100M.bin
  Time (mean ± σ):      26.2 ms ±   0.8 ms    [User: 17.0 ms, System: 8.7 ms]
  Range (min … max):    24.9 ms …  29.1 ms    111 runs

BLAKE3 (Rust)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/100M.bin
  Time (mean ± σ):      17.5 ms ±   1.6 ms    [User: 33.9 ms, System: 41.7 ms]
  Range (min … max):    15.2 ms …  23.8 ms    128 runs

SHA256

Benchmark 1: ./outputs/out/bin/nix hash file --type sha256 ~/100M.bin
  Time (mean ± σ):      54.1 ms ±   0.5 ms    [User: 44.0 ms, System: 9.5 ms]
  Range (min … max):    53.4 ms …  55.5 ms    55 runs

SHA512

Benchmark 1: ./outputs/out/bin/nix hash file --type sha512 ~/100M.bin
  Time (mean ± σ):      96.1 ms ±   0.9 ms    [User: 85.8 ms, System: 9.4 ms]
  Range (min … max):    95.1 ms …  98.5 ms    31 runs

300M file

BLAKE3 (C)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/300M.bin
  Time (mean ± σ):      59.2 ms ±   0.9 ms    [User: 37.7 ms, System: 20.8 ms]
  Range (min … max):    57.8 ms …  61.9 ms    49 runs

BLAKE3 (Rust)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/300M.bin
  Time (mean ± σ):      26.0 ms ±   1.6 ms    [User: 85.6 ms, System: 65.8 ms]
  Range (min … max):    22.8 ms …  31.1 ms    104 runs

SHA256

Benchmark 1: ./outputs/out/bin/nix hash file --type sha256 ~/300M.bin
  Time (mean ± σ):     139.6 ms ±   0.8 ms    [User: 116.4 ms, System: 22.5 ms]
  Range (min … max):   138.1 ms … 141.0 ms    21 runs

SHA512

Benchmark 1: ./outputs/out/bin/nix hash file --type sha512 ~/300M.bin
  Time (mean ± σ):     263.5 ms ±   3.0 ms    [User: 238.0 ms, System: 22.9 ms]
  Range (min … max):   260.4 ms … 269.5 ms    11 runs

1G file

BLAKE3 (C)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/1G.bin
  Time (mean ± σ):     190.9 ms ±   1.5 ms    [User: 113.6 ms, System: 76.1 ms]
  Range (min … max):   188.8 ms … 194.3 ms    15 runs

BLAKE3 (Rust)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/1G.bin
  Time (mean ± σ):      52.5 ms ±   5.0 ms    [User: 304.9 ms, System: 114.4 ms]
  Range (min … max):    50.0 ms …  88.9 ms    58 runs

SHA256

Benchmark 1: ./outputs/out/bin/nix hash file --type sha256 ~/1G.bin
  Time (mean ± σ):     465.0 ms ±   4.5 ms    [User: 384.8 ms, System: 77.4 ms]
  Range (min … max):   461.8 ms … 477.0 ms    10 runs

SHA512

Benchmark 1: ./outputs/out/bin/nix hash file --type sha512 ~/1G.bin
  Time (mean ± σ):     877.5 ms ±   8.9 ms    [User: 795.5 ms, System: 77.3 ms]
  Range (min … max):   870.8 ms … 900.8 ms    10 runs

20G file

BLAKE3 (C)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/20G.bin
  Time (mean ± σ):      3.155 s ±  0.009 s    [User: 2.236 s, System: 0.914 s]
  Range (min … max):    3.143 s …  3.168 s    10 runs

BLAKE3 (Rust)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/20G.bin
  Time (mean ± σ):     574.1 ms ±   9.8 ms    [User: 8339.7 ms, System: 1430.7 ms]
  Range (min … max):   563.6 ms … 596.4 ms    10 runs

SHA256

Benchmark 1: ./outputs/out/bin/nix hash file --type sha256 ~/20G.bin
  Time (mean ± σ):      8.756 s ±  0.011 s    [User: 7.812 s, System: 0.933 s]
  Range (min … max):    8.737 s …  8.767 s    10 runs

SHA512

Benchmark 1: ./outputs/out/bin/nix hash file --type sha512 ~/20G.bin
  Time (mean ± σ):     17.280 s ±  0.077 s    [User: 16.301 s, System: 0.954 s]
  Range (min … max):   17.220 s … 17.395 s    10 runs

64G file

BLAKE3 (C)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/64G.bin
  Time (mean ± σ):     17.145 s ±  0.086 s    [User: 7.154 s, System: 9.578 s]
  Range (min … max):   17.018 s … 17.276 s    10 runs

BLAKE3 (Rust)

Benchmark 1: ./outputs/out/bin/nix hash file --type blake3 ~/64G.bin
  Time (mean ± σ):      1.822 s ±  0.011 s    [User: 26.895 s, System: 4.875 s]
  Range (min … max):    1.802 s …  1.832 s    10 runs

SHA256

Benchmark 1: ./outputs/out/bin/nix hash file --type sha256 ~/64G.bin
  Time (mean ± σ):     27.455 s ±  0.066 s    [User: 24.323 s, System: 3.072 s]
  Range (min … max):   27.343 s … 27.554 s    10 runs

SHA512

Benchmark 1: ./outputs/out/bin/nix hash file --type sha512 ~/64G.bin
  Time (mean ± σ):     53.807 s ±  0.212 s    [User: 50.615 s, System: 3.118 s]
  Range (min … max):   53.446 s … 54.187 s    10 runs
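As a reading aid for the numbers above, the following sketch derives two ratios from the reported means: the wall-clock speedup of the Rust build over the C build, and (user + system) / wall as a rough estimate of how many cores were kept busy. The helper names are made up for this example, and the inputs are copied from the hyperfine output above.

```cpp
#include <cstdio>

// One row per file size: mean wall-clock seconds for BLAKE3 (C) and
// BLAKE3 (Rust), copied from the benchmark output above.
struct Row {
    const char *size;
    double cSeconds;
    double rustSeconds;
};

constexpr Row rows[] = {
    {"100K", 0.0095, 0.0096},
    {"10M", 0.0119, 0.0129},
    {"100M", 0.0262, 0.0175},
    {"300M", 0.0592, 0.0260},
    {"1G", 0.1909, 0.0525},
    {"20G", 3.155, 0.5741},
    {"64G", 17.145, 1.822},
};

// Wall-clock speedup of the Rust build relative to the C build.
constexpr double speedup(const Row &r) { return r.cSeconds / r.rustSeconds; }

// (user + system) / wall: a rough count of cores kept busy.
constexpr double coresBusy(double user, double sys, double wall) {
    return (user + sys) / wall;
}

// Print one speedup line per file size.
void printSpeedups() {
    for (const Row &r : rows)
        std::printf("%-5s %.2fx\n", r.size, speedup(r));
}
```

On these figures the two builds are at rough parity up to 10M, and the Rust build is about 9.4x faster at 64G. For the 64G run, `coresBusy(26.895, 4.875, 1.822)` comes out above 17, against just under 1 for the C build, which is consistent with the multi-threaded path being the source of the difference.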

github-actions bot added the new-cli (Relating to the "nix" command) and store (Issues and pull requests concerning the Nix store) labels Feb 4, 2025
edolstra (Member) commented Feb 4, 2025

Not necessarily opposed to using Rust for leaf code like hashing algorithms, but it's worth mentioning why we removed Rust the last time (759947b, #5987):

  • C++/Rust interop isn't very good, and we don't want to spend huge amounts of time writing tedious and inefficient FFI/marshalling code.
  • Writing parts of Nix in Rust means that only people who know C++ and Rust can contribute, which is a strictly smaller set of people than C++ developers.
  • It adds a big dependency to the Nix build chain. This is probably less of an issue now since we already depend on some stuff written in Rust (like mdbook).

Ericson2314 (Member) commented

Per #12379 (comment) let's

  1. Land the C one first experimental
  2. Wait for people to start using (as I hope they will!)
  3. Make it experimental
  4. Then consider this.

More usage -> easier to justify the overhead of going polyglot :)

nixos-discourse commented

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-02-12-nix-team-meeting-minutes-212-5/60216/1
