Symlink cargo build artifacts instead of copying them #322
base: master
Conversation
Have you seen #76?
I gave it a spin, but realized it can't help me as I'm running with a setting (for reasons explained in #76). I turned it off, but then discovered that after building dependencies Nix does a bunch of work I'm not sure about, so I can't tell how much it improves things.
It seems like the problem is not there in […], and again I'm seeing rebuilds of dependencies. So either your change is causing it too, or I messed up something somewhere.
I tried again (to double check), just […]
No, I must've overlooked #76, thanks for mentioning it.
Hmm, weird; maybe the host system is relevant? It doesn't do that for me. It does shrink the files, but that doesn't seem to matter and also happens on master. Can you provide more info? Maybe try one of the examples to see if it's working there for you (e.g. […]).
I'm using a fenix toolchain. I reported #323, maybe it will shed some light.
Hi @Philipp-M, thank you for the PR! When I originally explored symlinking all artifacts (instead of copying them deeply), cargo would fail with errors about trying to write to a read-only file system (i.e. the nix store), but I hadn't considered that this might not apply to all artifacts, so kudos for spotting that!
I don't use […] myself. Noting @dpc's comment on #323 (please correct me if I'm misunderstanding or misrepresenting anything): it sounds like these changes (as they stand currently) are leading to rebuilds. My first guess is that cargo may be getting confused by looking at artifacts whose timestamp is older than the sources (sources in the Nix store have a timestamp in 1970), and that this is resulting in rebuilding them. I wonder if the artifacts should be hardlinked (not symlinked) so that they get their own timestamp instead of inheriting the previous one...
if [ -d "${artifacts}/release/deps" ]; then | ||
mkdir -p "${target}/release/deps" | ||
for dep in $(ls "${artifacts}/release/deps"); do | ||
ln -fs "${artifacts}/release/deps/$dep" "${target}/release/deps/$dep" | ||
done | ||
fi | ||
|
||
if [ -d "${artifacts}/release/build" ]; then | ||
mkdir -p "${target}/release/build" | ||
for build in $(ls "${artifacts}/release/build"); do | ||
ln -fs "${artifacts}/release/build/$build" "${target}/release/build/$build" | ||
done | ||
fi | ||
} |
One thing worth noting is that this bit of logic assumes that only release artifacts are being built. Although this is the default build behavior we apply, it's entirely possible for a derivation to be configured to build any other profile (including custom ones). The derivation could also be building for different targets (though I believe this is captured by the wildcards). I think a more robust approach might be to find out exactly which files cargo will tolerate as symlinks/hardlinks and use that instead (e.g. *.rlib, *.rmeta or */build/*, etc.).
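A rough sketch of that pattern-based idea (not code from this PR; the variable names follow the hook snippet above, and the chosen patterns are only examples of files cargo might tolerate as links):

mkdir -p "${target}"
# Recreate the directory layout of the previous artifacts without linking the directories themselves
find "${artifacts}" -mindepth 1 -type d -printf '%P\n' | while read -r dir; do
  mkdir -p "${target}/${dir}"
done
# Symlink only the file types assumed to be safe to share read-only (e.g. *.rlib, *.rmeta)
find "${artifacts}" -type f \( -name '*.rlib' -o -name '*.rmeta' \) -printf '%P\n' | while read -r file; do
  ln -fs "${artifacts}/${file}" "${target}/${file}"
done

Because the paths are taken relative to the artifacts root rather than hard-coding release, this would cover any profile or target directory.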
Ohhh. I was 99% testing it with a debug build, as our crane-based derivations default to a debug-inherited profile. That would explain the rebuilds.
I think a more robust approach might be to find out exactly which files cargo will tolerate as symlinks/hardlinks and use that instead (e.g. *.rlib, *.rmeta or */build/*, etc.)
After a little bit of research, I think it's safer to just symlink the content-addressed artifacts (all files under build and deps). These are the artifacts that take up most (almost all) of the disk space. I had some weird issues when testing something like this (create directories recursively and then link every file from the nix store cargo target into the build target dir):
find "${preparedArtifacts}" -type d -printf '%P\n' | sed '/^$/d' | xargs -d '\n' -n 100 -P 10 mkdir -p
find "${preparedArtifacts}" -type f -printf '%P\n' | sed '/^$/d' | xargs -P 100 -d '\n' -I '##{}##' ln -fs "${preparedArtifacts}/##{}##" "##{}##"
Some .d files in the root of the build target could not be overwritten, but others were overwritten...
I have found a way to make the code a little more concise and performant (symlinking that many files can take some time; not much, but if it can be faster it should be faster :)), and now every build target and profile should be considered.
@dpc, can you try the latest commit?
In the spirit of making it faster, we could invoke mkdir -p only on leaf directories (directories with no child directories), since mkdir -p creates all missing parents anyway; that way we spawn fewer processes. On most filesystems a directory with no subdirectories has a link count of 2, which is what the -links 2 test below relies on:
find "${preparedArtifacts}" -type d -links 2 ! -empty -printf '%P\n' | sed '/^$/d' | xargs -d '\n' -n 100 -P 10 mkdir -p
@Philipp-M Everything is failing on c415996
@Philipp-M I pushed a small fix to resolve the build errors. Looks like a lot of the tests are passing, but some are still failing (perhaps we shouldn't link build script output directories, since scripts might need to be re-run and they expect to be able to write files? Alternatively perhaps we can take a shortcut by linking any […]).
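A rough sketch of the first alternative (copy the build-script output directories so re-run scripts can write into them, while still symlinking the content-addressed files under deps). This is not the PR's code; it assumes the usual <profile>/{deps,build} target layout and ignores nested cross-compilation target directories for brevity:

for profile_dir in "${artifacts}"/*/; do
  profile="$(basename "${profile_dir}")"
  # Symlink the content-addressed dependency artifacts for this profile
  if [ -d "${profile_dir}deps" ]; then
    mkdir -p "${target}/${profile}/deps"
    find "${profile_dir}deps" -maxdepth 1 -type f -printf '%f\n' | while read -r f; do
      ln -fs "${artifacts}/${profile}/deps/${f}" "${target}/${profile}/deps/${f}"
    done
  fi
  # Copy build-script directories instead of linking them, so re-run build scripts can write files
  if [ -d "${profile_dir}build" ]; then
    mkdir -p "${target}/${profile}"
    cp -r --no-preserve=mode,ownership "${profile_dir}build" "${target}/${profile}/"
  fi
done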
Thanks.
@dpc, can you provide more info on what exactly is going wrong (hardware/OS, the exact issue)?
Just noticed #344 (thanks for investigating further, btw.). I guess I can close this for now?
@Philipp-M It's up to you! I decided to pick the low-hanging fruit so we could make incremental updates. Feel free to explore the .so linking optimization in this PR or a fresh one, as you prefer.
I tried the current master with #334, and just so you're aware: in a larger project, using zstd is still way better than not using it, even with the current optimizations. In our project zstd produces 828M for deps and 1.5G for the workspace build; no-zstd with symlinks: 4.6G and 4.1G.
@dpc fwiw this PR's symlinking only happens when inheriting existing artifacts to perform a build, but it is entirely unrelated to how the resulting artifacts are organized in […].
Motivation
Avoid copying big cargo build artifacts from the nix store when using cargoArtifacts. This works by symlinking everything in target/release/build and target/release/deps, and by using rsync to copy the other files. AFAIK the big files are all contained in these folders and are content-addressed by hash, so this should be safe.
What I have noticed (even before this change) is that cargo tarpaulin generally seems to rebuild the cargo artifacts, ignoring previous artifacts. Even with --skip-clean this fails, as it tries to regenerate already built artifacts and to write into the build folders. This should probably be investigated, so that cargo tarpaulin reuses the previously built/cached dependencies.
For now I have just removed --skip-clean in cargo tarpaulin, as it doesn't seem to have the intended effect anyway. Without that flag, all tests run through.
Btw. I have noticed the (now removed) comment, but I have not experienced any issues other than the cargo tarpaulin one described above.
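A rough sketch of the symlink + rsync mechanism described above (illustrative only, not the exact hook code; ${cargoArtifacts} here simply stands for the previously built artifacts in the nix store):

# Link the content-addressed artifact directories
mkdir -p target/release/build target/release/deps
for d in build deps; do
  for f in "${cargoArtifacts}/target/release/${d}"/*; do
    ln -fs "${f}" "target/release/${d}/$(basename "${f}")"
  done
done
# Copy everything else (fingerprints, *.d files, etc.) and make the copies writable,
# since cargo may need to rewrite them and the nix store is read-only
rsync -a --chmod=u+w --exclude 'release/build/*' --exclude 'release/deps/*' \
  "${cargoArtifacts}/target/" target/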
Checklist
- updated docs/API.md (or general documentation) with changes
- updated CHANGELOG.md