Add PrecompileTools #18

Draft: wants to merge 3 commits into base: main

Conversation

vchuravy

No description provided.

@pxl-th
Member

pxl-th commented Jul 2, 2024

Temporarily removed AMDGPU to make it run on Julia 1.11 (it needs an update to LLVM 8 for GPUCompiler master).

You can run julia benchmark/pipeline.jl from the Nerf.jl directory to run the whole pipeline.

Without PrecompileTools:

Trainer benchmark
 17.861788 seconds (30.04 M allocations: 1.492 GiB, 2.76% gc time, 74.64% compilation time: <1% of which was recompilation)
 39.232965 seconds (4.11 M allocations: 124.968 MiB, 1.58% gc time)
Renderer benchmark
 13.524578 seconds (10.00 M allocations: 336.965 MiB, 3.08% gc time, 14.66% compilation time)
 51.170871 seconds (33.40 M allocations: 776.366 MiB, 3.85% gc time)

With PrecompileTools:

Trainer benchmark
  4.884491 seconds (1.24 M allocations: 112.379 MiB, 19.31% gc time, 7.50% compilation time)
 36.660421 seconds (4.10 M allocations: 124.746 MiB, 0.70% gc time)
Renderer benchmark
 11.767535 seconds (7.21 M allocations: 196.629 MiB, 1.99% gc time, 0.43% compilation time)
 52.402654 seconds (34.08 M allocations: 794.252 MiB, 2.86% gc time)
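
To put the trainer warm-up numbers in absolute terms, here is a small sketch that converts the compilation percentages reported by @time into seconds. The helper name `compile_seconds` is made up for this illustration; the inputs are copied from the output above.

```julia
# Seconds attributed to compilation, given a total time and the
# compilation percentage reported by @time.
compile_seconds(total, pct) = total * pct / 100

# Trainer warm-up runs, numbers copied from the @time output above.
before = (total = 17.861788, pct = 74.64)  # without PrecompileTools
after  = (total = 4.884491,  pct = 7.50)   # with PrecompileTools

for (label, r) in (("without", before), ("with", after))
    c = compile_seconds(r.total, r.pct)
    println(label, ": compile ≈ ", round(c; digits = 2),
            " s, rest ≈ ", round(r.total - c; digits = 2), " s")
end
```

This works out to roughly 13.3 s of compilation without PrecompileTools versus about 0.4 s with it, while the non-compilation remainder stays near 4.5 s in both runs.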

@vchuravy
Author

vchuravy commented Jul 3, 2024

I wonder if enabling the disk cache would do something, but it is odd that the trainer benchmark still spends 70% in compilation.

@pxl-th
Member

pxl-th commented Jul 3, 2024

Oh, indeed :) Now it is faster (I updated the post above). The benchmark consists of:

@time trainer_benchmark(trainer, 10)
@time trainer_benchmark(trainer, 1000)

The first line is mainly there to precompile all kernels, so its time is the one that went down, from ~18 seconds to ~5 seconds.
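
For readers unfamiliar with PrecompileTools, a minimal sketch of the kind of workload block that moves this compilation cost into package precompilation could look like the following. The module `ToyNerf` and the function `train_step` are hypothetical stand-ins, not the code in this PR; only `@setup_workload` and `@compile_workload` are the package's actual macros.

```julia
module ToyNerf  # hypothetical module, not Nerf.jl itself

using PrecompileTools: @setup_workload, @compile_workload

# Hypothetical stand-in for one training step.
train_step(x::Vector{Float32}) = sum(abs2, x)

@setup_workload begin
    # Setup code prepares inputs for the workload; it is not itself
    # the target of precompilation.
    x = ones(Float32, 8)
    @compile_workload begin
        # Calls made here are compiled during package precompilation
        # and stored in the package cache.
        train_step(x)
    end
end

end # module
```

As far as I know, the workload only runs while the enclosing package is being precompiled; when the file is simply included as a script, the block is skipped and `train_step` compiles on first call as usual.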
