Make initialization in OnceNonZeroUsize::get_or_try_init #[cold]. #273

Merged: 1 commit, Feb 6, 2025

briansmith
Contributor

In typical use, the `Some` branch is taken extremely frequently, while the `None` branch is taken only once (or a very small number of times) at startup. If `get_or_try_init` is in a performance-sensitive segment of code, it is important that `get_or_try_init` be inlined and that the compiler understands that the `Some` branch is (much) more likely than the `None` branch. When this happens, the call site becomes a load followed by a conditional jump that is almost never taken, which is ideal.

When `get_or_try_init` is used in many places in the user's code, it is important to avoid inlining any of the `None` branch into the call sites.

Unfortunately, the Rust compiler does not always recognize that code which unconditionally calls a `#[cold]` function must itself be cold, so marking our `f` as `#[cold] #[inline(never)]` is sometimes not enough.

Move the entire body of the `None` branch into a function that is marked `#[cold]`, because some post-inlining optimization passes in the compiler seem not to understand `#[cold]`, and because we don't want any part of that branch to be in the calling code.
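
For readers who want to see the shape of the change, here is a minimal sketch of the pattern described above. It is not the actual diff: `OnceNonZero`, the private `init` helper, and the choice of memory orderings are assumptions made for illustration, using a plain `AtomicUsize` where `0` means "not yet initialized".

```rust
use std::num::NonZeroUsize;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical, simplified stand-in for a once-initialized non-zero cell.
/// A stored value of 0 means "empty".
pub struct OnceNonZero {
    inner: AtomicUsize,
}

impl OnceNonZero {
    pub const fn new() -> Self {
        Self { inner: AtomicUsize::new(0) }
    }

    /// Hot path helper: a single atomic load.
    #[inline]
    pub fn get(&self) -> Option<NonZeroUsize> {
        NonZeroUsize::new(self.inner.load(Ordering::Acquire))
    }

    /// Intended to be inlined: the call site is a load plus one branch.
    #[inline]
    pub fn get_or_try_init<F, E>(&self, f: F) -> Result<NonZeroUsize, E>
    where
        F: FnOnce() -> Result<NonZeroUsize, E>,
    {
        match self.get() {
            Some(value) => Ok(value),
            // The whole None branch lives in one outlined, cold function,
            // so none of it should end up in the caller.
            None => self.init(f),
        }
    }

    /// Slow path (assumed name): runs `f` and publishes the result.
    #[cold]
    #[inline(never)]
    fn init<F, E>(&self, f: F) -> Result<NonZeroUsize, E>
    where
        F: FnOnce() -> Result<NonZeroUsize, E>,
    {
        let candidate = f()?;
        match self.inner.compare_exchange(
            0,
            candidate.get(),
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            Ok(_) => Ok(candidate),
            // Another thread initialized the cell first; return its value.
            // A failed exchange against 0 always observes a non-zero value.
            Err(existing) => Ok(NonZeroUsize::new(existing).expect("cell holds non-zero once set")),
        }
    }
}
```

In this sketch the attributes sit on the function that contains the entire slow path rather than only on the user-supplied `f`, which is the distinction the description draws. Whether a given compiler version actually keeps the call site down to a load and a branch is best confirmed by inspecting the generated assembly.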

@matklad matklad merged commit d119eea into matklad:master Feb 6, 2025
1 check passed
@matklad
Owner

matklad commented Feb 6, 2025

Thanks!
