GH-116017: Get rid of _COLD_EXIT
#120960
Maybe a question for @mdboom instead of you: why are the figures on the time plot different from the figures given as the geometric mean? E.g. darwin arm64 says "Geometric mean: 1.00x faster" but the plot shows 1.027x, or roughly 3%.
I think the number in the README overview of the results is the "HPT" calculation, which (I believe) weights by "noise". You can see that those particular results were pulled upwards by the startup benchmarks, which have a higher standard deviation than most of the others. But yeah, @mdboom can probably explain better.
The plot takes the mean across all of the distributions of all of the benchmarks, so it takes into account that the distributions of the benchmarks aren't necessarily normal. The geometric mean value (from pyperf) is the nth root of the product of the means of all n benchmarks.
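To make the difference between the two summaries concrete, here is a minimal sketch. It is not the code used by pyperf or by the plotting script. The benchmark names and sample values are invented for illustration, and the "pooled mean" is only an assumed reading of "the mean across all of the distributions".

```python
import math
import statistics

def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical speedup samples per benchmark (illustrative only, not real data).
# The "startup" entry has more samples and a wider spread than the others.
samples = {
    "bench_a": [1.0, 1.0],
    "bench_b": [1.0, 1.0],
    "startup": [1.1, 1.3, 1.2, 1.2],
}

# Summary 1: geometric mean of the per-benchmark means.
per_benchmark_means = [statistics.fmean(v) for v in samples.values()]
geo = geometric_mean(per_benchmark_means)

# Summary 2 (assumed): plain mean over every sample from every benchmark.
pooled = statistics.fmean(x for v in samples.values() for x in v)

print(f"geometric mean of means: {geo:.3f}")
print(f"mean of pooled samples:  {pooled:.3f}")
```

With these made-up inputs the pooled mean comes out higher than the geometric mean, because the one benchmark with a larger speedup also contributes more samples. That shows one way the two figures can disagree. It does not establish the cause of the discrepancy asked about above.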
One minor nit, otherwise it looks good.
The cold exits use quite a bit of memory on JIT builds, and complicate memory allocation improvements (since they are the only immortal, shared-by-all-interpreters, unlinked, un-invalidatable executors). They also introduce a very subtle back-and-forth dance between executors when warming up side exits, and represent a significant proportion of the uops executed.
This:
_EXIT_TRACE instruction (previously this logic was shared between the exit_to_trace label (only used by _EXIT_TRACE) and _COLD_EXIT
... _EXIT_TRACE to use an oparg instead of an exit_index
...
Separately, I added some missing stats to _DYNAMIC_EXIT.
This results in a 34.7% reduction in traces executed, and a 1.6% reduction in uops executed (both corresponding to 3.7 billion fewer _COLD_EXITS). Per platform:
aarch64-apple-darwin: 3% faster, 9% less memory
aarch64-unknown-linux-gnu: 9% faster, 2% less memory
i686-pc-windows-msvc: 3% faster
x86_64-pc-windows-msvc: 3% faster
x86_64-unknown-linux-gnu: 1% faster, 2% less memory