-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x64: reduce work in non-stack using leaf functions #5352
base: main
Are you sure you want to change the base?
Conversation
Here is the raw output of the benchmarking I did on my current system. It is hard to interpret the results (is it too noisy? not enough runs?) but here they are! with the default
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to enable this are pleasantly simple (I guess because we already have the generic infra for it) -- thanks! Seeing 1.8k lines of prologues/epilogues disappear in the tests is nice.
Happy to r+ once the test failures are resolved (unfortunately the failure on CI does not seem to be very informative; just a segfault in filetests)...
cf67ec5
to
1cc2b42
Compare
Cranelift has had the ability for some time to identify leaf functions; by Cranelift's definition, a leaf function is one that knows of no other call signatures. bytecodealliance#1148 noted how it would be a good idea to avoid extra frame setup work in leaf functions and bytecodealliance#2960 implemented this for aarch64 and s390x. This improvement was not made for x64 due to some test failures. This change avoids any frame setup for non-stack-using leaf functions in x64.
1cc2b42
to
6e15742
Compare
6e15742
to
d9945d4
Compare
|| num_clobbered_callee_saves > 0 | ||
|| frame_storage_size > 0) | ||
// TODO | ||
&& !cfg!(target_os = "macos") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe my suggestion at #4469 (comment) would work? "non-leaf" could be used by default and "always" could be used on macOS.
Cranelift has had the ability for some time to identify leaf functions;
by Cranelift's definition, a leaf function is one that knows of no other
call signatures. #1148 noted how it would be a good idea to avoid extra
frame setup work in leaf functions and #2960 implemented this for
aarch64 and s390x. This improvement was not made for x64 due to some
test failures. This change avoids any frame setup for non-stack-using
leaf functions in x64.
Because this updates the generated code, there are multiple sets of tests
to update -- each of these is separated here into its own commit. I will follow
up in the comments with benchmark results.