JIT: Extend escape analysis to account for arrays with non-gcref elements #104906
For arrays (and also perhaps boxes and ref classes) we ought to have some kind of size limit... possibly similar to the one we use for stackallocs.
We need to be careful we don't allocate a lot of stack for an object that might not be heavily used, as we'll pay per-call prolog zeroing costs.
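A size gate along these lines could be sketched as follows. This is only an illustration of the idea, not the JIT's actual check: the names `StackArraySize`, `kArrayHeaderBytes`, and `kMaxStackArrayBytes` and both constant values are assumptions made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical values for illustration only; not the JIT's real limits.
constexpr std::size_t kArrayHeaderBytes   = 16;  // assumed: method table pointer + length on 64-bit
constexpr std::size_t kMaxStackArrayBytes = 512; // assumed budget per stack-allocated array

// Returns the total byte size if an array with a known constant length is
// small enough to stack allocate, or std::nullopt if it should stay on the heap.
std::optional<std::size_t> StackArraySize(std::int64_t length, std::size_t elemSize)
{
    if (length < 0 || elemSize == 0)
    {
        return std::nullopt;
    }

    const std::size_t payloadLimit = kMaxStackArrayBytes - kArrayHeaderBytes;

    // Divide rather than multiply so a huge length cannot overflow.
    if (static_cast<std::uint64_t>(length) > payloadLimit / elemSize)
    {
        return std::nullopt;
    }

    return kArrayHeaderBytes + static_cast<std::size_t>(length) * elemSize;
}
```

Rejecting anything over the budget keeps the frame, and so the per-call prolog zeroing, bounded no matter how rarely the array is actually used.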
This seems simple enough: the allocation is already a helper call, so we can mostly just leave it as is. However, we likely also don't want to propagate the local address, so that dataflow on the array leads back to the helper, which means we lose the connection between the newarr temp and the local. We can perhaps add that as some kind of extra payload on the helper. I also suspect what is in this PR doesn't always work right for R2R/NAOT. It looks like we embed the compile-time handle into the array's method table slot, but it's unlikely we ever look at this slot in cases where the array doesn't escape, so I'm not sure there's a test case that can be made to fail here.
If we defer "lowering" of these helpers until lower, we may miss out on forming address modes from the local directly... we'll have something like this during optimization
and only later expose the assignment as a store to a local address:
so we may end up needing an extra register.
Leave the newarr helper call in place, and don't rewrite the uses to be uses of the local. Remove special handling in local morph and morph. Lower the helper to the proper stores later on. Update a few utilities to understand array base addresses may not be TYP_REF.
I think this approach is looking pretty good. Draft PR here: #111284. No changes now in morph or local morph or assertion prop. May not have the R2R embedding right yet either, but it may not actually matter until we try to support arrays with GC types.

We aren't as capable of optimizing the results as I'd like, but improving on that raises deeper questions about our aliasing model, and so should likely be deferred. For example, liveness will mark all STOREINDs as modifying the GC heap, but now we have some that won't. Detecting that requires annotating the store with something like GTF_IND_TGT_NOT_HEAP, but I don't want to rely on that in liveness since it can also refer to pinned heap or unmanaged non-stack memory. So we probably want a different attribute like GTF_IND_TGT_STACK, and a separation of memory SSA tracking into heap, stack (byref), and unmanaged?

For now I have plumbed through some logic in VN to recognize that these new local arrays can be treated via liberal VNs. This enables constant prop through known index elements as long as there is no ambiguous heap store in between. But without VN we don't have enough breadcrumbs to model these more aggressively.

I still need to look more closely at diffs between this and the above. SPMI doesn't show many, but has a few hundred missed contexts per collection, and I suspect behavior may diverge there.
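As a rough illustration of the constant-prop behavior described here, the toy model below tracks constants stored to known indices of a single non-escaping local array and forgets them when an ambiguous store intervenes. It is not JIT code and does not use value numbers; `LocalArrayFacts` and its methods are hypothetical names invented for this sketch.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Toy model, not JIT code: remembers constants stored at known indices
// of one local array.
class LocalArrayFacts
{
public:
    // A store of a constant to a known index records the value.
    void StoreConstant(int index, std::int32_t value)
    {
        m_known[index] = value;
    }

    // An ambiguous store might alias any element, so drop everything we know.
    void AmbiguousStore()
    {
        m_known.clear();
    }

    // A load from a known index yields the constant if it is still known.
    std::optional<std::int32_t> Load(int index) const
    {
        auto it = m_known.find(index);
        if (it == m_known.end())
        {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::unordered_map<int, std::int32_t> m_known;
};
```

In this model, a store of 3 to index 0 followed directly by a load of index 0 yields 3, while the same load after `AmbiguousStore()` yields nothing and would have to read memory.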
Merged in the changes from #111284.
@dotnet/jit-contrib PTAL. SPMI will be fairly accurate here... kicked off another collection, which should have even fewer misses. Code size increases, but mostly from clr tests, where small fixed-size arrays are unusually frequent.
a[0] = 3;
f();
= a[0];
Seeing AVs in osx crossgen2, oddly in both base and diff jits:
Will try to look at this locally.
Failing here:
Looks like this map should always be present since morph always calls Tolerating this in #111555.
Merged the main branch to include #111555.
Try a late dead store removal
@AndyAyersMS I just experimented a bit with late dead store removal, unexposing locals in liveness and repeating to convergence: commit. Before adding late dead store removal: diffs. Relative diffs:
Diffs look interesting, but the TP impact is too high, so I'm reverting the experiment commit.
Positive case:
Codegen:
Negative case:
Codegen:
Benchmark on Mandelbrot:
Diff: https://www.diffchecker.com/bNP4qHdF/