Non-POD arguments passed incorrectly on ARM64 #106471
Comments
It's caused by an ABI difference on ARM64 (see https://learn.microsoft.com/cpp/build/arm64-windows-abi-conventions for details). In short, the default constructor in the C++ code means the struct is no longer considered an HFA on ARM64.
The argument-passing schemes for HFAs and regular structs are different on ARM64. On x64 there's no such concept and all structs are passed in the same way. I'm not sure whether .NET supports passing HFAs as a single argument at all.
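As a rough illustration of the point above, the sketch below compares two structs with identical layout, one with and one without a user-provided default constructor. The names `Vector2Pod`, `Vector2Ctor`, and `has_only_trivial_special_members` are hypothetical, and the type-trait check is only an approximation of the "no non-trivial constructors, destructors, or assignment operators" condition in the linked ABI document. It does not query the compiler's actual HFA classification.

```cpp
#include <type_traits>

// Hypothetical stand-in: a plain aggregate of two floats with no
// user-provided special member functions.
struct Vector2Pod {
    float x;
    float y;
};

// Hypothetical stand-in: same fields and layout, but with a
// user-provided default constructor.
struct Vector2Ctor {
    float x;
    float y;
    Vector2Ctor() : x(0.0f), y(0.0f) {}
};

// Assumption: approximates the ABI document's triviality condition with
// standard type traits. This is not the compiler's own classification.
template <typename T>
constexpr bool has_only_trivial_special_members() {
    return std::is_trivially_default_constructible<T>::value &&
           std::is_trivially_copy_constructible<T>::value &&
           std::is_trivially_copy_assignable<T>::value &&
           std::is_trivially_destructible<T>::value;
}

// Both structs occupy the same bytes, yet only one passes the check.
static_assert(sizeof(Vector2Pod) == sizeof(Vector2Ctor),
              "both structs have the same size");
static_assert(has_only_trivial_special_members<Vector2Pod>(),
              "the plain aggregate has only trivial special members");
static_assert(!has_only_trivial_special_members<Vector2Ctor>(),
              "the user-provided default constructor is non-trivial");
```

Because the layouts match, nothing on the managed side distinguishes the two types. The difference exists only in the C++ type definition, which is why a P/Invoke signature written against the C ABI can disagree with the native callee.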
Thank you for the explanation. As a workaround it was possible in my case to change the default constructor to |
…get passed correctly by interop layer in arm64 dotnet/runtime#106471
This commit also rebuilds the x64 binaries to ensure both use the same source code after applying the workaround from dotnet/runtime#106471
.NET interop targets the C ABI. The calling convention details for non-POD (Plain Old Data) C++ types are often different from those of plain C structs, as you have discovered. This has come up a few times before (e.g. in #12312). We should mention this in the docs.
https://github.com/josetr/pinvoke-arm64-issue Can you take a look at it? It does seem to work even with constructors / destructors / virtual methods, as long as the structure is bigger than 16 bytes. Use case: we were trying to add ARM support for https://github.com/mono/CppSharp
This is not unexpected. The exact set of cases where a non-POD C++ type happens to be passed the same way as a POD C++ type varies between architectures.
We have interop with C++ mentioned in the docs now: https://github.com/dotnet/docs/blob/main/docs/standard/native-interop/abi-support.md#c-1 . cc @AaronRobinsonMSFT EDIT Official doc - https://learn.microsoft.com/dotnet/standard/native-interop/abi-support#c-1 |
Ever since the .NET Framework days it has been possible to achieve actual C++ interop by generating C# P/Invoke code that follows C++ ABI semantics, "lowering" it to C-compatible P/Invoke signatures. This is the approach we use in CppSharp, and it means we don't need to generate C wrappers that are generally unnecessary. ARM64 is the first platform where this is not actually possible. I think this is easier to explain with a simple example:
struct NonPod16 { ... };
NonPod16 CreateNonPod(intptr_t b)
This ends up as (in LLVM IR format for easier understanding):
define void @_Z12CreateNonPodl(ptr noalias sret(%struct.NonPod16) align 8 %0, i64 noundef %1) #1 {
So the return-by-value gets transformed into what LLVM calls
static unsafe extern void CreateNonPod(ref S16 ret, long b);
This has worked great for a long time. Now ARM64 is what is causing issues for us, failing with the case above.
This makes sense: it means we cannot rely on this trick. On ARM64, struct returns follow a separate calling convention, which is not the case on other platforms, and .NET itself has to recognize that this is a special return parameter in order to generate valid ABI code. If we move the return back to being a proper return value in the P/Invoke signature, things start working:
static unsafe extern S16 CreateNonPod(long b);
[StructLayout(LayoutKind.Explicit, Size = 16)]
public partial struct S16 { ... }
The only issue is that there is yet another special case: the ABI distinguishes between structs that are bigger than 16 bytes and those that are not when deciding whether a struct is passed in registers or on the stack. The .NET runtime already implements the proper ABI behaviour for both cases; we just have no way to signal to P/Invoke which case it should use. We don't want or need the .NET runtime to figure out from the struct layout which case applies, as that would mean encoding C++ non-POD logic in the runtime. We already know what we need .NET to do; we just need a boolean attribute so we can tell P/Invoke what to do. Currently we can force .NET to do the right thing with:
[StructLayout(LayoutKind.Explicit, Size = 17)]
But this has other potential implications and is not really correct. What we need is for P/Invoke to recognize maybe a
Or maybe:
static unsafe extern void CreateNonPod([SRet] ref S16 ret, long b);
Could something like this be considered to allow direct ARM64 C++ interop, @jkotas @AaronRobinsonMSFT?
I think you are lucky that you have not run into C++ calling convention issues on other architectures. For example, I would expect C++ interop on win-x86 to run into #100033, which the runtime has historically needed special handling for to make managed C++ work.
I do not think we would want to introduce one-off workarounds like this. If we were to do something for C++, we would want to look into what it would take to support C++ calling conventions holistically in the runtime. |
I don't know the full details of that particular one, but we do invoke copy constructors ourselves to handle that class of issues.
That really complicates the runtime though, it means basically keeping a significant part of the backend portion of a C++ compiler to replicate all of the ABI logic. Clang already implements all that logic, we can just take advantage of it as CppSharp does. What would be the benefits of encoding C++ ABI interop into the runtime when the same can be achieved without introducing additional complicated machinery with source generators (like COM wrapper generator)? Really all we need to achieve this is a relatively simple way to signal P/Invoke what code path to take. |
I do not think you can invoke copy constructors in this case with full fidelity.
It depends on the design. |
More of a nit, and getting a bit pedantic about implementation details, but this is misleading. The interop system emits IL and then passes it to the JIT. The statement above should be about informing the JIT how to pass the argument. This approach is possible, but it would likely create a fragile system, as the JIT team would need to consider all interactions in the places where this information needs to be respected, potentially impacting correct use of the target ABI. It would also bring up other questions: is it respected on all CPU/OS combinations? How do users know when to use it or not? This would make for a very complex feature to use and to document expectations for.
This is conceptually simpler than one would think. It would mean the JIT can know it is in mode X or Y, rather than handling one-off cases that require more consideration due to potential conflicts with the current target C ABI.
My guess here is this would actually need to be even more specific and require telling the JIT precisely which register to use. |
https://godbolt.org/z/rE3Mqfnrj I think this example shows our problem pretty clearly. Because the struct is small enough, and because .NET only cares about the C ABI, .NET still thinks it can call the function in the optimized "direct" mode, where the struct is returned directly in the x0 / x1 registers. However, this doesn't always apply in C++. The direct-mode optimization is not used when the returned struct has a destructor, copy constructor, vtable, etc. The compiler instead falls back to indirect mode, which is what is always used when the struct is bigger than 16 bytes. We just need a way to tell .NET to force indirect mode, because we know better, and because we are using LLVM to get this accurate information. I think it's fair to ask for an option that allows us to achieve this, even if it's behind
runtime/src/coreclr/jit/targetarm64.cpp Line 107 in 35d2780
So we need something like
or some c++ support
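The direct versus indirect distinction described above can be loosely approximated with type traits. This is a sketch under stated assumptions: the struct names and the helpers `trivial_for_calls` and `returned_in_memory_guess` are hypothetical, and the rule is a simplification of the Itanium C++ ABI notion of a type that is "non-trivial for the purposes of calls" combined with the 16-byte limit mentioned in the thread. It deliberately ignores HFA/HVA types, deleted constructors, and any non-Itanium ABI, so it is not a substitute for asking Clang.

```cpp
#include <cstdint>
#include <type_traits>

// Assumption: simplified stand-in for "trivial for the purposes of calls".
// The real ABI rule has more cases (for example deleted constructors).
template <typename T>
constexpr bool trivial_for_calls() {
    return std::is_trivially_copy_constructible<T>::value &&
           std::is_trivially_move_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

// Hypothetical helper: guesses whether a by-value return goes through a
// caller-provided buffer (indirect mode) instead of registers.
// Ignores HFA/HVA types, which may stay in registers even above 16 bytes.
template <typename T>
constexpr bool returned_in_memory_guess() {
    return !trivial_for_calls<T>() || sizeof(T) > 16;
}

// Hypothetical example types.
struct SmallPod {            // 16 bytes, trivial: direct mode expected
    std::int64_t a;
    std::int64_t b;
};

struct SmallWithDtor {       // 16 bytes, user-provided destructor
    std::int64_t a;
    std::int64_t b;
    ~SmallWithDtor() {}
};

struct LargePod {            // 24 bytes, trivial but above the size limit
    std::int64_t a;
    std::int64_t b;
    std::int64_t c;
};

static_assert(!returned_in_memory_guess<SmallPod>(),
              "small trivial struct is expected to use registers");
static_assert(returned_in_memory_guess<SmallWithDtor>(),
              "a non-trivial destructor is expected to force indirect mode");
static_assert(returned_in_memory_guess<LargePod>(),
              "a struct above 16 bytes is expected to use indirect mode");
```

`SmallPod` and `SmallWithDtor` have the same size and fields, so a managed struct mirroring either one looks identical to the runtime. Only the C++ side knows that the second one takes the indirect path.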
This isn't strictly true. It depends on the C++ ABI, which differs between Windows and Unix and sometimes per architecture as well (x64 vs Arm64 vs ...). Unix typically follows https://itanium-cxx-abi.github.io/cxx-abi for example.
Likewise, it isn't true that any struct bigger than 16 bytes is passed by reference. That depends on the field types, sizes, and multiple other factors. A struct containing 4 float fields is classified as an HFA (homogeneous floating-point aggregate) and may still be passed in registers. A similar rule exists for HVAs (homogeneous vector aggregates).
Interop with C++ is also typically not recommended because the C++ ABI is "unstable". It changes with new language features, with subtle changes to your type definitions, based on "private implementation details", and with many other factors. It's an incredibly complex area, and the best practice (for interop with any language) is to use a C wrapper so you have a definitively stable ABI instead.
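A minimal sketch of the C wrapper approach mentioned above, assuming a hypothetical C++ type `Counter` with a user-provided destructor and a hypothetical factory `MakeCounter`. The wrapper `MakeCounter_c` and the mirror struct `CounterData` are also invented for illustration. The idea is that the exported function only uses types and parameter passing that the C ABI defines, so the by-value return of a non-trivial C++ type never crosses the interop boundary.

```cpp
#include <cstdint>

// Hypothetical C++ type: the user-provided destructor makes it
// non-trivial, so its by-value return follows C++ ABI rules.
struct Counter {
    std::int64_t value;
    std::int64_t step;
    Counter(std::int64_t v, std::int64_t s) : value(v), step(s) {}
    ~Counter() {}
};

// Hypothetical C++ API that returns the non-trivial type by value.
Counter MakeCounter(std::int64_t start) {
    return Counter(start, 1);
}

extern "C" {

// Plain C mirror of the data, safe to describe to a C ABI consumer.
struct CounterData {
    std::int64_t value;
    std::int64_t step;
};

// C wrapper: the result is written through an explicit out pointer,
// which is an ordinary pointer argument under the C ABI.
void MakeCounter_c(std::int64_t start, CounterData* out) {
    Counter c = MakeCounter(start);
    out->value = c.value;
    out->step = c.step;
}

}  // extern "C"
```

On the managed side, a declaration along the lines of `static extern void MakeCounter_c(long start, out CounterData data);` would then match on every architecture, at the price of an extra copy and an extra native build step.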
I was only talking about my godbolt example, which targets Clang ARM64. We already handle the differences you've mentioned by using LLVM. It correctly told us what we needed to do for Mac ARM64, but we have no way of passing that accurate information on to .NET so that it can do the right thing. Adding something that is similar to
We also already know about the C wrapper solution, but that obviously has a performance cost and requires a build step.
If you're already willing to hardcode the fact that it needs to be passed by reference, you can just encode that in your P/Invoke signature (i.e.
Anything more accurate would require accepting and supporting definitions for all of the relevant C++ ABI metadata, which is likely a non-starter.
We're quite aware of this but it's all irrelevant since Clang already handles all of those details for us.
This is commonly said, but we have proved with CppSharp that the C++ ABI is actually very stable. We have not had a single instance of the C++ ABI changing in over a decade. In fact, the only change was in the standard library,
And even if it changed, it's a non-issue, since all the bindings are generated and we get accurate information about the ABI details (including vtable layouts) from LLVM and Clang. A simple regeneration and we're back in business.
This is what we use on all non-ARM64 platforms, but as I explained in my post above, it does not work on ARM64 due to a special ABI for indirect struct returns (X0 vs X8 register). As @josetr said, what we need is in essence a single-line change here:
runtime/src/coreclr/jit/targetarm64.cpp Line 107 in 35d2780
We need to be able to signal to P/Invoke that it should take the indirect code path. Right now .NET has a very good direct C++ interop story with CppSharp, which is used a lot by the game dev community. All we ask is that some consideration be given here to the needs of C++ interop in P/Invoke, to keep this good support going on ARM64.
This needs to be clarified to note that it has proven stable in your particular scenario. The broader ecosystem for the language as a whole is known to have issues, and it is known where those come into play. In the case of the STL, all three major C++ runtime implementations have a set of known pending ABI breaks that they need to take at some point in the future. In some cases these are necessary to support newer language versions or to ensure standards compliance. All three major implementations do try to keep breaks to a minimum and keep things semi-stable, and any major C++ library would ideally do the same, but that is a distinct consideration.
This isn't a single-line change and isn't as simple as it's portrayed. Fixing this one particular instance could take a smallish change (but still a couple hundred lines in total across the VM, JIT, and BCL) to ensure it worked. However, that also risks opening the floodgates to other similar changes and requests. In the most correct case, it involves adding a much broader set of attributes/annotations that allow users to specify which C++ concepts exist in a type, including (but not limited to) inheritance, virtual members, constructors, destructors, defaulted vs non-defaulted members, and a few other things that are known to impact the ABI depending on whether or not they're defined. There's an entire range of possible solutions, but none of them are really "trivial", and they leave open questions as to the broader needs, designs, concerns, and ecosystem impact.
Notably, @AaronRobinsonMSFT and @jkoritzinsky are likely the right ones to give official consideration as to the right direction (if any) here.
Description
While trying to run ImGui.NET on win-arm64 I ran into some sort of argument passing issue from managed to native code when calling https://github.com/cimgui/cimgui/blob/35a4e8f8932c6395156ffacee288b9c30e50cb63/cimgui.cpp#L207 via the managed wrapper https://github.com/ImGuiNET/ImGui.NET/blob/70a87022f775025b90dbe2194e44983c79de0911/src/ImGui.NET/Generated/ImGui.gen.cs#L21374
I managed to reproduce it with the following stripped-down version. Note that this code runs fine on win-x64. It also runs fine on win-arm64 when the default constructor is removed from the Vector2 struct. This makes me wonder whether it's even a dotnet issue, so feel free to point me to the appropriate channels if it isn't.
Reproduction Steps
C#
C++
Attached is a little VS solution containing the above code
PInvoke.zip
Expected behavior
MyFunction should receive 0 as value.
Actual behavior
MyFunction receives some random number as value.
Regression?
No response
Known Workarounds
No response
Configuration
.net8, Windows 11, ARM64
Other information
I've also tried building the project with clang but got the same result.