Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-115382: Fix cross compiles when host and target use same SOABI #116294

Merged
merged 3 commits into from
Oct 16, 2024

Conversation

vfazio
Copy link
Contributor

@vfazio vfazio commented Mar 4, 2024

Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules.

For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continued searching until it found the host's native modules.

However, if the host interpreter and the target python were on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path.

Despite the "match", the target build may have been linked against a different libc or may include unsupported instructions so loading or executing the target's external modules can lead to crashes.

Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules.

One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration.

This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data.

@vfazio
Copy link
Contributor Author

vfazio commented Mar 19, 2024

Note, I did not convert some of the other places that seem to replicate the PYTHONPATH pattern, specifically the wasm paths:

"--env PYTHONPATH=/{relbuilddir}/build/lib.wasi-wasm32-{version}:/Lib"

export HOSTRUNNER="wasmtime run --mapdir /::$(pwd) --env PYTHONPATH=/builddir/wasi/build/lib.wasi-wasm32-$PYTHON_VERSION $(pwd)/builddir/wasi/python.wasm --"

sysconfig_data = f"{wasi_build_dir}/build/lib.wasi-wasm32-{python_version}"

[WASI/*], [HOSTRUNNER='wasmtime run --wasm max-wasm-stack=8388608 --wasi preview2 --env PYTHONPATH=/$(shell realpath --relative-to $(abs_srcdir) $(abs_builddir))/$(shell cat pybuilddir.txt):/Lib --dir $(srcdir)::/'],

I expect those should still work as is, though they could certainly change to using _PYTHON_SYSCONFIGDATA_PATH if their intent is purely to import sysconfig and not any other modules that may be in that directory.

Previously, when a build was configured to use a host interpreter via
--with-build-python, the PYTHON_FOR_BUILD config value included a path
in PYTHONPATH that pointed to the target's built external modules.

For "normal" foreign architecture cross compiles, when loading compiled
external libraries, the target libraries were processed first due to
their precedence in sys.path. These libraries were then ruled out due to
a mismatch in the SOABI so the import mechanism continued searching
until it found the host's native modules.

However, if the host interpreter and the target python were on the same
version + SOABI combination, the host interpreter would attempt to load
the target's external modules due to their precedence in sys.path.

Despite the "match", the target build may have been linked against a
different libc or may include unsupported instructions so loading or
executing the target's external modules can lead to crashes.

Now, the path to the target's external modules is no longer defined in
PYTHONPATH to prevent accidentally loading these foreign modules.

One caveat is that during certain build stages, the target's sysconfig
module requires higher precedence than the host's version in order to
accurately query the target build's configuration.

This worked previously due to the target's sysconfig data module having
precedence over the host's (see above). In order to keep this desired
behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has
been defined so sysconfig can search this directory for the target's
sysconfig data.

Signed-off-by: Vincent Fazio <[email protected]>
@vfazio vfazio force-pushed the vfazio-fix-cross-compile-main branch from 3e51689 to 68e931a Compare April 2, 2024 14:45
@vfazio
Copy link
Contributor Author

vfazio commented Apr 2, 2024

@FFY00 @erlend-aasland @corona10 please let me know if there is any other information I can provide to help move this along.

@erlend-aasland
Copy link
Contributor

@vfazio, I don't have the bandwidth to review this any time soon :( Hopefully in a few weeks.

arnout pushed a commit to buildroot/buildroot that referenced this pull request Apr 9, 2024
…OABI

When python performs a cross compile, it uses a host interpreter to run
steps on behalf of the foreign architecture to finalize the build.

When performing these steps, foreign modules may be loaded if the SOABI
matches that of the host. This can lead to issues if the modules are
linked against a libc not available on the host or if the binaries
include instructions unsupported by the host.

For now, patch the foreign libraries out of PYTHONPATH and explicitly
define the path to sysconfigdata so builds can complete without error.

This method currently passes all upstream CI pipelines [0] and should
also work (with some modifications) for the migration to 3.12 [1].

Fixes: http://autobuild.buildroot.net/results/c854080e003e9a7d525325073190b472a8f982aa/
[0]: python/cpython#116294
[1]: https://lists.buildroot.org/pipermail/buildroot/2024-February/685369.html

Signed-off-by: Vincent Fazio <[email protected]>
Tested-by: Yann E. MORIN <[email protected]>
Signed-off-by: Yann E. MORIN <[email protected]>
@yann-morin-1998
Copy link

Tested-by: Yann E. MORIN <[email protected]>

@bkuhls
Copy link

bkuhls commented Apr 13, 2024

Tested-by: Bernd Kuhls <[email protected]>

Copy link
Member

@FFY00 FFY00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vfazio, thank you so much for working on this, and sorry for the delay it took to get the PR reviewed. Also, thanks to everyone who tested it.

The proposed solution seems okay, I am just gonna trigger the tests to run on the buildbots to make sure nothing is unexpectedly broken there before merging.

@FFY00 FFY00 added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Apr 13, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @FFY00 for commit 68e931a 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Apr 13, 2024
@vfazio
Copy link
Contributor Author

vfazio commented Apr 13, 2024

@vfazio, thank you so much for working on this, and sorry for the delay it took to get the PR reviewed. Also, thanks to everyone who tested it.

The proposed solution seems okay, I am just gonna trigger the tests to run on the buildbots to make sure nothing is unexpectedly broken there before merging.

@FFY00 Thanks for taking a look and kicking this off!

If this gets accepted, is there any chance i could follow it up with a patch for 3.12 or other earlier versions?

At the time of writing, one test is failing due to a disk space error and another due to PEG generator errors which I cant imagine are related to my PR.

@vfazio
Copy link
Contributor Author

vfazio commented Apr 14, 2024

I don't think any of the failures are related to this MR. Im not sure if any of the buildbots attempt to cross compile, it may be worth adding a couple of those scenarios at some point in the future.

@vfazio
Copy link
Contributor Author

vfazio commented Apr 16, 2024

@FFY00 Do you want me to rebase this PR on main or can this be merged as is?

@vfazio
Copy link
Contributor Author

vfazio commented Apr 28, 2024

@FFY00 @erlend-aasland @corona10 please let me know if there is any other information I can provide to help move this along.

Apologies in advance for the bump, but I just want to make sure there's no additional followup necessary from me on this PR.

arnout pushed a commit to buildroot/buildroot that referenced this pull request May 6, 2024
…OABI

When python performs a cross compile, it uses a host interpreter to run
steps on behalf of the foreign architecture to finalize the build.

When performing these steps, foreign modules may be loaded if the SOABI
matches that of the host. This can lead to issues if the modules are
linked against a libc not available on the host or if the binaries
include instructions unsupported by the host.

For now, patch the foreign libraries out of PYTHONPATH and explicitly
define the path to sysconfigdata so builds can complete without error.

This method currently passes all upstream CI pipelines [0] and should
also work (with some modifications) for the migration to 3.12 [1].

Fixes: http://autobuild.buildroot.net/results/c854080e003e9a7d525325073190b472a8f982aa/
[0]: python/cpython#116294
[1]: https://lists.buildroot.org/pipermail/buildroot/2024-February/685369.html

Signed-off-by: Vincent Fazio <[email protected]>
Tested-by: Yann E. MORIN <[email protected]>
Signed-off-by: Yann E. MORIN <[email protected]>
(cherry picked from commit d01e13c)
Signed-off-by: Peter Korsgaard <[email protected]>
@vfazio
Copy link
Contributor Author

vfazio commented May 14, 2024

@FFY00 @erlend-aasland @corona10 just wanted to reach out again on this PR and see if there are changes necessary and if there's a chance of getting it merged and ported to 3.12/3.13. It will require minor tweaks to target 3.12 I think.

@erlend-aasland
Copy link
Contributor

@vfazio: I'll review this at the PyCon sprints next week! Thanks for your patience.

@vfazio
Copy link
Contributor Author

vfazio commented May 14, 2024

@vfazio: I'll review this at the PyCon sprints next week! Thanks for your patience.

Thanks, I realize you're all busy people so I apologize for the pings, I just want to make sure there's nothing blocking on me.

@vfazio
Copy link
Contributor Author

vfazio commented Jun 23, 2024

@erlend-aasland should i rebase this PR occasionally?

@vfazio
Copy link
Contributor Author

vfazio commented Aug 8, 2024

i'm using spec = FileFinder(path, (SourceFileLoader, SOURCE_SUFFIXES)).find_spec(name) but i wonder if it would be better to use importlib.util.spec_from_file_location ?

Some of this machinery is still opaque to me, when to use one or the other. I've noticed that the spec generated from FileFinder seems to behave subtly differently than spec generated from spec_from_file_location in other development i've done... I'd prefer to use the recommended method for this solution.

@vfazio
Copy link
Contributor Author

vfazio commented Aug 9, 2024

@FFY00 @erlend-aasland @corona10 friendly ping

1 similar comment
@vfazio
Copy link
Contributor Author

vfazio commented Oct 15, 2024

@FFY00 @erlend-aasland @corona10 friendly ping

@FFY00
Copy link
Member

FFY00 commented Oct 16, 2024

Sorry for the delay! GIthub notifications are out of hand and it's difficult to keep track of everything.

@FFY00 FFY00 merged commit aecbc2e into python:main Oct 16, 2024
45 checks passed
@vfazio
Copy link
Contributor Author

vfazio commented Oct 16, 2024

Sorry for the delay! GIthub notifications are out of hand and it's difficult to keep track of everything.

I totally understand. Thanks again for your help with this, I sincerely appreciate it!

Do you think there's a chance of porting this back to 3.13?

@FFY00
Copy link
Member

FFY00 commented Oct 17, 2024

Personally, I am fine with it, though I would like the release manager to approve it since it introduces a behavior change when the _PYTHON_SYSCONFIGDATA_PATH environment variable is specified.

@Yhg1s what do you think?

@Yhg1s
Copy link
Member

Yhg1s commented Oct 18, 2024

Yeah, I'm okay with this being backported. It seems like a reasonable bugfix and it shouldn't affect anything else. (Anything already setting _PYTHON_SYSCONFIGDATA_* environment variables should realise they're explicitly meddling with Python sysconfig internals ;p)

@vfazio
Copy link
Contributor Author

vfazio commented Oct 19, 2024

Yeah, I'm okay with this being backported. It seems like a reasonable bugfix and it shouldn't affect anything else. (Anything already setting _PYTHON_SYSCONFIGDATA_* environment variables should realise they're explicitly meddling with Python sysconfig internals ;p)

The last time i backported something, miss Islington did the grunt work. Im not sure if it ports back cleanly, im out of town so wont be able to check for a couple of weeks.

@FFY00 FFY00 added the needs backport to 3.13 bugs and security fixes label Oct 20, 2024
@miss-islington-app
Copy link

Thanks @vfazio for the PR, and @FFY00 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs backport to 3.13 bugs and security fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants