-
-
Notifications
You must be signed in to change notification settings - Fork 330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gix-command
on Windows runs shell commands in non-POSIX mode
#1868
Comments
It now prefers the `(git root)/bin/sh.exe` shim, falling back to the `(git root)/usr/bin/sh.exe` non-shim to support the Git for Windows SDK which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) Overall this makes things more robust than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
This makes a few changes to make `shell()` more robust: 1. Check the last two components of the path `git --exec-path` gave, to make sure they are `libexec/git-core`. (The check is done in such a way that the separator may be `/` or `\`, though a `\` separator here would be unexpected. We permit it because it may plausibly be present due to an overriden `GIT_EXEC_PATH` that breaks with Git's own behavior of using `/` but that is otherwise fully usable.) If the directory is not named `git-core`, or it is a top-level directory (no parent), or its parent is not named `libexec`, then it is not reasonable to guess that this is in a directory where it would be safe to use `sh.exe` in the expected relative location. (Even if safe, such a layout does not suggest that a `sh.exe` found in it would be better choice than the fallback of just doing a `PATH` search.) 2. Check the grandparent component (that `../..` would go to) of the path `git --exec-path` gave, to make sure it is recognized name of a platform-specific `usr`-like directory that has been used in MSYS2. This is to avoid traversing up out of less common directory trees that have some different and shallower structure than found in a typical Git for Windows or MSYS2 installation. 3. Instead of using only the `(git root)/usr/bin/sh.exe` non-shim, prefer the `(git root)/bin/sh.exe` shim. If that is not found, fall back to the `(git root)/usr/bin/sh.exe` non-shim, mainly to support the Git for Windows SDK, which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) This makes things more robust overall than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
This makes a few changes to make `shell()` more robust: 1. Check the last two components of the path `git --exec-path` gave, to make sure they are `libexec/git-core`. (The check is done in such a way that the separator may be `/` or `\`, though a `\` separator here would be unexpected. We permit it because it may plausibly be present due to an overriden `GIT_EXEC_PATH` that breaks with Git's own behavior of using `/` but that is otherwise fully usable.) If the directory is not named `git-core`, or it is a top-level directory (no parent), or its parent is not named `libexec`, then it is not reasonable to guess that this is in a directory where it would be safe to use `sh.exe` in the expected relative location. (Even if safe, such a layout does not suggest that a `sh.exe` found in it would be better choice than the fallback of just doing a `PATH` search.) 2. Check the grandparent component (that `../..` would go to) of the path `git --exec-path` gave, to make sure it is recognized name of a platform-specific `usr`-like directory that has been used in MSYS2. This is to avoid traversing up out of less common directory trees that have some different and shallower structure than found in a typical Git for Windows or MSYS2 installation. 3. Instead of using only the `(git root)/usr/bin/sh.exe` non-shim, prefer the `(git root)/bin/sh.exe` shim. If that is not found, fall back to the `(git root)/usr/bin/sh.exe` non-shim, mainly to support the Git for Windows SDK, which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) This makes things more robust overall than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
Thanks so much for this incredible research! I have lost all faith for ever getting this right by myself 😅, and can only leave decisions on how to best tackle this to you. Personally I'd prefer correctness and execute through the shim by default if it makes anything better, and deal with performance problems later. |
Yes, correctness is more important than performance here. But the shim is the cause of the incorrect behavior described in this issue. But not using the shim, unless other steps are taken, will cause a more severe form of incorrect behavior (#1862 (comment)). That the shim causes the issue described here where the shell is wrongly not in POSIX mode might be a bug in Git for Windows. That is, this entire issue may simply be a bug in Git for Windows, as it manifests in gitoxide's interaction with Git for Windows. If so, then it could be entirely fixed by changes to Git for Windows. I would consider that outcome to be ideal. I will look into that. In contrast, some beneficial effects that we currently only get when we use the shim are needed to avoid problems that are more serious, and some of those problems are not due to bugs in other software. The specific problem I encountered when not using the shim, summarized above and detailed in #1862 (comment), is hopefully a bug that can be fixed in MSYS2, though I am not confident that it is considered a bug. But even if so, configured shell commands, hook scripts, fixture scripts, and any other shell scripts generally need to have access to common Unix tools--the "standard library" of shell scripting--such as So we should call the shim unless we can customize the environment sufficiently ourselves. Customizing the environment ourselves would likely improve performance, but that is not the main reason I am interested in eventually doing it. Rather, it would actually be more similar to what My understanding of how and when |
Thanks for the clarification, I do usually have trouble to correctly digest everything in long write-ups, and using a the 'summary' feature feels dangerous, too.
That's perfect - if I understand correctly one could safely adopt this code, avoid the shim, and become independent of any bug-fix in Git for Windows, all while avoiding potential performance issues. |
Current behavior 😯
Background
Git commands that run in a shell are meant to run in a POSIX-compatible
sh
. This can be, and usually is, a shell that is more specifically known by some other name and that extends and even breaks with the requirements of POSIX forsh
. But when run assh
, such a shell behaves in a POSIX-compatible manner. This is true ofbash
, which providessh
in Git for Windows environments. (Some shells, includingbash
, enter POSIX mode only after running commands from startup scripts. Some shells also do not behave in a completely POSIX-compatible way even when in POSIX mode. Neither of those caveats relates to this issue.)Therefore, even when
sh
andbash
are same due to being equivalent symlinks, hard links, or duplicate files, runningsh
runs a shell in POSIX mode. This is, broadly speaking, the case even on Windows: for example, whenbash.exe
andsh.exe
are shell executables with the same contents, runningbash.exe
defaults to non-POSIX mode and runningsh.exe
defaults to POSIX mode.But this is not quite true when the executable being run is not really the shell itself but instead a shim that runs the shell. Then what matters is the name that the shim runs the shell under. This is not a special rule, but just a consequence of the above: the shim, after all, is a separate program running the shell. Ordinarily, this would not be a problem. Non-shim
bash.exe
andsh.exe
shells--which could be copies, symlinks, or hard links--could be run by separate similar but nonidenticalbash.exe
andsh.exe
shim executables. In this approach, thebash.exe
shim would delegate to the non-shimbash.exe
, and thesh.exe
would delegate to the non-shimsh.exe
.The problem
The trouble is that the
(git root)\bin\bash.exe
and(git root)\bin\sh.exe
shims found in full non-SDK installations of Git for Windows (including portable installations) do not work this way. They are equivalent: both delegate to(git root)\usr\bin\bash.exe
, neither to(git root)\usr\bin\sh.exe
. At least in the Portable Git installations I tested--andscoop
installations, but that is a repackaging of Portable Git--they are separate files, and they are not hard links to the same thing, but they have identical contents:Why that makes
gix-command
behave subtly wrongWhen
git
runs shell commands, it does not use those shims, so it uses itssh
as such. That is, it invokes its non-shimsh.exe
, causing it to run in POSIX mode. But whengix-command
runs a command with a shell due touse_shell
being set totrue
, it runssh
:gitoxide/gix-command/src/lib.rs
Lines 283 to 286 in d0ef276
On systems that have a usable
sh
that can be found in such aPATH
search, that will often be the(git root)\bin\sh.exe
shim associated with Git for Windows, since many users of gitoxide on Windows--and more broadly of tools on Windows that operate on Git repositories, some of which may use gitoxide library crates--will have Git for Windows installed with that directory in theirPATH
.As described above, this shim is called
sh
but it is really a shim forbash
. It runs abash
shell calledbash
withargv[0]
set tobash
. The resulting shell instance does not enter POSIX mode, even though, from the perspective ofgix-command
, it ransh
.But we may need to use the shim
This issue was not a motivation for #1862. But as originally envisioned, that PR would have fixed this. One of its changes is to replace the above code with:
gitoxide/gix-command/src/lib.rs
Lines 283 to 284 in 65f706b
Where the implementation of
gix_path::env::shell()
is also changed, but in the original vision of #1862 was intended to continue using the non-shimsh.exe
in Git for Windows instead of the shim.Using the non-shim
sh.exe
would fix this issue. But does not seem to be a reasonable thing to do without further environment customization to account for the absence of the shim's functionality. Such customization may be possible, but I think it is beyond the scope of #1862. When not using a shim, some environment variables--includingPATH
directories with expected tools--may be absent or set to unusable values.The shim helps avoid running wrong tool executables
Such a shell may even pick up executables that link to
msys-2.0.dll
from a different MSYS2 installation from the one the shell itself uses which. Unlike most Windows programs, MSYS2 programs that use onemsys-2.0.dll
can have problems running other MSYS2 programs that use anothermsys-2.0.dll
or a different version of build, even when all executables and DLLs are in safe locations and all executables load the correct DLLs. This is documented for Cygwin.I am unsure if it is generally as much of a problem in MSYS2, which does not seem to document it as something to be concerned about. The strange error currently blocking #1862 turns out to be such a case, though it is subtler and weirder than the examples given in that FAQ entry, and it may be unknown and I think may even be considered a bug in MSYS2. I'll give full details at #1862 soon (edit: #1862 (comment)); this fragment is so that abandoning shims as a way to fix this issue is not rushed into in the future without awareness of the risks.
Expected effect of #1862
Both for the general reason about
PATH
and other environment variables, and in view of the specific problem encountered already, I think the way forward in #1862 will be to prefer the shim.Thus it will not solve the problem described in this issue, and will even somewhat exacerbate it by making
gix-command
use the Git for Windowssh.exe
shim (which is a shim forbash.exe
) if present, even when anothersh.exe
would be found in aPATH
search.Because the actual non-shim to shim change will be in
gix_path::env::shell()
, this issue will also be exacerbated in the sense that it will apply to any other uses ofshell()
that do not take steps to mitigate it (such as those suggested below).It seems to me that this issue is much less severe than the problems of having an insufficient or malfunctioning environment, and that it is justified to exacerbate this issue in that way. But #1862 is one of my motivations for opening this, so that it is known.
Expected behavior 🤔
When
gix-command
usessh
from Git for Windows, it should behave assh
does in Git for Windows whengit
runs it, running it in POSIX mode assh
does. See "Git behavior" and "Steps to reproduce" below for a verification of the difference and a demonstration of how they currently behave differently.Possible solutions I don't think will work well
It would be nice to have an executable that, when run, defaults to running the shell in POSIX mode.
Setting an environment variable like
POSIXLY_CORRECT
should be avoided here, since it would be inherited by non-subshell child processes of the shell and potentially affect their behavior.We can probably pass
-o posix
on Windows. I am not sure if there are any major problems of this, but I think there are some notable problems:shell_program()
. But then a value ofsh
forshell_program()
causes what would already have behaved assh
not to behave likesh
, which is extremely unintuitive. If it is special-cased to include values likesh
passed toshell_program()
then we suffer the opposite but comparably bad effect of not usingsh
like it works when it is run straightforwardly.sh.exe
shim associated with Git for Windows, not any othersh.exe
. This will complicate the implementation, and potentially result in greater coupling of implementation details betweengix-path
andgix-command
, since whether-o posix
is to be passed ingix-command
would be determined by information obtained ingix-path
.sh.exe
shims that are shims for that shim to still have the undesirable behavior. For example, whengit
is installed throughscoop
, ash.exe
is placed in abin
directory in thePATH
that is a scoop shim for the Git for Windowssh.exe
shim that is actually a shim for itsbash.exe
.So I would like to avoid that approach if possible. This leaves two other clear alternatives, and maybe others I haven't thought of.
Possible solutions I think may work well
First, maybe this is just a bug in Git for Windows. If not, then it is presumably due to an unfortunate circumstance such that having the shim one would intuitively expect would not do the right thing, which would be useful to know about because Git for Windows should probably document that somewhere (such as in its wiki) and since the underlying cause might potentially apply to gitoxide in some way.
If it is a bug in Git for Windows, then fixing it there would also fix it here. I believe that, unlike some other installations of
git
, it is rare (and inadvisable) to continue using very old versions of Git for Windows, since as far as I know there are no further-downstream builds analogous to those in operating system distributions like Debian that (roughly speaking) fix security bugs while leaving non-security bugs alone.If it is not a bug in Git for Windows, then we can try to run the non-shim executable and do our own environment modifications. Due to how process creation on Windows is slower than on Unix-like operating systems, running all commands through a shim should perhaps be avoided anyway. But whether done to fix this issue or for performance (or greater versatility), I think that is something that would be easy to get wrong and should be done very carefully. In particular, every version of Git for Windows has a chance to ship shims that work differently to account for changes in other parts of Git for Windows. In contrast, gitoxide has no such versioned coupling to the Git for Windows shims.
Git behavior
As noted above,
git
runssh
in such a way that, from the shell's perspective, it is really run assh
as behaves as such.This can be demonstrated on Windows, in PowerShell. First, I ran these commands to create
git
repository that will display information that distinguishes both the status and some of the effects of POSIX mode for the shell that runs it, when a fetch operation is performed in the repository:The command is somewhat complicated by the
ps
in Git for Windows not supporting PID arguments for filtering, and also by the inability to embed newlines in the command without changing its interpretation (even though that would work in various related cases).Then I ran
git fetch
, which printed this, where the error message is itself no problem (my custom SSH command does not attempt to actually be usable for fetching):The relevant part is:
This shows that the running shell process is observed by other MSYS2 processes as
/usr/bin/sh
, that it is abash
shell that sees its ownargv[0]
as/usr/bin/sh
, that it is in POSIX mode (the trailingposix
field in the value ofSHELLOPTS
), and that it exhibits POSIX styleexport -p
output (that the variable isALLUSERSPROFILE
, and its value, are not important here).Steps to reproduce 🕹
The difference is demonstrated by running
gix fetch
in the same repository created above to demonstrate the Git behavior. The output was:The relevant part is:
This shows that the running shell process is observed by other MSYS2 processes as
/usr/bin/bash
, that it is abash
shell that sees its ownargv[0]
as/usr/bin/bash
, that it is not in POSIX mode (noposix
field in the value ofSHELLOPTS
), and that it accordingly does not exhibit POSIX-styleexport -p
output (as above, the variable and value aren't affected by whether it's in POSIX mode, only which format it uses).The text was updated successfully, but these errors were encountered: