Ticket #4549 (closed defect: fixed)

Opened 6 months ago

Last modified 6 months ago

subshell: call execl with argv[0] being set to the actual path to Bash

Reported by: akinomyoga Owned by: andrew_b
Priority: major Milestone: 4.8.32
Component: mc-core Version: master
Keywords: Cc:
Blocked By: Blocking:
Branch state: merged Votes for changeset: committed-master

Description

I wouldn't try to argue this is a bug or a defect of Midnight Commander, but the current way of starting the shell instance by execl causes a problem with the initialization of the shell variable BASH in Bash.

1. Summary

What is BASH (shell variable) and how Bash initializes it---The shell variable BASH is a special variable of Bash and is initialized by Bash. It is expected to contain the absolute path to the binary file of the current process. Bash basically determines its value based on the value of argv[0] because that is the only portable way to resolve it to the correct value. POSIX doesn't define an interface to portable determine it (other than argv[0], which depends on how exec was performed and thus is not so reliable). Depending on the operating system, one might use a specialized way to determine the path to the binary file (as discussed e.g. here), but Bash is expected to be built in a wider range of systems and will have to anyway fall back to argv[0] even if it tried to support specialized ways. When argv[0] is not an absolute path, Bash doesn't know what would be the exact path, but it tries to resolve it by searching the matching executable file in PATH.

What happens with Midnight Commander---In the current master branch, mc determines the path to the shell by the environment variable SHELL. When it starts the shell process, the fixed string "bash" is passed to execl as argv[0].

Quoted from src/subshell/common.c:421@master:

        execl (mc_global.shell->path, "bash", "-rcfile", init_file, (char *) NULL);

Then, Bash started by mc receives just bash as argv[0] and tries to determine the absolute path by searching the name in PATH. This can pick up a wrong path when the executable path bash found in PATH is different from the one specified to SHELL.

What I suggest---Could we pass the actual path as argv[0] in calling execl (as follows)?

        execl (mc_global.shell->path, mc_global.shell->path, "-rcfile", init_file, (char *) NULL);

Is there any reason that we need to pass the ambiguous "bash" to argv[0]? FYI, except for Bash and Zsh, we already specify the actual absolute path of the shell to argv[0]. If there is no real reason to specify "bash" and "zsh", "zsh" should probably be updated to mc_global.shell->path as well.


2. Investigation in codebase

The current way of passing "bash" to argv[0] has already been the case at the very initial commit in the Git repository:

Quoted from src/subshell.c:445@(initial commit):

	    execl (shell, "bash", "-rcfile", init_file, NULL);

So this has been stable for at least 26 years.

I also noticed that in the current master branch, the shells other than bash and zsh are actually called in a way I suggest for Bash:

Quoted from (master) src/subshell/common.c:434@master

        execl (mc_global.shell->path, mc_global.shell->path, (char *) NULL);

This was introduced by commit f596c916. The associated ticket is Ticket 2742. The code previously specified the fixed strings "tcsh" and "fish", but it started to specify mc_global.shell->path to argv[0]. I tried to find the discussion about the change of argv[0] in the ticket, but this change was not mentioned at all in the discussion.


3. Version information and steps to reproduce (Optional)

These are the version information and steps to reproduce. If the above description is sufficient for you, you can safely skip this section.

What version of Midnight Commander is used?

$ LC_MESSAGES=C src/mc -V
GNU Midnight Commander 4.8.31-131-g57dddea47
Built with GLib 2.78.6
Built with S-Lang 2.3.3 with terminfo database
With builtin editor
With subshell support as default
With support for background operations
With mouse support on xterm
With support for X11 events
With internationalization support
With multiple codepages support
Virtual File Systems:
 cpiofs, tarfs, sfs, extfs, ftpfs, shell
Data types:
 char: 8; int: 32; long: 64; void *: 64; size_t: 64; off_t: 64;
$ LC_MESSAGES=C src/mc --configure-options
 '--prefix=/home/murase/opt/mc/dev' 'PKG_CONFIG_PATH=/home/murase/local/lib/pkgconfig:/home/murase/local/lib64/pkgconfig'

What steps will reproduce the problem?

  1. To explicitly demonstrate the problem, one needs to prepare different versions of Bash. For example, let us here assume we build bash-5.3-alpha from the source and install it at --prefix="$HOME/.opt/bash/5.3-alpha". Then, include the path in PATH:
PATH=~/.opt/bash/5.3-alpha/bin:$PATH

Note that the login shell and the environment variable SHELL are still assumed to be the system one (such as /bin/bash). Only when bash is executed from a command line, the local version in ~/.opt/bash is used.

  1. Then one can start mc and drop in the full-screen shell mode by pressing Ctrl-o.
$ echo "$BASH"

These are optional commands to check the context:

$ echo "$BASH_VERSION"
$ cat "/proc/$$/exe" <!-- If the system supports procfs -->

What is the expected output?

For echo "$BASH", the path to the current Bash image is expected to be printed. For echo "$BASH_VERSION", the path to the current Bash version is expected to be printed. For cat "/proc/$$/exe", if the system supports it, the correct path to the current Bash image is expected to be printed.

What do you see instead?

For echo "$BASH", the path to Bash first found in PATH is printed, which is different from the output of cat "/proc/$$/exe". The other commands produce the expected outputs.


4. Background (Optional)

Here, in case one might wonder about the use case of BASH where it needs to be the correct absolute path, I explain the background in which I originally faced the problem. If you wouldn't require the use case to justify the change, you can skip this section as well.

Bash offers a mechanism called "loadable builtins", which loads a dynamic library (.so) to add a "builtin" command dynamically by using enable -f xxx.so xxx. Since the object is dynamically linked, the ABI that the dynamic library assumes needs to match the ABI of the current Bash (including the layout of the related structures and the function signatures), or otherwise, the Bash process crashes.

Bash also provides standard loadable builtins in /usr/lib/bash/* (or $prefix/lib/bash/*) when it is installed by ./configure and make install. One can attempt to find the standard loadable builtins based on the shell variable BASH by using e.g. "${BASH%/*}/../lib/bash/*". However, if BASH points to a Bash version different from the current process image, ${BASH%/*}/../lib/bash/* will pick up loadable builtins for a different Bash version. Then, the Bash process crashes on attempting

enable -f <path to shared object found via Bash> <builtin name>

I had an interactive setting that tries to use the fdflags builtin if available. This works outside Midnight Commander, but the Bash process doesn't start in Midnight Commander. It turned out that Bash actually crashed inside Midnight Commander. The crash happened on the first use of the fdflags builtin.

Attachments

0001-subshell-call-execl-with-argv-0-being-the-actual-pat.patch (1.4 KB) - added by akinomyoga 6 months ago.

Change History

comment:1 Changed 6 months ago by andrew_b

  • Status changed from new to accepted
  • Owner set to andrew_b
  • Votes for changeset set to andrew_b
  • Branch state changed from no branch to on review
  • Milestone changed from Future Releases to 4.8.32

Great, thanks!

Branch: 4549_subshell_execl_argv0
changeset:2f2e20930f62f314e9aaa382f59bb481317c455b

comment:2 Changed 6 months ago by zaytsev

  • Votes for changeset changed from andrew_b to andrew_b zyv
  • Branch state changed from on review to approved

Wow, I vote for this as a ticket of the year! Haven't seen such a well-prepared contribution in a long time.

comment:3 follow-up: ↓ 4 Changed 6 months ago by tonythed1111

Can this be fixed by using/adjusting the $BASH_LOADABLES_PATH variable? I believe that is the directories path that bash uses to locate dynamically loadable builtins. Aren't those paths absolute path names? It seems strange that Bash would use the $BASH variable to locate the builtins.

comment:4 in reply to: ↑ 3 Changed 6 months ago by akinomyoga

Thanks for replies.

Replying to tonythed1111:

Can this be fixed by using/adjusting the $BASH_LOADABLES_PATH variable?

How can a shell script determine the value of BASH_LOADABLES_PATH when BASH is unreliable?

For the original problem that I faced, it can essentially be translated to the problem with the determination of the proper value of BASH_LOADABLES_PATH. How can a script determine the appropriate path for BASH_LOADABLES_PATH? If BASH reliably contains the absolute path of the current Bash, one can set the corresponding path to BASH_LOADABLES_PATH and attempt to load a loadable builtin (and if not found, fall back to a slower implementation by external commands, etc.). However, if BASH is unreliable, the script doesn't know what value can be safely set to BASH_LOADABLES_PATH.

It should also be noted that the default value of BASH_LOADABLES_PATH set by Bash isn't determined based on the actual absolute path to the Bash binary. It is hard-coded in the following place:

config-top.h@bash/devel

#ifndef DEFAULT_LOADABLE_BUILTINS_PATH
#define DEFAULT_LOADABLE_BUILTINS_PATH \
  "/usr/local/lib/bash:/usr/lib/bash:/opt/local/lib/bash:/usr/pkg/lib/bash:/opt/pkg/lib/bash:."
#endif

Technically, a user can specify CPPFLAGS=-DDEFAULT_LOADABLE_BUILTINS_PATH="...", so that the path is directly embedded in the binary file. However, a shell program distributed to normal users wouldn't want to require its users to build their Bash from the source just for itself. The users will attribute the problem to the shell program if it's not working with the normal builds of Bash started in Midnight Commander. This seems like an unnecessary complication since this problem can be cleanly solved by allowing Bash to determine its path through argv[0].

Also, when there are multiple versions of Bash in the system, one cannot globally set BASH_LOADABLES_PATH as an environment variable.

I believe that is the directories path that bash uses to locate dynamically loadable builtins.

Yes.

Aren't those paths absolute path names?

Yes, they are supposed to be absolute paths, but that's insufficient. They are not ensured to be the absolute path corresponding to the current Bash binary. I'm not requesting just an absolute path corresponding to a random Bash binary. The correct absolute path corresponding to the current Bash binary is needed,

It seems strange that Bash would use the $BASH variable to locate the builtins.

What uses BASH to determine the location of loadable builtins is not Bash but the shell script.


The original problem.that I faced might be solved if the proper BASH_LOADABLES_PATH were told by the oracle, but the problem of wrong BASH is not solved by BASH_LOADABLES_PATH anyway.

Last edited 6 months ago by akinomyoga (previous) (diff)

comment:5 Changed 6 months ago by andrew_b

  • Status changed from accepted to testing
  • Votes for changeset changed from andrew_b zyv to committed-master
  • Resolution set to fixed
  • Branch state changed from approved to merged

comment:6 Changed 6 months ago by andrew_b

  • Status changed from testing to closed
Note: See TracTickets for help on using tickets.