When are bash variables exported to subshells and/or accessible by scripts?

Question

I'm confused over whether bash variables are exported to subshells and when they are accessible by scripts. My experience so far led me to believe that bash variables are automatically available to subshells. E.g.:

> FOO=bar
> echo $FOO
bar
> (echo $FOO)
bar

The above appears to demonstrate that bash variables are accessible in subshells.

Given this script:

#! /usr/bin/bash
# c.sh

func()
{
  echo before
  echo ${FOO}
  echo after
}

func

I understand that calling the script in the current shell context gives it access to the current shell's variables:

> . ./c.sh 
before
bar
after

If I were to call the script without the "dot space" prefix...

> ./c.sh 
before

after

...isn't it the case that the script is called in a subshell? If so, and it's also true that the current shell's variables are available to subshells (as I inferred from the first code block), why is $FOO not available to c.sh when run this way?

Similarly, why is $FOO also unavailable when c.sh is run within parentheses - which I understood to mean running the expression in a subshell:

> (./c.sh)
before

after

(If this doesn't muddy this post with too many questions: if "./c.sh" and "(./c.sh)" both run the script in a subshell of the current shell, what's the difference between the two ways of calling?)

A subshell is forked off the parent process, so a variable doesn't **need** to be exported to be visible in it: Child processes always inherit 100% of their parent process's state (except for the PID itself, and file descriptors which were explicitly opened with flags instructing the OS not to copy them on fork). — Charles Duffy, Aug 17 '18 at 22:57
So `./foo` **does not** run `foo` in a subshell: It's a completely unrelated child process, behind not just a `fork()` but an `execve()` boundary. — Charles Duffy, Aug 17 '18 at 22:58
...whereas `(./c.sh)` forks off a subshell, and then runs a child process from inside it, so the child process is a grandchild rather than a direct child of the original shell, and you have an `execv` boundary between the child and the grandchild (albeit none between parent and child). — Charles Duffy, Aug 17 '18 at 23:05
You tagged `shell` so I would like to point out that not all shells handle sub-shells in the same way as `bash`. Korn shell, for example, avoids creating a child process for a sub-shell. — cdarke, Aug 18 '18 at 06:57
@cdarke, ...I'd rather say that ksh implements `(...)`'s "separate environment" semantics without using subshells to the extent possible (when it becomes impossible to comply with POSIX semantics without creating a subshell, a subshell gets created; it's inaccurate to imply that `(...)` doesn't use them at all). Reading the above as a request to edit my answer to no longer state that `(...)` requests a subshell (vs requesting an independent environment most readily implemented with a subshell) is fair. — Charles Duffy, Aug 19 '18 at 13:33
@CharlesDuffy: I don't agree with your first sentence, but maybe that depends on your definition of a subshell. To me a subshell is the separate environment, you imply that a subshell is always a child. This might of course just be semantics. I should have said that ksh *tries* to avoid creating a child process. — cdarke, Aug 19 '18 at 19:59
Rereading the standard, I can definitely see some support for your definition, though not so ambiguous as to force changing my own choice of terms. Anyhow, I'm confident that reading our comments together will lead the reader to a useful understanding. :) — Charles Duffy, Aug 19 '18 at 23:52

Charles Duffy · Accepted Answer · 2021-09-13T12:34:14.837

19

`(...)` runs `...` in a separate environment, something most easily achieved (and implemented in bash, dash, and most other POSIX-y shells) using a subshell -- which is to say, a child created by `fork()`ing the old shell, but not calling any `execv`-family function. Thus, the entire in-memory state of the parent is duplicated, including non-exported shell variables. And for a subshell, this is precisely what you typically want: just a copy of the parent shell's process image, not replaced with a new executable image and thus keeping all its state in place.
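
As a minimal sketch of that copy semantics (the variable name `FOO` is borrowed from the question; nothing here is exported), a subshell can read the parent's variable, and an assignment made inside it affects only the copy:

```shell
FOO=bar                       # plain shell variable, never exported

(echo "subshell sees: $FOO")  # the forked copy still has FOO

(FOO=changed)                 # modifies only the subshell's copy
echo "parent still has: $FOO"
```

Both lines should report `bar`, since the parent's state is duplicated into the subshell but never written back.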

Consider `(. shell-library.bash; function-from-that-library "$preexisting_non_exported_variable")` as an example. Because of the parens it `fork()`s a subshell, but it then sources the contents of `shell-library.bash` directly inside that shell, without replacing the shell interpreter created by that `fork()` with a separate executable. This means that `function-from-that-library` can see non-exported functions and variables from the parent shell (which it couldn't if it were `execve()`'d), and that it is a bit faster to start up (since it doesn't need to link, load, and otherwise initialize a new shell interpreter, as happens during an `execve()` operation). It also means that changes it makes to in-memory state, shell configuration, and process attributes like the working directory won't modify the parent interpreter that called it (as they would if there were no subshell and it weren't `fork()`'d), so the parent shell is protected from configuration changes made by the library that could alter its later operation.
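
A rough, self-contained sketch of that pattern follows. The library file, the function `lib_greet`, and the variables `name` and `lib_setting` are all hypothetical stand-ins invented for illustration; the library is written to a temporary file so the example can run on its own.

```shell
# Hypothetical library, written to a temp file so the sketch is self-contained.
lib=$(mktemp)
cat >"$lib" <<'EOF'
lib_setting=from-library           # the library changes shell state...
lib_greet() { echo "hello, $1"; }  # ...and defines a function
EOF

name=world            # not exported
lib_setting=original  # state the library would overwrite

# Source the library and call its function inside a subshell.
( . "$lib"; lib_greet "$name" )

# The parent is untouched: setting unchanged, function never defined here.
echo "lib_setting in parent: $lib_setting"
command -v lib_greet >/dev/null || echo "lib_greet is not defined in the parent"

rm -f "$lib"
```

The function receives the non-exported `name` because it runs in a forked copy of the calling shell, while the library's assignment to `lib_setting` and its function definition disappear when the subshell exits.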

`./other-script`, by contrast, runs `other-script` as a completely separate executable; it does not retain non-exported variables after the child shell (which is not a subshell!) has been invoked. This works as follows:

1. The shell calls `fork()` to create a child. At this point in time, the child still has even non-exported variable state copied.
2. The child honors any redirections (if it was `./other-script >>log.out`, the child would `open("log.out", O_WRONLY|O_APPEND|O_CREAT)` and then `dup2()` the resulting descriptor onto file descriptor 1, replacing stdout).
3. The child calls `execv("./other-script", {"./other-script", NULL})`, instructing the operating system to replace it with a new instance of `other-script`. After this call succeeds, the process running under the child's PID is an entirely new program, and only exported variables survive.
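
The effect of that `exec` boundary can be observed with a small sketch. The variable names `DEMO_PLAIN` and `DEMO_EXPORTED` and the temporary child script are hypothetical, and the child is started with `sh "$child"` rather than as `./other-script` so that the example does not depend on the temp directory allowing execution; either way a new program image replaces the forked copy.

```shell
# Hypothetical child script that reports which variables it can see.
child=$(mktemp)
cat >"$child" <<'EOF'
echo "DEMO_PLAIN=${DEMO_PLAIN:-<unset>} DEMO_EXPORTED=${DEMO_EXPORTED:-<unset>}"
EOF

DEMO_PLAIN=one             # shell variable only
export DEMO_EXPORTED=two   # placed in the environment

sh "$child"                # fork + exec: only the environment survives
( sh "$child" )            # subshell first, then exec: same result
DEMO_PLAIN=one sh "$child" # per-command assignment exports it for this call only

rm -f "$child"
```

The first two invocations should report `DEMO_PLAIN` as unset and the third should see it, because a variable assignment prefixed to a command is placed in that command's environment.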

edited Sep 13 '21 at 12:34

answered Aug 17 '18 at 23:03

Charles Duffy


This is fascinating - I'd never considered fork() and exec() in the context of `bash`. Is my understanding correct: when I call `(./c.sh)`, a _subshell_ is forked, therefore `$FOO` is visible in the subshell. But that subshell then `fork()`s and `exec()`s `./c.sh`, therefore within the context of `c.sh` (which is sort of a "grandchild process" of the shell where I typed "`(./c.sh)`"), `$FOO` is no longer visible ? – StoneThrow Aug 17 '18 at 23:22
Second sentence should read "of the *parent*", no? Not completely certain, hence not just editing ;) – Benjamin W. Aug 17 '18 at 23:27
Is my understanding of this also correct: it sounds like `./c.sh` is a "subset" of `(./c.sh)` in the sense that the former will fork-and-exec the actual `c.sh` script from the current shell context, where the latter will (1) fork but not exec (i.e. a new `bash` process is created with the same state as its parent), and then (2) fork-and-exec the actual `c.sh` script _from the newly-created `bash` child process_? – StoneThrow Aug 17 '18 at 23:30
@StoneThrow, ...yes, your understanding is correct. – Charles Duffy Aug 18 '18 at 00:02
@BenjaminW., indeed, that's what I meant, thank you. – Charles Duffy Aug 18 '18 at 00:03
Also, if you use `exec ./other-script` (which runs exec() *without* forking first), the other script inherits exported variables, but not non-exported shell variables. `./other-script` is mostly equivalent to `(exec ./other-script)`, in which the `( )` forks a subshell (keeping non-exported variables), and then the `exec` effectively exits the current shell (destroying non-exported variables) and runs a new shell in the same process. – Gordon Davisson Aug 18 '18 at 05:11
Note a `bash` peculiarity, `$$` in a subshell gives the PID of the parent, not the current subshell process. – cdarke Aug 18 '18 at 07:01
@cdarke: that's not a bashism; it's defined by Posix: "[`$$`] Expands to the decimal process ID of the invoked shell. In a subshell (see Shell Execution Environment ), '$' shall expand to the same value as that of the current shell." – rici Aug 18 '18 at 07:13
@rici: thanks, I had not realised it was in POSIX, but it does explain why bash does this (although you could argue about what "current shell" means). – cdarke Aug 18 '18 at 07:15
**" a child created by fork()ing the old shell, but not calling any execv-family function."** - May I ask how it's created, then? Because... `fork()`ing would create a copy of the parent process, whereas what you want to run is the child process. No? Basically, it's still not clear what the difference between a subshell and a regular child process is. Would appreciate a clarification. Thanks – Harry Sep 12 '21 at 01:36
@Harry, `fork()`ing creates a copy of the parent _as a child_, then `execve()` replaces that copy of the parent with the program you _want_ to have as your child (when what you want is in fact a different program; for a subshell, what you typically want _is_ just a copy of the parent with no replacement). – Charles Duffy Sep 12 '21 at 18:41
@Harry, ...that is to say, when you run `(echo hello)`, then you `fork()` a subshell, but because `echo` is built in, there's no `exec` needed; the `echo` command can be run directly in that subshell's original process image. (`fork()` returns a value that tells whether one is in the parent or child-shell copy of the image to allow their behaviors to diverge). – Charles Duffy Sep 12 '21 at 18:43
@CharlesDuffy **"for a subshell, what you typically want is just a copy of the parent with no replacement"** Ah, get it now. Thanks a TON. It would be helpful to others also if you could make this point part of your answer. – Harry Sep 13 '21 at 08:40
@Harry, I've added a paragraph -- does it succeed in making the distinction clear? – Charles Duffy Sep 13 '21 at 11:07
@CharlesDuffy I'm satisfied already but... I might rephrase para 1 as follows: `(...)` runs `...` in a separate environment, something most easily achieved (and implemented in bash, dash, and most other POSIX-y shells) using a subshell which is a child created by `fork()`ing the current shell (the one which is issuing `(...)`), **but not calling any `execv`-family function**. Thus, the entire in-memory state of the parent is duplicated, including non-exported shell variables. And for a subshell, this is precisely what you typically want: just a copy of the parent with no replacement. – Harry Sep 13 '21 at 11:53
(I mean, this is anyhow how I'm putting together in my mind all that you have said so far in your main reply and in the comments/clarifications here. So, it's only a matter of minor editing. It's okay if you don't like my version and choose to continue with yours -- that is good too.) +1 – Harry Sep 13 '21 at 11:57
@Harry, thanks; I largely adopted your suggested language, and also extended the new paragraph to have a more realistic example of a case where sourcing a file from a subshell is genuinely useful. – Charles Duffy Sep 13 '21 at 12:34
