bash: 4.3.42(1)-release (x86_64-pc-linux-gnu)

Executing the following script:

# This is myscript.sh
line=$(ps aux | grep [m]yscript)  # A => returns two duplicate processes (why?)
echo "'$line'"
ps aux | grep [m]yscript          # B => returns only one

Output:

'tom   31836  0.0  0.0  17656  3132 pts/25   S+   10:33   0:00 bash myscript.sh
tom   31837  0.0  0.0  17660  1736 pts/25   S+   10:33   0:00 bash myscript.sh'
tom   31836  0.0  0.0  17660  3428 pts/25   S+   10:33   0:00 bash myscript.sh

Why does the ps snippet executed inside the command substitution (A) return two lines?

tokosh

2 Answers

Summary

The command substitution creates a subshell, so two processes show up as bash myscript.sh:

line=$(ps aux | grep [m]yscript) 

Without the command substitution, no extra copy of the script's shell is left running, so grep matches only one process:

ps aux | grep [m]yscript       

Demonstration

Let's modify the script slightly so that the process and subprocess PIDs are saved in the variable line:

$ cat myscript.sh 
# This is myscript.sh
line=$(ps aux | grep [m]yscript; echo $$ $BASHPID)
echo "'$line'"
ps aux | grep [m]yscript  

In a bash script, $$ is the PID of the script and is unchanged in subshells. By contrast, when a subshell is entered, bash updates $BASHPID with the PID of the subshell.

Here is the output:

$ bash myscript.sh 
'john1024  30226  0.0  0.0  13280  2884 pts/22   S+   18:50   0:00 bash myscript.sh
john1024   30227  0.0  0.0  13284  1824 pts/22   S+   18:50   0:00 bash myscript.sh
30226 30227'
john1024   30226  0.0  0.0  13284  3196 pts/22   S+   18:50   0:00 bash myscript.sh

In this case, 30226 is the PID of the main script and 30227 is the PID of the subshell running ps aux | grep [m]yscript.
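
The difference between $$ and $BASHPID can also be checked in isolation, without ps. The following is only a minimal sketch; the variable names (main_pid, sub, sub_pid, sub_bashpid) are made up for illustration and are not part of the original script:

```bash
#!/usr/bin/env bash
# PID of the shell running this script.
main_pid=$$

# Inside $(...), $$ still expands to the main shell's PID,
# while $BASHPID expands to the PID of the forked subshell.
sub=$(echo "$$ $BASHPID")
sub_pid=${sub% *}        # first field: $$ as seen in the subshell
sub_bashpid=${sub#* }    # second field: $BASHPID as seen in the subshell

echo "main shell:  \$\$=$main_pid"
echo "in \$(...):   \$\$=$sub_pid  BASHPID=$sub_bashpid"
```

The two $$ values should match, while the BASHPID reported from inside the command substitution should be a different number.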

John1024
Both of the following cause Bash to create a subshell (a child process created by forking the current shell process):

  • a command substitution ($(...))
  • each segment of a pipeline[1]

However, Bash optimizes away such a subshell if it amounts to a single call to an external utility.

(What I think is happening in the optimization scenario is that a subshell is actually created but then instantly replaced by the external utility's process, via something like exec. Do let me know if you know for sure.)

Applied to your example:

  • line=$(ps aux | grep [m]yscript) creates 3 child processes:

    • 1 subshell - the fork of your script you see as an additional match returned by grep.
    • 2 child processes (1 for each pipeline segment) - ps and grep; they take the place of the optimized-away subshells; their parent process is the 1 remaining subshell created by the command substitution.
  • ps aux | grep [m]yscript creates 2 child processes (1 for each pipeline segment):

    • ps and grep; they take the place of the optimized-away subshells; their parent process is the current shell.
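
The parent/child relationships described above can be observed with a small experiment. This is a sketch under assumptions: sh -c 'echo $PPID' stands in for an external utility such as ps (it simply prints the PID of the process that launched it), and the variable names and the temporary file are hypothetical helpers, not part of the original script:

```bash
#!/usr/bin/env bash
shell_pid=$BASHPID   # PID of the shell running this code

# Case B: a plain pipeline. The result goes through a temp file so that
# the pipeline itself is not wrapped in a command substitution.
tmp=$(mktemp)
sh -c 'echo $PPID' | cat > "$tmp"
read -r parent_plain < "$tmp"
rm -f "$tmp"

# Case A: the same pipeline inside a command substitution.
# The trailing echo reports the PID of the subshell created by $(...).
out=$(sh -c 'echo $PPID' | cat; echo "$BASHPID")
parent_subst=${out%%$'\n'*}   # first line: parent of the external utility
subshell_pid=${out##*$'\n'}   # second line: PID of the $(...) subshell

echo "current shell:              $shell_pid"
echo "parent in plain pipeline:   $parent_plain"
echo "parent inside \$(...):       $parent_subst"
echo "subshell created by \$(...): $subshell_pid"
```

In the plain pipeline the reported parent should be the current shell; inside the command substitution it should be the extra subshell, which is the second bash myscript.sh line that grep matched in the question.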

For an overview of the scenarios in which a subshell is created in Bash, see this answer of mine, which, however, doesn't cover the optimizing-away scenarios.


[1] In Bash v4.2+ you can set option lastpipe (off by default) in order to make the last pipeline segment run in the current shell instead of a subshell; aside from a slight efficiency gain, this allows you to declare variables in the last segment that the current shell can see after the pipeline exits.
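
To illustrate the footnote, here is a minimal sketch of the effect of lastpipe on variables set in the last pipeline segment. The variable names are hypothetical. Note that lastpipe only takes effect when job control is not active, which is the default in scripts but not in interactive shells:

```bash
#!/usr/bin/env bash
# Default: the while loop is the last pipeline segment and runs in a
# subshell, so its changes to 'count' are lost when the pipeline exits.
count=0
printf 'a\nb\nc\n' | while read -r _; do count=$((count + 1)); done
count_default=$count

# With lastpipe, the last segment runs in the current shell,
# so the updated value of 'count' survives.
shopt -s lastpipe
count=0
printf 'a\nb\nc\n' | while read -r _; do count=$((count + 1)); done
count_lastpipe=$count
shopt -u lastpipe

echo "default:  $count_default"
echo "lastpipe: $count_lastpipe"
```

Run as a script, this should report 0 for the default case and 3 with lastpipe enabled.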

mklement0
  • I gave the check to the other answer because it was faster. But this answer is good too. Thanks – tokosh Jun 14 '16 at 02:34
  • I appreciate the feedback and the up-vote, tokosh. (I wrote this answer _after_ @John1024's, because, even though his answer is helpful and correct, I felt it lacked background information that can be helpful in related scenarios - you definitely _can_ end up with (non-ephemeral) subshells created by pipeline segments.) – mklement0 Jun 14 '16 at 02:44
  • Ok, I have to admit the "because faster" is a bit of a stupid argument. But, well I leave it as it is (both of you deserve it). So, +1 (can't do more) for the linked answer. – tokosh Jun 14 '16 at 03:25