4

I am using:

pgrep -P $$

to get the child pids of $$. But I actually want a list of grandchildren and great-grandchildren too.

How do I do this, though? In a regular programming language we would use recursion, for example, but what about bash? Perhaps a bash function?

Alexander Mills
  • Bash functions are reentrant -- you can absolutely recurse. – Charles Duffy Sep 26 '18 at 19:32
  • BTW, for most real-world uses when you need to deal with a subset of the process tree as a unit, I would tend to use sessions, cgroups or similar OS-level functionality. – Charles Duffy Sep 26 '18 at 20:26
  • The topic of finding all the descendants of a process, particularly so they can be killed, comes up fairly often. Examples include [What's the best way to send a signal to all members of a process group?](https://stackoverflow.com/q/392022/4154375) and [ps: How can i recursively get all child process for a given pid](https://superuser.com/q/363169) (and many links from them). – pjh Sep 27 '18 at 18:42

7 Answers

3

I've already posted an attempted solution. It's short and effective, and seems in line with the OP's question, so I'll leave it as it is. However, it has some performance and portability problems that mean it's not a good general solution. This code attempts to fix the problems:

top_pid=$1

# Make a list of all process pids and their parent pids
ps_output=$(ps -e -o pid= -o ppid=)

# Populate a sparse array mapping pids to (string) lists of child pids
children_of=()
while read -r pid ppid ; do
    [[ -n $pid && pid -ne ppid ]] && children_of[ppid]+=" $pid"
done <<< "$ps_output"

# Add children to the list of pids until all descendants are found
pids=( "$top_pid" )
unproc_idx=0    # Index of first process whose children have not been added
while (( ${#pids[@]} > unproc_idx )) ; do
    pid=${pids[unproc_idx++]}       # Get first unprocessed, and advance
    pids+=( ${children_of[pid]-} )  # Add child pids (ignore ShellCheck)
done

# Do something with the list of pids (here, just print them)
printf '%s\n' "${pids[@]}"

The basic approach of using a breadth-first search to build up the tree has been retained, but the essential information about processes is now obtained with a single (POSIX-compliant) run of ps. pgrep is no longer used because it is not in POSIX and because it had to be run once for every process visited. Also, a very inefficient way of removing items from the queue (copying all but one element of it) has been replaced with manipulation of an index variable.

Average (real) run time is 0.050s when run on pid 0 on my oldish Linux system with around 400 processes.

I've only tested it on Linux, but it only uses Bash 3 features and POSIX-compliant features of ps so it should work on other systems too.
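To see the index-based queue in isolation, here is a minimal sketch that runs the same loop over a hard-coded table instead of live ps output. The function name descendants_bfs and the sample pids are made up for illustration; only the loop structure comes from the code above.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print TOP_PID and all of its descendants, breadth-first.
# Usage: descendants_bfs TOP_PID < table    (table lines: "PID PPID")
descendants_bfs() {
    local top_pid=$1 pid ppid unproc_idx=0
    local -a children_of=() pids
    while read -r pid ppid ; do
        [[ -n $pid && $pid -ne $ppid ]] && children_of[ppid]+=" $pid"
    done
    pids=( "$top_pid" )
    while (( ${#pids[@]} > unproc_idx )) ; do
        pid=${pids[unproc_idx++]}       # Take the next unprocessed pid
        # Unquoted on purpose: the stored string is split into separate pids
        pids+=( ${children_of[pid]-} )
    done
    printf '%s\n' "${pids[@]}"
}

# Made-up process table: 1 -> {10, 20}, 10 -> {11, 12}, 11 -> {13}, 2 -> {30}
sample_table='10 1
20 1
11 10
12 10
13 11
30 2'

descendants_bfs 1 <<< "$sample_table"
```

With this table the pids come out level by level: 1, then 10 and 20, then 11 and 12, then 13. Nothing is ever removed from the array; unproc_idx simply moves forward.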

pjh
  • This looks good to me. Writing this myself I'd probably stream straight from `ps` into the `while read` loop (with a process substitution), but I can also see why it might be better not to go that route (if you're worried about internal consistency, should a blocking write from `ps` cause it to spread its inspection of the process tree over a longer period of time). – Charles Duffy Sep 27 '18 at 20:35
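The streaming variant mentioned in the comment above might look roughly like the sketch below. The function name build_children_map is invented for illustration, and the trade-off noted in the comment still applies: ps writes into a pipe while the loop reads, so the snapshot may be spread over a slightly longer period.

```bash
#!/usr/bin/env bash

# Hypothetical helper: fill the global sparse array children_of
# (ppid -> " child child ...") by reading ps output as it is produced.
build_children_map() {
    local pid ppid
    children_of=()
    while read -r pid ppid ; do
        if [[ -n $pid && $pid -ne $ppid ]] ; then
            children_of[ppid]+=" $pid"
        fi
    done < <(ps -e -o pid= -o ppid=)    # process substitution, no temp variable
}

build_children_map
echo "children of this shell's parent:${children_of[PPID]-}"
```

The rest of the script (the index-based walk over pids) would stay the same; only the intermediate ps_output variable disappears.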
2

Using nothing but bash builtins (not even ps or pgrep!):

#!/usr/bin/env bash

collect_children() {
  # format of /proc/[pid]/stat file; group 1 is PID, group 2 is its parent
  stat_re='^([[:digit:]]+) [(].*[)] [[:alpha:]] ([[:digit:]]+) '

  # read process tree into a bash array
  declare -g children=( )              # map each PID to a string listing its children
  for f in /proc/[[:digit:]]*/stat; do # forcing initial digit skips /proc/net/stat
    read -r line <"$f" && [[ $line =~ $stat_re ]] || continue
    children[${BASH_REMATCH[2]}]+="${BASH_REMATCH[1]} "
  done
}

# run a fresh collection, then walk the tree
all_children_of() { collect_children; _all_children_of "$@"; }

_all_children_of() {
  local -a immediate_children
  local child
  read -r -a immediate_children <<<"${children[$1]}"
  for child in "${immediate_children[@]}"; do
    echo "$child"
    _all_children_of "$child"
  done
}

all_children_of "$@"

On my local system, time all_children_of 1 >/dev/null (invoking the function in an already-running shell) clocks in the neighborhood of 0.018s -- typically, 0.013s for the collect_children stage (the one-time action of reading the process tree), and 0.005s for the recursive walk of that tree triggered by the initial call of _all_children_of.

Prior timings were testing only the time needed for the walk, discarding the time needed for the scan.
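One detail of stat_re worth illustrating: the second field of /proc/[pid]/stat is the command name in parentheses, and that name can itself contain spaces and parentheses. Because .* is greedy, the pattern anchors on the last closing parenthesis that is followed by a one-letter state and a number. The stat line below is made up for illustration and truncated to its first few fields.

```bash
#!/usr/bin/env bash

# Same pattern as in collect_children above
stat_re='^([[:digit:]]+) [(].*[)] [[:alpha:]] ([[:digit:]]+) '

# Made-up stat line whose command name contains spaces and parentheses
sample_line='4242 (my (odd) name) S 1234 4242 4242 0 -1 4194304'

if [[ $sample_line =~ $stat_re ]]; then
  echo "pid=${BASH_REMATCH[1]} ppid=${BASH_REMATCH[2]}"
fi
```

This prints pid=4242 ppid=1234, so an awkward command name does not shift the parent pid into the wrong field.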

Charles Duffy
  • This requires the `/proc` filesystem which is not widely available outside of Linux, though. – tripleee Sep 27 '18 at 04:29
  • Indeed -- I thought this was tagged Linux, but that appears to have been in error. – Charles Duffy Sep 27 '18 at 13:35
  • @CharlesDuffy, this is a very smart solution. My initial attempt at a solution was much slower. I've made a second attempt, which should be significantly faster than that, and more portable. I'd be very interested to hear how it performs on your system. – pjh Sep 27 '18 at 19:51
1

The code below will print the PIDs of the current process and all its descendants. It uses a Bash array as a queue to implement a breadth-first search of the process tree.

unprocessed_pids=( $$ )
while (( ${#unprocessed_pids[@]} > 0 )) ; do
    pid=${unprocessed_pids[0]}                      # Get first elem.
    echo "$pid"
    unprocessed_pids=( "${unprocessed_pids[@]:1}" ) # Remove first elem.
    unprocessed_pids+=( $(pgrep -P $pid) )          # Add child pids
done
pjh
1

> to get the child pids of $$. But I actually want a list of grandchildren and great-grandchildren too.

I am using the following (bash 5.1.16), and I thought I'd share it in case it's useful to others, as it is pretty short:

get_all_descendants() {
  declare -n children="children_${1}"
  mapfile -t children < <(pgrep -P "${1}")
  for child in "${children[@]}"; do
    echo "${child}"
    get_all_descendants "${child}"
  done
}

Sample usage:

mapfile -t descendants < <(get_all_descendants "$PPID")
for pid in "${descendants[@]}"; do
  echo "${pid}"
done

> How do I do this, though? In a regular programming language we would use recursion, for example, but what about bash? Perhaps a bash function?

As you can see with the sample above, you can recurse in bash. Variable scope in bash can sometimes be tricky, with unintended consequences.

get_all_descendants takes a single parameter, the PID to find descendants.

It declares a variable reference using the passed PID to decorate the variable name via declare -n children="children_${1}".

It then uses pgrep to obtain the children of the requested pid (${1}), using mapfile -t children to populate the referenced array children.

It then loops through the children, echo'ing the child's pid, and then recurses with each child.

It isn't the fastest solution, because every call launches subprocesses (pgrep, plus a subshell for the process substitution; mapfile itself is a builtin), but it is simple and seems robust. Running on my system:

start_pid=$(bash -c 'echo $$') ; \
time get_all_descendants 0 | wc -l ; \
end_pid=$(bash -c 'echo $$') ; \
echo "Subprocesses launched: $((end_pid-start_pid))"
695

real    0m11.010s
user    0m3.183s
sys     0m7.946s
Subprocesses launched: 1396

The fastest solution you could write would be more complex, and at best would approach the speed of ps aux:

start_pid=$(bash -c 'echo $$') ; \
time ps aux | wc -l ; \
end_pid=$(bash -c 'echo $$') ; \
echo "Subprocesses launched: $((end_pid-start_pid))"
695

real    0m0.043s
user    0m0.016s
sys     0m0.031s
Subprocesses launched: 3

However, I don't need speed, and code manageability is more important for where this is being used.
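On the scoping point: an alternative to decorating a global array name with the pid is to declare the per-call variables with local, so each level of recursion gets its own copy. The sketch below shows this on a made-up tree so the result is reproducible; the names tree and walk are hypothetical.

```bash
#!/usr/bin/env bash

# Made-up tree for illustration: parent pid -> space-separated child pids
tree=()
tree[1]="10 20"
tree[10]="11 12"
tree[11]="13"

# Hypothetical helper: print all descendants of $1, depth-first
walk() {
  local child
  local -a kids            # local: every recursion level has its own array
  read -r -a kids <<< "${tree[$1]-}"
  for child in "${kids[@]}"; do
    echo "$child"
    walk "$child"          # the callee's kids/child do not clobber ours
  done
}

walk 1
```

For the sample tree this prints 10, 11, 13, 12, 20, one per line. To walk live processes instead, the read line could be replaced with mapfile -t kids < <(pgrep -P "$1"), at the same per-call subprocess cost measured above.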

James K.
0

Probably a simple loop would do it:

# set a value for pid here
printf 'Children of %s:\n' "$pid"
for child in $(pgrep -P "$pid"); do
    printf 'Children of %s:\n' "$child"
    pgrep -P "$child"
done
miken32
0

If pgrep doesn't do what you want, you can always use ps directly. Options will be somewhat platform-dependent.

ps -o ppid,pid |
awk -v pid=$$ 'BEGIN { parent[pid] = 1 }  # collect interesting parents
    { child[$2] = $1 }  # collect parents of all processes
    $1 == pid { parent[$2] = 1 }
    END { for (p in child)
        if (parent[child[p]])
          print p }'

The variable names are not orthogonal -- parent collects the processes which are pid or one of its children as keys, i.e. the "interesting" parents, and child contains the parent of each process, with the process as the key and the parent as the value.
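One possible way to make the awk approach cover every generation is sketched below: record the parent of each process, then for each process climb the parent chain and see whether it reaches the starting pid. This is an illustration, not the script above; the function name descendants_awk is made up, and the ps options in the last line are assumed to be supported by the local ps.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read "PPID PID" lines on stdin and print every
# descendant of $1, sorted numerically.
descendants_awk() {
  awk -v top="$1" '
    { parent[$2] = $1 }               # remember the parent of every process
    END {
      for (p in parent) {
        # Climb from p towards the root; stop at top, at an unknown pid,
        # at a self-parented pid, or after too many steps (cycle guard).
        a = parent[p]; steps = 0
        while (a != top && (a in parent) && parent[a] != a && steps++ < 10000)
          a = parent[a]
        if (a == top && p != top) print p
      }
    }' | sort -n
}

# Live use: all descendants of the current shell
descendants_awk "$$" < <(ps -e -o ppid= -o pid=)
```

Climbing the chain per process is quadratic in the worst case, but process trees are shallow in practice and the whole thing still needs only one ps and one awk.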

tripleee
  • Huh. Trying to test this, but when I make it `ps axe -o ppid,pid` and tell the awk to start from PID 1 rather than `$$` (`-v pid=1`) I don't get anything even near my full tree. – Charles Duffy Sep 26 '18 at 19:55
  • I understood the question to mean direct descendants of the init process, not a full tree. On closer reading, I missed the recursive requirement. I'll see if I can find the time to fix this. – tripleee Sep 27 '18 at 04:30
0

I ended up doing this with node.js and bash:

import * as async from 'async';
import * as cp from 'child_process';

// Error-first callback type used below
type EVCb<T> = (err: any, val?: T) => void;

export const getChildPids = (pid: number, cb: EVCb<Array<string>>) => {

  const pidList: Array<string> = [];

  const getMoreData = (pid: string, cb: EVCb<null>) => {

    const k = cp.spawn('bash');
    const cmd = `pgrep -P ${pid}`;
    k.stderr.pipe(process.stderr);
    k.stdin.end(cmd);
    let stdout = '';
    k.stdout.on('data', d => {
      // Append raw chunks; trimming each chunk could fuse two PIDs together
      stdout += String(d || '');
    });

    // 'close' fires after the process has ended and its stdio streams are closed
    k.once('close', code => {

      // pgrep exits with 1 when nothing matched, which is normal for a leaf process
      if (code !== null && code > 1) {
        console.warn('The following command exited with non-zero code:', code, cmd);
      }

      const list = stdout.split(/\s+/).filter(Boolean);

      if (list.length < 1) {
        return cb(null);
      }

      for (const v of list) {
        pidList.push(v);
      }

      // Recurse into each child, at most 3 pgrep calls in flight at once
      async.eachLimit(list, 3, getMoreData, cb);

    });
  };

  getMoreData(String(pid), err => {
    cb(err, pidList);
  });

};
Alexander Mills