39

I'm experimenting with Linux namespaces, specifically the PID namespace.

I thought I'd test something out with bash but ran into this problem:

unshare -p /bin/bash
bash: fork: Cannot allocate memory

Running ls from there gave a core dump. The only thing I can do is exit.

Why is it doing that?

hookenz

2 Answers

59

The error occurs because the PID 1 process in the new namespace exits.

After bash starts, it forks several sub-processes to do its work. If you run unshare without -f, bash has the same PID as the current "unshare" process. That process calls the unshare system call and creates a new PID namespace, but it is not itself placed in the new namespace. This is the intended behavior of the Linux kernel: when process A creates a new PID namespace, A itself is not moved into it; only A's sub-processes are. So when you run:

unshare -p /bin/bash

The unshare process execs /bin/bash, and /bin/bash forks several sub-processes. The first sub-process of bash becomes PID 1 of the new namespace, and it exits once it has completed its job. So the PID 1 of the new namespace exits.

The PID 1 process has a special role: it becomes the parent of all orphaned processes. If the PID 1 process in the root namespace exits, the kernel panics. If the PID 1 process in a sub-namespace exits, the Linux kernel calls the disable_pid_allocation function, which clears the PIDNS_HASH_ADDING flag in that namespace. When the kernel creates a new process, it calls the alloc_pid function to allocate a PID in a namespace, and if the PIDNS_HASH_ADDING flag is not set, alloc_pid returns an -ENOMEM error. That's why you got the "Cannot allocate memory" error.
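The flag logic described above can be sketched as a toy model in Python. This is not kernel code: ToyPidNamespace, hash_adding and exit_process are hypothetical names that only mirror the roles of PIDNS_HASH_ADDING, alloc_pid and disable_pid_allocation as this answer describes them.

```python
import errno


class ToyPidNamespace:
    """Toy stand-in for a PID namespace (illustrative only, not kernel code)."""

    def __init__(self):
        self.hash_adding = True  # plays the role of PIDNS_HASH_ADDING
        self.next_pid = 1

    def alloc_pid(self):
        """Return a new PID, or -ENOMEM once allocation has been disabled."""
        if not self.hash_adding:
            return -errno.ENOMEM
        pid = self.next_pid
        self.next_pid += 1
        return pid

    def exit_process(self, pid):
        """When PID 1 exits, clear the flag (the disable_pid_allocation step)."""
        if pid == 1:
            self.hash_adding = False


if __name__ == "__main__":
    ns = ToyPidNamespace()
    first_child = ns.alloc_pid()   # bash's first sub-process becomes PID 1
    ns.exit_process(first_child)   # it finishes its job and exits
    print(ns.alloc_pid() == -errno.ENOMEM)  # every later fork now fails
```

In this model, as in the explanation, other processes exiting changes nothing; only the exit of PID 1 makes later allocations fail.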

You can resolve this issue by using the '-f' option:

unshare -fp /bin/bash

If you run unshare with the '-f' option, unshare forks a new process after it creates the new PID namespace, and runs /bin/bash in that new process. The new process is PID 1 of the new PID namespace. bash then forks several sub-processes to do its work, as before. Because bash itself is PID 1 of the new PID namespace, its sub-processes can exit without any problem.
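The unshare-then-fork sequence can be tried directly from Python. The following is a minimal sketch, assuming Linux and a libc that exports unshare(); the function name first_child_pid_after_unshare is hypothetical. It runs in a throwaway helper process so the calling process keeps its original PID namespace, and it returns None when the system call is not permitted (it normally needs root).

```python
import ctypes
import os

CLONE_NEWPID = 0x20000000  # value from <sched.h> on Linux


def first_child_pid_after_unshare():
    """Return the PID seen by the first child created after
    unshare(CLONE_NEWPID), or None if the call is unavailable or refused."""
    read_fd, write_fd = os.pipe()
    helper = os.fork()
    if helper == 0:
        # Helper process: plays the part of the "unshare" command.
        status = 1
        try:
            os.close(read_fd)
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWPID) == 0:
                child = os.fork()  # the step that -f adds
                if child == 0:
                    # First child: report the PID it sees for itself.
                    os.write(write_fd, str(os.getpid()).encode())
                    os._exit(0)
                os.waitpid(child, 0)
                status = 0
        finally:
            os._exit(status)
    # Original process: still in its old PID namespace.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        data = pipe.read()
    os.waitpid(helper, 0)
    return int(data) if data else None


if __name__ == "__main__":
    print(first_child_pid_after_unshare())
```

Where the call is permitted, the child should report itself as PID 1, in line with the behavior described above; without the needed privileges the function returns None.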

yupeng0921
  • To back this very helpful answer up with some man page references: [`man 2 unshare`](http://man7.org/linux/man-pages/man2/unshare.2.html) says about `CLONE_NEWPID`: _Unshare the PID namespace, so that the calling process has a new PID namespace for its children which is not shared with any previously existing process. The calling process is not moved into the new namespace. **The first child created by the calling process will have the process ID 1** and will assume the role of init(1) in the new namespace._ – nh2 Sep 19 '17 at 15:02
  • " It is the desired behavior of linux kernel: process A creates a new namespace, the process A itself won't be put into the new namespace, only the sub-processes of process A will be put into the new namespace." --- Isn't this statement true only for the PID namespace? Wouldn't it work with other namespaces like mount? – Shabirmean Nov 29 '18 at 03:52
  • @yupeng0921 "unshare -fp /bin/bash", it worked. But after that when I do `ps -e`, I could see all the processes from the host machine. As per the explanation, I should only see the unshare process with PID 1 and some bash processes running. But this is not the case. Could you please explain? – Amit Bhaira Mar 19 '21 at 09:10
  • @AmitBhaira Your `ps` probably works by reading `/proc`, and you still have your parent namespace's `/proc` mounted. Use `--mount-proc`. – David Aug 18 '22 at 15:37
14

This does not explain why this happens, but shows how to correctly launch a shell in a new pid namespace:

Use the -f flag to fork off the shell from unshare, so that the new shell gets PID 1 in the newly created namespace:

unshare -fp /bin/bash

You probably also want to pass the --mount-proc option, so that your ps listing reflects your newly created PID namespace rather than the parent PID namespace:

unshare -fp --mount-proc /bin/bash

Now run ps:

# ps
   PID TTY          TIME CMD
     1 pts/1    00:00:00 bash
    11 pts/1    00:00:00 ps
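To see why the mounted /proc matters, here is a minimal Python sketch that lists the PIDs visible under a proc mount. It assumes ps gets its process list from the numeric directories in /proc; the function name visible_pids and its proc_root parameter are hypothetical.

```python
import os


def visible_pids(proc_root="/proc"):
    """Return the sorted PIDs that have a directory under proc_root.

    Returns an empty list if proc_root cannot be read.
    """
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []
    return sorted(int(name) for name in entries if name.isdigit())


if __name__ == "__main__":
    print(visible_pids())
```

Run inside a shell started with unshare -fp but without --mount-proc, this would be expected to still list the parent namespace's processes, because the /proc mount is unchanged.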
hek2mgl