
Now that Ubuntu 22.04 has been released, I did a clean install on one of our Jenkins workers to test it, but I can't get the Docker ssh-agent to work properly. It can no longer detect that it's running inside a container, so whenever a job that uses Docker is launched, the console shows "Jenkins-worker-X does not seem to be running inside a container" and the pipeline then fails.

I know from before that Jenkins uses cgroup information to detect whether it's running in a container: executing `cat /proc/self/cgroup` in a container should print a list of lines ending with `/docker/<container-id>`, which Jenkins then uses to identify the container. However, since I installed Ubuntu 22.04, the cgroup information no longer contains `/docker/<container-id>`, which makes the Jenkins agent think it's running on bare metal.

Even the official image has the same problem: running `docker run jenkins/ssh-agent:jdk11` followed by `docker exec <container-id> cat /proc/self/cgroup` prints a list without the container hashes on my machine.

How do I troubleshoot this? Has something changed from Ubuntu 21.10 to 22.04 that causes this problem? Is some extra configuration necessary?

I'm running the latest Ubuntu 22.04 (kernel 5.15.0-27-generic) with Docker version 20.10.12, build 20.10.12-0ubuntu4.

Any help would be appreciated!

EDIT: I have now realized that the same thing happens on 21.10 if you upgrade all packages to the latest version (and use the latest jenkins/ssh-agent image), so the cause might be one of the upgraded packages.

  • Is the Docker version the same as before? – Tony Yip May 13 '22 at 13:15
  • @TonyYip Yes, I'm running docker version 20.10.12 both before/after. The build on Ubuntu 22.04 is `20.10.12-0ubuntu4` rather than `20.10.12-0ubuntu2~21.10.1`, not sure if that makes any difference. – Sebastian Hjelm May 13 '22 at 13:30
  • Ubuntu 22.04 uses cgroup v2 instead of the cgroup v1 used in 20.04. I am not sure whether this is related to the issue, but it might be the reason. – Tony Yip May 13 '22 at 13:57
  • It works as expected on Ubuntu 21.10 and I think cgroup v2 was already the default by then, at least it was installed. There might have been other changes between 21.10 and 22.04 though. – Sebastian Hjelm May 13 '22 at 14:18
  • I tried upgrading one of my other machines (Ubuntu 21.10) to the latest available version, as well as pulling the latest ssh-agent image and now it's broken there as well. So it seems to be related to some package that was updated. Not docker though, I rolled it back and it made no difference. – Sebastian Hjelm May 13 '22 at 15:30

2 Answers

6

It turned out that the problem was related to cgroup v2 after all. It seems that with v2 the cgroup namespace is private by default when you create a container (in my case the Jenkins agents), so the container ID is not available in `/proc/self/cgroup`.
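To illustrate why the detection fails, here is a minimal Python sketch of this kind of check. It is not the Jenkins plugin's actual code: the function name, the regular expression, and both sample strings are my own assumptions, and it only handles the `/docker/<container-id>` layout described in the question (other path layouts exist and are not covered).

```python
import re
from typing import Optional

# Assumption: a container ID is 64 lowercase hex characters following "/docker/".
_DOCKER_ID = re.compile(r"/docker/([0-9a-f]{64})")


def find_container_id(cgroup_text: str) -> Optional[str]:
    """Return the first Docker container ID found in /proc/self/cgroup content, or None."""
    for line in cgroup_text.splitlines():
        match = _DOCKER_ID.search(line)
        if match:
            return match.group(1)
    return None


# Illustrative samples, not captured from a real host.
SAMPLE_ID = "0123456789abcdef" * 4
CGROUP_V1_SAMPLE = (
    f"12:memory:/docker/{SAMPLE_ID}\n"
    f"11:cpu,cpuacct:/docker/{SAMPLE_ID}\n"
)
# With cgroup v2 and a private cgroup namespace the process sees itself at the root.
CGROUP_V2_PRIVATE_SAMPLE = "0::/\n"

if __name__ == "__main__":
    print(find_container_id(CGROUP_V1_SAMPLE))          # the sample ID
    print(find_container_id(CGROUP_V2_PRIVATE_SAMPLE))  # None
```

With the private namespace there is simply nothing in the file for such a check to match, which is consistent with the "does not seem to be running inside a container" message.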

The easy solution is to run the Docker container with `--cgroupns host`, as suggested in another question here. Once I did that, Jenkins could again detect the container it's running in.

An update switching Ubuntu 21.10 to cgroup v2 was probably released around the time I posted the question, since I could later reproduce the issue there as well.
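If you want to check which cgroup version a host is using, one way is to look for the `cgroup.controllers` file, which is present at the root of a cgroup v2 (unified) hierarchy. A small sketch, assuming the usual mount point `/sys/fs/cgroup` (the function name is my own):

```python
from pathlib import Path


def uses_cgroup_v2(cgroup_root: str = "/sys/fs/cgroup") -> bool:
    """Return True if cgroup_root looks like a cgroup v2 (unified) mount."""
    # The root of a unified hierarchy contains a cgroup.controllers file.
    return (Path(cgroup_root) / "cgroup.controllers").exists()


if __name__ == "__main__":
    print("cgroup v2" if uses_cgroup_v2() else "cgroup v1 or hybrid")
```

Running this on a machine before and after a package upgrade would show whether the cgroup setup changed.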

  • Did you find a way to make it work without the `--cgroupns host` parameter? I cannot use it because it's not supported by Docker Compose yet (https://github.com/compose-spec/compose-spec/issues/148). – Chris Sep 08 '22 at 17:57
  • @Chris No, I found no other good solution. I did a rewrite of the implementation in the docker workflow plugin to build a custom version for our servers, but I never deployed it. There is a PR you can take inspiration from if you want to go that route: https://github.com/jenkinsci/docker-workflow-plugin/pull/241 – Sebastian Hjelm Sep 09 '22 at 09:20
1

If the Jenkins container is run with Docker Compose, you can set the `cgroup` option for the service to `host` in the compose file, which is the Compose counterpart of the parameter mentioned in the other answer: https://docs.docker.com/compose/compose-file/05-services/#cgroup

Alternatively, if you have control over the Docker daemon running Jenkins, you can set `default-cgroupns-mode` to `host` in your Docker daemon config. Note that this applies to all containers on the host, though.
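As a rough sketch of the daemon change, the helper below merges the setting into existing daemon config content without touching other keys. The function name and the example `log-driver` entry are only illustrative. It does not write any file: I'm assuming the config lives at `/etc/docker/daemon.json` (the usual location on Linux) and that the daemon has to be restarted afterwards for the change to take effect.

```python
import json


def with_host_cgroupns(daemon_json_text: str) -> str:
    """Return daemon.json content with default-cgroupns-mode set to host."""
    # An empty or missing file is treated as an empty config.
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    config["default-cgroupns-mode"] = "host"
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    # Example input only; use the real content of your daemon config here.
    print(with_host_cgroupns('{"log-driver": "json-file"}'), end="")
```

Merging rather than overwriting keeps whatever else is already configured for the daemon.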

Chris