27

I am getting the strange error below in my Jenkins pipeline:

[Pipeline] withDockerContainer
acp-ci-ubuntu-test does not seem to be running inside a container
$ docker run -t -d -u 1002:1006 -u ubuntu --net=host -v /var/run/docker.sock:/var/run/docker.sock -v /home/ubuntu/.docker:/home/ubuntu/.docker -w /home/ubuntu/workspace/CD-acp-cassandra -v /home/ubuntu/workspace/CD-acp-cassandra:/home/ubuntu/workspace/CD-acp-cassandra:rw,z -v /home/ubuntu/workspace/CD-acp-cassandra@tmp:/home/ubuntu/workspace/CD-acp-cassandra@tmp:rw,z -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** quay.io/arubadevops/acp-build:ut-build cat
$ docker top 83d04d0a3a3f9785bdde3932f55dee36c079147eb655c1ee9d14f5b542f8fb44 -eo pid,comm
[Pipeline] {
[Pipeline] sh
process apparently never started in /home/ubuntu/workspace/CD-acp-cassandra@tmp/durable-70b242d1
(running Jenkins temporarily with -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true might make the problem clearer)
[Pipeline] }
$ docker stop --time=1 83d04d0a3a3f9785bdde3932f55dee36c079147eb655c1ee9d14f5b542f8fb44
$ docker rm -f 83d04d0a3a3f9785bdde3932f55dee36c079147eb655c1ee9d14f5b542f8fb44
[Pipeline] // withDockerContainer

The corresponding stage in the Jenkins pipeline is:


    stage("Build docker containers & coreupdate packages") {
        agent {
            docker {
                image "quay.io/arubadevops/acp-build:ut-build"
                label "acp-ci-ubuntu"
                args "-u ubuntu --net=host -v /var/run/docker.sock:/var/run/docker.sock -v $HOME/.docker:/home/ubuntu/.docker"
            }
        }
        steps {
            script {
                try {
                    sh "export CI_BUILD_NUMBER=${currentBuild.number}; cd docker; ./build.sh; cd ../test; ./build.sh;"
                    ciBuildStatus = "PASSED"
                } catch (err) {
                    ciBuildStatus = "FAILED"
                }
            }
        }
    }

What could be the reasons why the process is not getting started within the docker container? Any pointers on how to debug further are also helpful.

Gino Mempin
Firdousi Farozan
  • Had been facing the same issue with **Inject SSH keys** option configured to connect to the containers under Manage Jenkins > Configure System. Jenkins could connect to the Docker host and spawn a container but then couldn’t connect to the container. Surprisingly enough, this has been working in another older Jenkins instance. We updated the Dockerfile to create a user with the same username as the Docker host and copy the SSH keys in the `~/.ssh` directory of the container. Then switched to the other option that says Connect with SSH or something similar to make it work. – Dibakar Aditya Oct 12 '19 at 03:40
  • I am not using key forwarding. It is working on one slave, but on other slave, it always fails with this error. – Firdousi Farozan Oct 12 '19 at 18:33
  • This looks similar https://support.cloudbees.com/hc/en-us/articles/360029374071-Build-fails-with-process-apparently-never-started-error?mobile_site=true – Dibakar Aditya Oct 13 '19 at 19:10

9 Answers

12

This error means the Jenkins process is stuck on some command.

Some suggestions:

  • Upgrade all of your plugins and re-try.
  • Make sure you have enough executors and that jobs aren't stuck in the queue.
  • If you're pulling the image from a registry (rather than using a local one), try adding alwaysPull true on the line after image.
  • When using agent inside stage, remove the outer agent. See: JENKINS-63449.
  • Execute org.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true in Jenkins's Script Console to debug.
  • When the process is stuck, SSH to Jenkins VM and run docker ps to see which command is running.
  • Run docker ps -a to see the latest failed runs. In my case Jenkins appended cat to the custom entrypoint set by the image (resulting in e.g. ansible-playbook cat), which was an invalid command. The cat command is used by design. To change the entrypoint, please read JENKINS-51307.
  • If your container is still running, you can log in to it with docker exec -it -u0 $(docker ps -ql) bash and run ps wuax to see what it is doing.
  • Try removing some global variables (could be a bug), see: parallel jobs not starting with docker workflow.
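
The docker ps checks above can be wrapped in a small helper to run on the agent host. This is only a sketch: the function name `list_recent_containers` is made up for illustration, and it assumes the docker CLI is installed and the current user is allowed to talk to the daemon.

```shell
#!/bin/sh
# Sketch: list the most recently created containers on the agent host,
# including exited ones, with their full command line.

list_recent_containers() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found" >&2
        return 1
    fi
    # --last N shows the N most recently created containers in any state.
    # --no-trunc keeps the whole command, which is where an unexpected
    # "<entrypoint> cat" combination would show up.
    docker ps --last "${1:-5}" --no-trunc \
        --format '{{.ID}}  {{.Status}}  {{.Command}}'
}

# Usage on the agent host (not run automatically here):
#   list_recent_containers 10
```

A container whose status shows that it exited right after creation, together with an odd-looking command, points at the entrypoint case described in the list.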
kenorb
9

The issue is caused by some breaking changes introduced in the Jenkins durable-task plugin v1.31.

Source:

https://issues.jenkins-ci.org/browse/JENKINS-59907 and https://github.com/jenkinsci/durable-task-plugin/blob/master/CHANGELOG.md

Solution: Upgrading the Jenkins durable-task plugin to v1.33 resolved the issue for us.

cipher0
8

I had this same problem and in my case, it was related to the -u <user> arg passed to the agent. In the end, changing my pipeline to use -u root fixed the problem.


In the original post, I notice a -u ubuntu was used to run the container:

docker run -t -d -u 1002:1006 -u ubuntu ... -e ******** quay.io/arubadevops/acp-build:ut-build cat

I was also using a custom user, one I've added when building the Docker image.

agent {
  docker {
    image "app:latest"
    args "-u someuser"
    alwaysPull false
    reuseNode true
  }
}
steps {
  sh '''
    # DO STUFF
  '''
}

Starting the container locally using the same Jenkins commands works OK:

docker run -t -d -u 1000:1000 -u someuser app:image cat
docker top <hash> -eo pid,comm
docker exec -it <hash> ls  # DO STUFF

But in Jenkins, it fails with the same "process never started.." error:

$ docker run -t -d -u 1000:1000 -u someuser app:image cat
$ docker top <hash> -eo pid,comm
[Pipeline] {
[Pipeline] unstash
[Pipeline] sh
process apparently never started in /home/jenkins/agent/workspace/branch@tmp/durable-f5dfbb1c

For some reason, changing it to -u root worked.

agent {
  docker {
    image "app:latest"
    args "-u root"      // <=-----------
    alwaysPull false
    reuseNode true
  }
}
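
One possible explanation, offered as an assumption rather than something confirmed in this thread: the logged command contains both -u 1000:1000 (added by Jenkins) and -u someuser (from args). If the later flag is the one that takes effect, the sh step runs as someuser, who may not be able to create files in the workspace@tmp directory named in the error. A minimal sketch for testing that from inside the container; the function name `can_create_file` is hypothetical:

```shell
#!/bin/sh
# Sketch: report whether the current user can create a file in a directory.
# Intended to be run inside the container (for example through docker exec)
# against the "<workspace>@tmp" path shown in the Jenkins log.

can_create_file() {
    dir=$1
    probe="$dir/.write-probe-$$"
    # Create a throwaway file in a subshell so that a failed redirection
    # cannot terminate the calling shell; hide the shell's own error text.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        echo "writable"
    else
        echo "not writable"
        return 1
    fi
}

# Example (path copied from the log above; adjust it to your job):
#   can_create_file /home/jenkins/agent/workspace/branch@tmp
```

If this prints "not writable" for the custom user but "writable" for root, the difference between -u someuser and -u root is a file permission problem rather than anything specific to Jenkins.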
Gino Mempin
  • Seem to be having the exact same issue. `sh` commands not running when using custom user in docker container. Anyone know why this is the case on Jenkins? – Frank Podborski Aug 17 '22 at 09:24
  • @FrankPodborski I don't have the answer to your question. If you have your own question, please [ask a new/separate question](https://stackoverflow.com/questions/ask). It is unlikely for Jenkins experts to be browsing the comments section on answers looking for questions to answer. – Gino Mempin Aug 17 '22 at 12:08
6

If you have upgraded the durable-task plugin to 1.33 or later and it still doesn't work, check whether there is an empty environment variable configured in your pipeline or stored in the Jenkins configuration (Global Properties section, as in the screenshot), and remove it:

[Screenshot: Jenkins Configuration page, Global Properties section, showing an empty environment variable]
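
Besides checking the configuration page, empty variables can be spotted from a shell. The following is a rough sketch, assuming the variables actually reach the environment of a step that does start (for example a stage that does not use the Docker agent); the function name `list_empty_env_vars` is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the names of environment variables whose value is empty.

list_empty_env_vars() {
    # env prints NAME=value pairs, one per line. Keep only the lines that
    # end right after "=", then strip the "=" to leave the bare name.
    # Caveat: a multi-line value containing such a line is a false positive.
    env | grep -E '^[A-Za-z_][A-Za-z0-9_]*=$' | sed 's/=$//' | sort
}

# Usage:
#   list_empty_env_vars
```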

andref
4

In addition to kenorb's answer:

  • Check permissions inside the container you are running in and the Jenkins directory on the build host.

I am running custom Docker containers. After several hours of debugging, I reproduced what Jenkins was trying to execute inside the running container (by exec-ing into the container, running echo "$(ps waux)", and executing those sh -c commands one by one) and found that Jenkins couldn't create the log file inside the container due to a mismatch in UID and GID.
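
A UID/GID mismatch of this kind can be checked by comparing the numeric owner of the workspace directory with the user the container runs as. This is a sketch only; `owner_ids` and `check_owner` are hypothetical helper names, and the output format of ls -n is assumed to follow the usual POSIX layout.

```shell
#!/bin/sh
# Sketch: compare a directory's numeric owner with the current user.

owner_ids() {
    # ls -n prints numeric IDs: field 3 is the UID and field 4 the GID.
    ls -nd "$1" | awk '{ print $3 ":" $4 }'
}

check_owner() {
    dir=$1
    me="$(id -u):$(id -g)"
    owner=$(owner_ids "$dir")
    if [ "$owner" = "$me" ]; then
        echo "match: $dir is owned by $me"
    else
        echo "mismatch: $dir is owned by $owner, current user is $me"
    fi
}

# Usage inside the container, with the workspace path from your own log:
#   check_owner /path/to/workspace
```

Running the same check on the build host and inside the container shows whether both sides see the same numeric IDs for the mounted directory.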

Gino Mempin
Gakio
  • Same.. also hours of debugging. Turning on launch diagnostics as suggested in kenorb's answer led me to the log file issue. How did you fix it? – briantist Sep 21 '20 at 17:07
2

If you are running Jenkins inside of Docker and using a DinD container for Jenkins running Docker jobs, make sure you mount your Jenkins data volume to /var/jenkins_home in the service providing the Docker daemon. The log creation is actually being attempted by the daemon, which means the daemon container needs access to the volume with the workspace that is being operated on.

Example snippet for docker-compose.yml:

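services:
  dind:
    container_name: dind-for-jenkins
    privileged: true   # DinD needs privileged mode to run its own daemon
    image: docker:stable-dind
    volumes:
      # Same volume that the Jenkins service mounts, so the daemon can see the workspace
      - 'jenkins-data:/var/jenkins_home'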
Leoul
  • This source confirms it: https://docs.cloudbees.com/docs/admin-resources/latest/plugins/docker-workflow#:~:text=For%20inside%20to,with%20the%20agent. – Ictus Oct 14 '22 at 13:58
2

This has eaten my life! I tried every imaginable solution on at least 10 SO posts, and in the end it was because my pipeline had spaces in its name. :|

So I renamed it from "let's try scripting" to "scripts_try" and it just worked.
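
Since the job name usually ends up in the workspace path, a quick way to rule this cause in or out is to test the path for whitespace. A minimal sketch; the function name `has_whitespace` and the example paths are assumptions, only the two job names come from this answer.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) if the given path contains any whitespace.

has_whitespace() {
    case $1 in
        *[[:space:]]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage with hypothetical workspace paths:
#   has_whitespace "/var/lib/jenkins/workspace/let's try scripting" && echo "rename the job"
#   has_whitespace "/var/lib/jenkins/workspace/scripts_try" || echo "path is fine"
```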

Eugene
0

I was building a Jenkins job that runs within a Docker container and ran into this same error. The Durable Task plugin is at v1.35, so that was not the issue. My job was trying to run chmod -R 755 *.sh, and the active user within the container did not have sufficient permissions to run chmod on those files. I would have expected Jenkins to fail the job at that point, but launching the container with a user ID that did have permission to run chmod got past this error.

Dharman
NoobSkywalker
0

In my case the problem was related to using Kubernetes agents:

agent {
    kubernetes {
        cloud 'cloud'
        namespace 'namespace'
        yamlFile '.ci/build-pod.yaml'
    }
}

I needed to add runAsUser to the pod definition, so that Jenkins runs as a user that has the required permissions within the image:

apiVersion: v1
kind: Pod
spec:
  containers:
    - name: jnlp
      image: custom-build-image:latest
      args:
        - jenkins-slave
      tty: false
      workingDir: /home/jenkins
  securityContext:
    runAsUser: 1000
yurez