
We have a Maven Java project that we want to run in Jenkins BlueOcean pipelines.

I followed this tutorial, and the pipeline now runs our build. Yay!

However, every time the 'Build' stage of our pipeline executes under a new Jenkins run, it re-downloads all the Maven artifacts, which increases our build time considerably.

I start the 'jenkins-docker' container with:

docker container run --name jenkins-docker --rm --detach \
  --privileged --network jenkins --network-alias docker \
  --env DOCKER_TLS_CERTDIR=/certs \
  --volume jenkins-docker-certs:/certs/client \
  --volume jenkins-data:/var/jenkins_home \
  --publish 2376:2376 docker:dind

And the 'jenkins-blueocean' container with:

docker container run --name jenkins-blueocean --rm --detach \
  --network jenkins --env DOCKER_HOST=tcp://docker:2376 \
  --env DOCKER_CERT_PATH=/certs/client --env DOCKER_TLS_VERIFY=1 \
  --volume jenkins-data:/var/jenkins_home \
  --volume jenkins-docker-certs:/certs/client:ro \
  --publish 8080:8080 --publish 50000:50000 jenkinsci/blueocean

Then our Jenkinsfile pipeline is:

pipeline {
    agent {
        docker {
            image 'maven:3.6.3-jdk-8' 
            args '-v /root/.m2:/root/.m2'
        }
    }
    stages {
        stage('Build') {
            steps {
                sh 'mvn -B -DskipTests clean package'
            }
        }
        stage('Test') { 
            steps {
                sh 'mvn test' 
            }
            post {
                always {
                    junit 'target/surefire-reports/*.xml' 
                }
            }
        }
    }
}

Here, Jenkins launches a new container from the 'maven:3.6.3-jdk-8' Docker image to do the run. As I understand it, the args line also maps a volume to persist the .m2 directory.

Since my 'jenkins-docker' instance isn't shutting down across builds, I'd like to have this .m2 directory persisted. Then each successive run can leverage the cache of downloaded artifacts and not spend 5 minutes re-downloading them.

Is anyone able to offer any insight into what I'm doing wrong?

Thanks in advance

Cuga

2 Answers

1

Finally figured it out.

The Docker container that runs the build doesn't automatically have a usable user.home system property. As a result, each time the build executed, Maven stored the .m2 artifacts under a path that included a folder literally named ?. This is similar to what is described in this question.
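
One way to check whether a build has hit this is to look for a folder literally named ? that contains a .m2 directory. This is only a sketch: it assumes a POSIX shell with find, and that the stray folder ends up somewhere under the directory you pass in. The function name find_stray_m2 is made up for illustration.

```shell
# Print every ".m2" directory whose parent folder is literally named "?".
# Assumption: the stray repository lives somewhere under the root passed as $1.
find_stray_m2() {
  # '\?' escapes the glob character so only the literal name "?" matches.
  find "$1" -type d -path '*/\?/.m2'
}
```

Inside a pipeline step this could be called as find_stray_m2 "$WORKSPACE". Any output suggests the JVM could not resolve a real home directory.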

From the Jenkins tutorial, I was able to figure out that the Maven Docker images expect to download artifacts to /root/.m2. This suggests the Maven Docker images use /root as their home directory.

I updated my Jenkinsfile to include this directive at the top so the Maven Docker container would use its de facto home directory correctly.

environment {
  JAVA_TOOL_OPTIONS = '-Duser.home=/root'
}
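
To confirm the setting took effect, you can ask the JVM which home directory it resolved. The helper below is a sketch: it assumes java is on the PATH inside the build container and that -XshowSettings:properties prints lines of the form "user.home = <path>". The name jvm_user_home is illustrative.

```shell
# Read `java -XshowSettings:properties` output on stdin and print the
# value of user.home. Lines that don't match are dropped.
jvm_user_home() {
  sed -n 's/^[[:space:]]*user\.home = //p'
}

# Usage inside the build container (the settings are printed on stderr):
#   java -XshowSettings:properties -version 2>&1 | jvm_user_home
```

With JAVA_TOOL_OPTIONS set as above, this would be expected to print /root. Without it, it may print ?.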

Then I struggled with permissions issues while trying to get the volume mapping right. Per the Jenkins setup, I had already created a volume, jenkins-data, which is available in my 'host' Jenkins containers as /var/jenkins_home.

Everything finally worked when I mapped this volume directly to the /root directory in my Maven docker image. This is a departure from the Jenkins tutorial setup, which showed this as being mapped to /root/.m2.

In summary

This is my working Jenkinsfile:

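pipeline {
    // Point the JVM at /root so Maven resolves ~/.m2 under the mounted volume.
    environment {
        JAVA_TOOL_OPTIONS = '-Duser.home=/root'
    }
    agent {
        docker {
            image 'maven:3.6.3-jdk-8'
            // Map the Jenkins data directory onto the container's home directory.
            args '-v /var/jenkins_home:/root'
        }
    }
    stages {
        stage('Build') {
            steps {
                sh 'mvn -B -DskipTests clean compile'
            }
        }
        stage('Test') {
            steps {
                sh 'mvn -X test'
                jacoco() // requires the JaCoCo plugin
            }
            post {
                always {
                    junit 'target/surefire-reports/*.xml'
                }
            }
        }
    }
}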
Cuga
  • So how do you make sure that the same local repository is not used concurrently by two different builds? – J Fabian Meier Feb 26 '20 at 14:55
  • I don't understand where this concern is coming from. AFAIK, Maven doesn't have a problem with this. This ticket suggests the same, resolved 11 years ago https://issues.apache.org/jira/browse/MNG-3379 – Cuga Feb 26 '20 at 16:27
  • Also, if it was a problem, we could have Jenkins execute only one job at a time. In that case, we'd still want to cache the .m2 directory so it doesn't continually re-download the artifacts. Persisting the Maven repository would be all the more important in that case. – Cuga Feb 26 '20 at 16:50
  • Sorry, I need to downvote. I cannot recommend this approach. The ticket you mentioned is about parallelization inside _one_ build. I do not understand why you think it should be fine if two independent Maven runs modify the same SNAPSHOTs and XML files in the local repository without any synchronisation. – J Fabian Meier Feb 26 '20 at 16:54
  • You can solve your problem by allowing Jenkins at most one build at a time. But you need to see whether this is a feasible solution for you. – J Fabian Meier Feb 26 '20 at 16:55
  • I feel it's important to point out that this approach is recommended by the Jenkins-published tutorial on how to set up Maven builds with Java, as mentioned in the original post. I think there's also a misunderstanding of how this works. Mapping the volume lets the repository persist after my ephemeral Docker container exits. This gives it the same behavior as if it was run on a regular non-Docker server. I have no intention of restricting my Jenkins workers because there's been no evidence brought forth to counter the Jenkins-recommended approach I followed here. – Cuga Feb 26 '20 at 17:05
  • Also @JFMeier I didn't mean to suggest restricting Jenkins to one executor was a recommended solution. While it would obviously work, I was using it as an example to stress the importance of mapping the volume (because in that case it would be all the more important). – Cuga Feb 26 '20 at 17:23
  • Also if you build multiple branches with the same maven coordinates you risk using an artifact from a different branch than the one you are at. Usually not a good idea. – Thorbjørn Ravn Andersen Feb 26 '20 at 21:22
  • Jenkins's official documentation says to do it this way. https://jenkins.io/doc/tutorials/build-a-java-app-with-maven/#create-your-initial-pipeline-as-a-jenkinsfile. There's been no evidence shown to contradict them. Rather than make accusations, it'd be productive if you posted a complete answer showing exactly how to run a Jenkins blue ocean pipeline build to do its work on a maven:3.6.3-jdk-8 docker image. – Cuga Feb 27 '20 at 11:43
-1

EDIT: I read the question as asking how to have production Jenkins Maven builds avoid downloading artifacts by sharing the .m2 folder. The question is actually about getting a tutorial on running Maven in a local Docker instance to work. Not quite the same.

---

Several solutions exist. Do not share/persist the .m2 folder; Maven is not thread safe (at least the last time I really looked into this; it might be fixed by now).

Simplest is to set up a Maven mirror repository (Nexus works well for this) and tell your docker Maven build to use that as a mirror.
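
As a rough sketch of the mirror approach, the function below writes a minimal Maven settings file that routes all repository requests through one mirror. The function name, the mirror id internal-mirror, and the example URL are all hypothetical. Only the settings.xml element names and the mvn -s flag come from Maven itself.

```shell
# Write a minimal settings.xml that sends every repository request to one mirror.
# $1 = mirror URL (hypothetical, e.g. a Nexus group repository), $2 = output file.
write_mirror_settings() {
  mirror_url="$1"
  out_file="$2"
  cat > "$out_file" <<EOF
<settings>
  <mirrors>
    <mirror>
      <id>internal-mirror</id>
      <mirrorOf>*</mirrorOf>
      <url>${mirror_url}</url>
    </mirror>
  </mirrors>
</settings>
EOF
}

# Usage (the URL is a placeholder, not a real server):
#   write_mirror_settings "https://nexus.example.com/repository/maven-public/" mirror-settings.xml
#   mvn -B -s mirror-settings.xml -DskipTests clean package
```

Each build still downloads its artifacts, but from a nearby mirror rather than from Maven Central.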

Thorbjørn Ravn Andersen
  • This still means that you need to download all the artifacts from the mirror. – J Fabian Meier Feb 25 '20 at 15:54
  • @JFMeier Yes. Simplest. More ingenious solutions exist. This is one of the ways Maven shows its age. – Thorbjørn Ravn Andersen Feb 25 '20 at 15:56
  • I think you can also attach a local repository to every agent and reuse it (so it is at most used by one build at a time). BTW I did not downvote your answer. – J Fabian Meier Feb 25 '20 at 16:13
  • @JFMeier Yes, but you have to create it first and then it is not the simplest solution any more. Regarding downvotes - I don't really mind if the drive-by downvoters do not leave a comment about what they didn't like, as the information is still correct. – Thorbjørn Ravn Andersen Feb 25 '20 at 22:27
  • This should be a comment, not an answer. Citations for the claim that the .m2 repository isn't threadsafe would be appreciated. As I understand it, the .m2 cache is often used in a manner to store dependencies on build servers. The jenkins tutorial also suggests this. – Cuga Feb 26 '20 at 11:37
  • @Cuga See also https://stackoverflow.com/q/45299202/927493 . I also know from personal experience that the local repository is not thread safe. This is also quite clear if you think about it because the local repository is just a directory with files, there are no transactions. – J Fabian Meier Feb 26 '20 at 14:49
  • The comments on that SO post seem to indicate that 2,000 separate builds happening concurrently is atypical usage, and that likely something else is afoot. – Cuga Feb 26 '20 at 16:31
  • @cuga Let's turn that upside down. What mechanisms are in place inside the Maven binary you use to ensure that multiple builds accessing the same file system are not modifying files simultaneously? – Thorbjørn Ravn Andersen Feb 26 '20 at 16:35
  • Rather than asking another question, it'd be helpful to provide an answer. My view is based on how I understand people to typically use Jenkins build servers / Maven, my personal experience with them, the tutorial published by Jenkins which recommends mapping the host .m2 directory to the container .m2 directory. – Cuga Feb 26 '20 at 16:47
  • @Cuga I would also be interested in how you answer this question. If two processes read/write on the same directories simultaneously, you have a potential concurrency problem. This is exactly what happens if you run two Maven builds with the same local repository at the same time. – J Fabian Meier Feb 26 '20 at 17:58
  • @Cuga You have completed a "create-your-initial-pipeline-as-a-jenkinsfile" tutorial for running Maven in a local docker container just for you (and your problem with making that work was the reason for this question) and you are using that as the gospel on how to create a production quality Jenkins build server? Even though they explicitly say that "Explaining the details behind this is beyond the scope of this tutorial."? Running Jenkins as yourself _one instance at a time_ explicitly avoids the race condition that is being mentioned here! – Thorbjørn Ravn Andersen Feb 26 '20 at 22:24
  • I did exactly what Jenkins recommends doing in their official tutorial to run Maven Java builds. Check it out for yourself. You're the one giving misdirection. In it, they provide a persisted volume to the inner docker container just as I'm doing. I have no intention of running just one Jenkins job at a time, that was intended as a joke. I've seen no evidence to contradict the official Jenkins docs to set it up this way. Not to mention, it's been working great. I promise to update this if there's any problems. But so far, I'm convinced it's the proper approach. – Cuga Feb 27 '20 at 11:40
  • @cuga Good luck with that then. – Thorbjørn Ravn Andersen Feb 27 '20 at 12:03
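
For readers weighing the concurrency concern raised in the comments, one possible middle ground is to give each Jenkins executor its own persisted local repository. This is a sketch, not something either answer tested. It assumes Jenkins sets EXECUTOR_NUMBER for the build and that the base directory sits on a persisted volume. The function name repo_for_executor is made up. maven.repo.local is a standard Maven property.

```shell
# Build a per-executor local repository path so two concurrent builds
# on the same agent never write to the same directory.
# $1 = base directory (defaults to $HOME/.m2); falls back to executor 0
# when EXECUTOR_NUMBER is not set (e.g. outside Jenkins).
repo_for_executor() {
  base="${1:-$HOME/.m2}"
  printf '%s/repository-executor-%s\n' "$base" "${EXECUTOR_NUMBER:-0}"
}

# Usage in a pipeline sh step:
#   mvn -B -Dmaven.repo.local="$(repo_for_executor /root/.m2)" -DskipTests clean package
```

The trade-off is that each executor downloads the artifacts once and uses more disk space. This only separates executors on the same machine, so it does not help if several agents share one volume.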