95

We use GitLab CI with shared runners for continuous integration. For each build, the runner downloads a large number of Maven artifacts.

Is there a way to configure GitLab CI to cache those artifacts, so that builds run faster and the same artifacts are not downloaded over and over again?

Stefan Birkner
helt
  • 3
    Maven has a local cache, usually under `$HOME/.m2/repository`; its location can be configured via `mvn -Dmaven.repo.local=Path`. – khmarbaise Jun 13 '16 at 09:53

8 Answers

123

GitLab CI lets you define paths whose contents should be cached between builds, either per job or for the whole build (see here for more details). Combined with khmarbaise's recommendation, this can be used to cache dependencies across builds.

An example that caches all job dependencies in your build:

cache:
  paths:
    - .m2/repository

variables:
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"

maven_job:
  script:
    - mvn clean install
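
The cache can also be declared per job rather than globally. A minimal sketch of that variant, assuming a relative repository path (resolved against the project directory where the job runs) and a per-branch cache key; the key choice is an illustrative assumption, not part of the configuration above:

```yaml
# Sketch only: the cache key is an assumption chosen for illustration.
variables:
  # Relative path, so the repository ends up inside the project directory.
  MAVEN_OPTS: "-Dmaven.repo.local=.m2/repository"

maven_job:
  cache:
    # One cache per branch; CI_COMMIT_REF_SLUG is a predefined GitLab CI variable.
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - .m2/repository
  script:
    - mvn clean install
```

Only paths inside the project directory can be cached, which is why the repository is placed under `.m2/repository` in the checkout rather than in `$HOME`.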
helt
andban
  • 12
    This didn't work for me until I changed `-Dmaven.repo.local=.m2` to `-Dmaven.repo.local=.m2/repository` – drakyoko Feb 04 '17 at 18:10
  • 53
    It's 2017 now and for the newcomers: `GitLab` maintains [a nice project][1] with sample configurations of their CI runner. The [sample Maven project][2] file demonstrates how to cache the maven artifacts. [1]: https://gitlab.com/gitlab-org/gitlab-ci-yml/tree/master [2]: https://gitlab.com/gitlab-org/gitlab-ci-yml/blob/master/Maven.gitlab-ci.yml – zloster Mar 28 '17 at 18:40
  • 4
    Is it possible to share cached artifacts between projects? – zygimantus Jul 04 '17 at 06:36
  • 14
    Update to @zloster links. Gitlab deprecated the project in those links. Updated link is https://gitlab.com/gitlab-org/gitlab-ce/tree/master/lib/gitlab/ci/templates – antonkronaj Oct 24 '18 at 15:23
  • 5
    Update to @antonkronaj link. Project 'gitlab-org/gitlab-ce' was moved to ['gitlab-org/gitlab-foss'](https://gitlab.com/gitlab-org/gitlab-foss/-/tree/master/lib/gitlab/ci/templates). – Jason Law May 13 '21 at 08:34
  • We used the definition with -Dmaven.repo.local=${CI_PROJECT_DIR}/.m2/repository. The problem is that sometimes (!) CI_PROJECT_DIR is not resolved (no matter whether you use curly brackets or not), which ends up in the path /.m2/repository, and you cannot cache anything outside of the project folder. Because of this we use -Dmaven.repo.local=.m2/repository, as the Maven commands are always executed within the project folder(s). – Timz Mar 02 '22 at 15:17
  • I tried the mentioned options in https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/lib/gitlab/ci/templates/Maven.gitlab-ci.yml and it worked perfectly. – Dany Wehbe Dec 14 '22 at 10:00
17

Following the conversation on GitLab's issue tracker, I changed the Maven local repository path to the ./.m2/repository/ directory, which we then persist between runs by adding this global block to the CI config:

cache:
  paths:
    - ./.m2/repository
  # one cache per branch; CI_BUILD_REF_NAME was renamed to CI_COMMIT_REF_NAME in GitLab 9.0
  key: "$CI_BUILD_REF_NAME"

Unfortunately, according to this Stack Overflow answer, the Maven local repository path can only be set per run with -Dmaven.repo.local or by editing your settings.xml, which is tedious to do in a GitLab CI configuration script. One option is to set a variable with the default Maven options and pass it to every run.

Also, it is crucial that the local Maven repository is a child of the current directory. For some reason, putting it in /cache or /builds didn't work for me, even though someone from GitLab claimed it should.

Example of a working .gitlab-ci.yml configuration file for Maven + Java:

image: maven:3-jdk-8

variables:
  MAVEN_OPTS: "-Djava.awt.headless=true -Dmaven.repo.local=./.m2/repository"
  MAVEN_CLI_OPTS: "--batch-mode --errors --fail-at-end --show-version"

cache:
  paths:
    - ./.m2/repository
  # one cache per branch; CI_BUILD_REF_NAME was renamed to CI_COMMIT_REF_NAME in GitLab 9.0
  key: "$CI_BUILD_REF_NAME"

stages:
  - build
  - test
  - deploy

build-job:
  stage: build
  script:
    - "mvn clean compile $MAVEN_CLI_OPTS"
  artifacts:
    paths:
      - target/

unittest-job:
  stage: test
  dependencies:
    - build-job
  script:
    - "mvn package $MAVEN_CLI_OPTS"
  artifacts:
    paths:
      - target/

integrationtest-job:
  stage: test
  dependencies:
    - build-job
  script:
    - "mvn verify $MAVEN_CLI_OPTS"
  artifacts:
    paths:
      - target/

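deploy-job:
  stage: deploy
  script:
    # placeholder command; GitLab CI requires every job to define a script
    - echo "Add your deploy command here"
  artifacts:
    paths:
      - "target/*.jar"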
Ondrej Skopek
11

The accepted answer didn't do it for me.

As zloster mentioned, the guys at GitLab have this amazing repository where you can find a proper example of the .gitlab-ci.yml file used for Maven projects.

Basically, what you need are these lines:

cache:
  paths:
    - .m2/repository

Keep in mind that if you add a job-level cache to a certain job, it replaces the global one defined above for that job. More on this here.
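
The override behaviour can be sketched as follows. The job names and the extra `node_modules/` path are hypothetical, and the `MAVEN_OPTS` line is taken from the other answers on the assumption that Maven must be pointed at the cached directory:

```yaml
variables:
  MAVEN_OPTS: "-Dmaven.repo.local=.m2/repository"

# Global cache: used by every job that does not define its own cache.
cache:
  paths:
    - .m2/repository

build:
  script:
    - mvn compile   # relies on the global cache

docs:
  # A job-level cache replaces the global definition for this job,
  # so .m2/repository is listed again to keep reusing Maven artifacts.
  cache:
    paths:
      - .m2/repository
      - node_modules/   # hypothetical extra path needed only by this job
  script:
    - mvn site
```

If the job-level block listed only `node_modules/`, the `docs` job would start with an empty Maven repository and download every dependency again.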

Ionut Ciuta
  • 1
    Thanks for pointing out that the local cache overwrites the global one. This caused the .m2 folder to be deleted on each run so that all dependencies will be downloaded again. – ThreeCheeseHigh Jun 09 '20 at 16:43
7

You can add a cache folder to the GitLab Runner configuration and pass it to Maven.

/etc/gitlab-runner/config.toml

[[runners]]
...
  [runners.docker]
  ...
   volumes = ["/cache", "/.m2"]
  ...

.gitlab-ci.yml

variables:
  MAVEN_OPTS: "-Dmaven.repo.local=/.m2"

build:
  script:
    - mvn package
Alexey
6

If you are using Kubernetes as the executor for gitlab-runner, you can also use the Maven cache. I chose to have a persistent cache on NFS with a k8s PV (other volume types are also supported by gitlab-runner). The following configuration doesn't use the GitLab cache feature, because NFS already provides persistence.

1) Create a PersistentVolume on your cluster, for example with NFS (adapt to your persistence layer and your options):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: gitlabrunner-nfs-volume
spec:
  capacity:
    storage: 10Gi
  mountOptions:
    - nolock
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /gitlabrunner
    server: 1.2.3.4
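
The runner configuration in the next step refers to a claim named pvc-1, which is not shown. A minimal sketch of a matching PersistentVolumeClaim, assuming static binding to the volume above; the namespace is a hypothetical value and should be the one your runner pods use:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-1              # the name referenced from the runner configuration
  namespace: gitlab        # assumption: namespace where the runner creates job pods
spec:
  storageClassName: ""     # empty string: bind to a pre-created PV, no dynamic provisioning
  volumeName: gitlabrunner-nfs-volume
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```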

2) Reference the PV through a claim (PVC) mounted as a volume in the runner pod:

[[runners.kubernetes.volumes.pvc]]
  name = "pvc-1"
  mount_path = "/path/to/mount/point1"

Note (03/09/18): A command-line option for these parameters doesn't exist yet. There is an open issue.

3) Specify the same path for the gitlab-runner cache:

[[runners]]
  executor = "kubernetes"
  # ...
  cache_dir = "/path/to/mount/point1"

or

--cache-dir "/path/to/mount/point1" in interactive mode

4) use the "/path/to/mount/point1" directory in the -Dmaven.repo.local option
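
Step 4 could look like this in .gitlab-ci.yml. This is a sketch: the job name is hypothetical, and the `.m2/repository` subdirectory under the mount point is an assumption made to keep Maven files separate from anything else stored on the volume:

```yaml
variables:
  # Points Maven at the NFS-backed mount configured for the runner pods.
  MAVEN_OPTS: "-Dmaven.repo.local=/path/to/mount/point1/.m2/repository"

build:
  script:
    - mvn package
```

No `cache:` block is needed here, since the volume itself persists between jobs.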

Nicolas Pepinster
4

I was able to use a host volume to share my .m2 repository directory. This had the added effect of sharing my settings.xml file as well (which not everyone may want). I found this to be faster than using the cache solutions mentioned.

[[runners]]
  [runners.docker]
    volumes = ["/home/<user>/.m2:/root/.m2"]
pez
2

There is another approach: skip the GitLab cache and use a custom (per-project) Docker image.

Some details:

First of all, you need to create a Maven Docker image that already contains all (or most) of the dependencies your project requires. Publish it to your registry (GitLab has one) and use it for any job running Maven.

To create such an image I usually add an additional, manually triggered job to the CI. You need to trigger it initially and whenever the project dependencies change significantly.

Working sample can be found here:

https://gitlab.com/alexej.vlasov/syncer/blob/master/.gitlab-ci.yml - this project is using the prepared image and also it has a job to prepare this image.

https://gitlab.com/alexej.vlasov/maven/blob/master/Dockerfile - dockerfile to run maven and download dependencies once.

The pros:

  • you don't need to download dependencies each time: they are inside a Docker image (and Docker layers are cached on the runners)
  • you don't need to upload a cache when the job is finished
  • no cache is downloaded in jobs that don't use Maven
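
Since the linked projects may not be reachable, here is a rough sketch of the idea rather than the author's actual configuration. It assumes a Dockerfile in the repository root that pre-downloads the dependencies, a runner that allows Docker-in-Docker, and a hypothetical image name `maven-deps`; the `CI_REGISTRY*` variables are predefined by GitLab when the container registry is enabled:

```yaml
stages:
  - prepare
  - build

# Manually triggered job that (re)builds the image containing the dependencies.
build-maven-image:
  stage: prepare
  image: docker:latest
  services:
    - docker:dind          # assumption: the runner permits Docker-in-Docker
  when: manual
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE/maven-deps:latest" .
    - docker push "$CI_REGISTRY_IMAGE/maven-deps:latest"

# Regular job that runs Maven inside the prepared image.
build:
  stage: build
  image: "$CI_REGISTRY_IMAGE/maven-deps:latest"
  script:
    - mvn package
```

The image has to exist before the `build` job can run, so the manual job needs to be triggered once up front.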
Alexey Vlasov
  • Hi Alexey, the links in your answer seem to be down. Could you share the content of these files in your answer? – Jodiug Mar 27 '20 at 13:00
  • 1
    @Jodiug It seems like some GitLab access issue. These projects are public and should be accessible to anyone. Can you try one more time? – Alexey Vlasov Mar 30 '20 at 07:48
  • The second link works now - the first one does not (404). Perhaps the repository has become private? – Jodiug Mar 30 '20 at 11:18
  • 1
    Nope, it's public. I have changed it to private and back to public. Please try now. @Jodiug – Alexey Vlasov Mar 31 '20 at 06:28
1

You don't have to declare MAVEN_OPTS in the variables section when you use the CI_PROJECT_DIR variable (the full path where the repository is cloned and where the job is run):

cache:
    key: maven-cache
    paths:
    - $CI_PROJECT_DIR/.m2/
yod