I am trying to build and run my Docker image using Gitlab CI/CD, but there is one issue I can't fix even though locally everything works well.

Here's my Dockerfile:

FROM <internal_docker_repo_image>
RUN apt update && \
    apt install --no-install-recommends -y build-essential gcc
COPY requirements.txt /requirements.txt

RUN pip install --no-cache-dir --user -r /requirements.txt

COPY . /src
WORKDIR /src
ENTRYPOINT ["python", "-m", "dvc", "repro"]

This is how I run the container:

docker run --volume ${PWD}:/src --env=GOOGLE_APPLICATION_CREDENTIALS=<path_to_json> <image_name> ./dvc_configs/free/dvc.yaml --force

Everything works when I run this locally, but it fails on GitLab CI/CD with the following configuration:

stages:
  - build_image

build_image:
  stage: build_image
  image: <internal_docker_repo_image>
  script:
    - echo "Building Docker image..."
    - mkdir ~/.docker
    - cat $GOOGLE_CREDENTIALS > ${CI_PROJECT_DIR}/key.json
    - docker build . -t <image_name>
    - docker run --volume ${PWD}:/src --env=GOOGLE_APPLICATION_CREDENTIALS=<path_to_json> <image_name> ./dvc_configs/free/dvc.yaml --force
  artifacts:
    paths:
      - "./data/*csv"
    expire_in: 1 week

This results in the following error: ERROR: you are not inside of a DVC repository (checked up to mount point '/src')

In case you don't know DVC: it is a tool used in machine learning for versioning models, datasets, and metrics, and for defining pipelines, which is what I use it for here.

Essentially, it requires two folders .dvc and .git in the directory from which dvc repro is executed.

In this particular case, I have no idea why it's not able to run this command given that the contents of the folders are exactly the same and both .dvc and .git exist.

Thanks in advance!

Don Draper

1 Answer

Your COPY . /src is problematic for the same reason as in "Hidden file .env not copied using Docker COPY": hidden files and folders can be excluded by .dockerignore. You probably need !.dvc in your .dockerignore.

Additionally, docker run --volume ${PWD}:/src mounts $PWD over the container's /src and hides whatever COPY put there, so $PWD itself needs to contain .git, .dvc, etc. You don't seem to have cloned a repo before running these script commands.
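
To see whether the directory you are about to mount really contains what DVC looks for, a small check like the one below can help. This is only a sketch: check_dvc_repo is a hypothetical helper name (not part of DVC or Docker), and it assumes the only requirement is that .dvc and .git exist as directories in the given path.

```shell
#!/bin/sh
# check_dvc_repo DIR
# Hypothetical helper: reports whether DIR (default: $PWD) contains the
# .dvc and .git directories. Returns 0 if both exist, 1 otherwise.
check_dvc_repo() {
    dir="${1:-$PWD}"
    rc=0
    for name in .dvc .git; do
        if [ -d "$dir/$name" ]; then
            echo "found: $dir/$name"
        else
            echo "missing: $dir/$name"
            rc=1
        fi
    done
    return "$rc"
}
```

You could call check_dvc_repo "$PWD" in the CI script right before docker run. To compare that with what the container actually sees at the mount point, something like docker run --rm --volume ${PWD}:/src --entrypoint ls <image_name> -a /src should list the contents of /src from inside the container; if the two listings differ, the path you pass to --volume is probably not the directory the Docker daemon ends up mounting.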

casper.dcl
  • This doesn't help, unfortunately. The strange thing is that the same Dockerfile and `docker run` yield expected results without Gitlab CI/CD, which I don't think would be the case if the issue was related to the `dockerignore` file. If I change this part `--volume ${PWD}:/src` to anything else (except `src`), the pipeline succeeds. – Don Draper May 12 '21 at 15:48
  • Thanks! This makes sense, but why does it work for the `docker run` without Gitlab CI/CD. The same command, different results. – Don Draper May 12 '21 at 16:32
  • My understanding is that `${PWD}` has all the files pushed to the repo so after mounting we effectively access this directory whenever `/src` is used inside the container. I might misunderstand how it really works of course, but the question remains: why does it work outside CI/CD? – Don Draper May 12 '21 at 16:38
  • @DonDraper I doubt the files are present in the CI/CD job, since I don't see any clone step in your config, so I don't know what is or isn't included. You could try `ls -a`. – casper.dcl May 12 '21 at 17:36
  • @casper.dcl, `ls -a` shows that the current directory has all the necessary folders and files. – Don Draper May 12 '21 at 18:11