10

While dockerizing MLflow, only a `.trash` folder is getting created in the backend store, and because of that the MLflow UI shows the error "no experiments exists".

Dockerfile

FROM python:3.7.0

RUN pip install mlflow==1.0.0

WORKDIR /data

EXPOSE 5000

CMD mlflow server \
    --backend-store-uri /data/ \
    --default-artifact-root /data/ \
    --host 0.0.0.0

docker-compose:

  mlflow:
    # builds track_ml Dockerfile
    build:
      context: ./mlflow_dockerfile
    expose: 
      - "5000"
    ports:
      - "5000:5000"
    volumes: 
      - ./data:/data
Adiii
Akash Kumar

2 Answers

13

You can use this Dockerfile, taken from mlflow-workshop, which is more generic and supports different ENV variables for debugging and working with different versions.

By default it will store the artifacts and files inside /opt/mlflow. It's possible to define the following variables (defaults in parentheses):

MLFLOW_HOME (/opt/mlflow)
MLFLOW_VERSION (0.7.0)
SERVER_PORT (5000)
SERVER_HOST (0.0.0.0)
FILE_STORE (${MLFLOW_HOME}/fileStore)
ARTIFACT_STORE (${MLFLOW_HOME}/artifactStore)
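With the compose file from the question, these variables could be overridden at run time and the stores mounted to the host. This is only a sketch, assuming the same `./data` host folder and `mlflow_dockerfile` build context as in the question:

```
  mlflow:
    build:
      context: ./mlflow_dockerfile
    ports:
      - "5000:5000"
    environment:
      # point both stores at the mounted /data folder
      - FILE_STORE=/data/fileStore
      - ARTIFACT_STORE=/data/artifactStore
    volumes:
      - ./data:/data
```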

Dockerfile

FROM python:3.7.0
LABEL maintainer="Albert Franzi"

ENV MLFLOW_HOME /opt/mlflow
ENV MLFLOW_VERSION 0.7.0
ENV SERVER_PORT 5000
ENV SERVER_HOST 0.0.0.0
ENV FILE_STORE ${MLFLOW_HOME}/fileStore
ENV ARTIFACT_STORE ${MLFLOW_HOME}/artifactStore

RUN pip install mlflow==${MLFLOW_VERSION} && \
    mkdir -p ${MLFLOW_HOME}/scripts && \
    mkdir -p ${FILE_STORE} && \
    mkdir -p ${ARTIFACT_STORE}

COPY scripts/run.sh ${MLFLOW_HOME}/scripts/run.sh
RUN chmod +x ${MLFLOW_HOME}/scripts/run.sh

EXPOSE ${SERVER_PORT}/tcp

VOLUME ["${MLFLOW_HOME}/scripts/", "${FILE_STORE}", "${ARTIFACT_STORE}"]

WORKDIR ${MLFLOW_HOME}

ENTRYPOINT ["./scripts/run.sh"]

scripts/run.sh

#!/bin/sh

mlflow server \
    --file-store $FILE_STORE \
    --default-artifact-root $ARTIFACT_STORE \
    --host $SERVER_HOST \
    --port $SERVER_PORT

Launch MLFlow Tracking Docker

docker build -t my_mflow_image .
docker run -d -p 5000:5000 --name mlflow-tracking my_mflow_image

Run trainings

Since our MLflow Tracking container is exposed on port 5000, we can log executions against it by setting the env variable MLFLOW_TRACKING_URI.

MLFLOW_TRACKING_URI=http://localhost:5000 python example.py

Also, it is better to remove `- ./data:/data` on the first run and debug without the mount; with the suggested Dockerfile you might need to mount the different paths mentioned in the ENV variables, based on your needs.

Adiii
  • I tried the same approach but I am not able to see the machine learning model logs at http://localhost:5000/#/ – Akash Kumar Sep 06 '19 at 10:37
  • 1
    Run it in the foreground with `docker run -it -p 5000:5000 --name mlflow-tracking my_mflow_image` and then open the browser – Adiii Sep 06 '19 at 10:38
  • I am dockerizing it using a docker-compose yml file, as it contains other dockerized applications as well. docker-compose yml: `python_ml: # builds Python_ML Dockerfile build: ./python_ml ports: - "8181:8181" volumes: - ./data:/data mlflow: # builds track_ml Dockerfile build: ./mlflow_dock ports: - "5000:5000"` – Akash Kumar Sep 06 '19 at 10:40
  • Yes, that's fine with docker-compose, but I suggest running it with the above script first to make your debugging easier; verify it, and then run it with docker-compose – Adiii Sep 06 '19 at 10:41
  • I tried `docker run -it -p 5000:5000 --name mlflow-tracking XXXXXXXXX` and got the same issue. I can see the main page with the default experiment on localhost:5000 but I am not able to see any model logs, and I also can't find Artifact Location:/data/artifactStore/0 (the physical location after running the above script) – Akash Kumar Sep 06 '19 at 11:01
  • The artifacts are stored at `/opt/mlflow/artifactStore`, not at `/data/artifactStore/0` – Adiii Sep 06 '19 at 11:04
  • Yup, I tried that location too, and I can't see this folder being created anywhere in my root directory. I just tried a different location; it's not working. Only the mlflow service starts – Akash Kumar Sep 06 '19 at 11:09
  • 1
    It's important to understand that the artifacts are not proxied through the server but written directly from the client to the store. This means the client needs access to the artifact store. And in order for you to see the artifacts from the server's UI, the server also needs direct access to wherever you store them. – Simon Oct 25 '20 at 09:19
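Simon's point can be expressed in compose terms: mount the same host directory into both the training container and the tracking server, so the client writing the artifacts and the server UI reading them see the same paths. A sketch, reusing the service names and build contexts from the comments above:

```
  python_ml:
    build: ./python_ml
    volumes:
      - ./data:/data   # client writes artifacts here directly
  mlflow:
    build: ./mlflow_dock
    ports:
      - "5000:5000"
    volumes:
      - ./data:/data   # server UI reads the same artifacts
```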
0

Here is a link to GitHub where I put MLflow in a Docker container that uses Azurite in the background, so the models can also be pulled from it later.

As a short note: however you execute your script, you need to give it the address where it should save the artifacts. You can do this with .env files or set these values manually.

set MLFLOW_TRACKING_URI=http://localhost:5000

It is important to give this information not only to your Docker container but also to the script doing the model training ;)
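A minimal .env file for such a setup might look like the following sketch. Only `MLFLOW_TRACKING_URI` comes from this answer; the storage connection variable is a placeholder for whatever your Azurite setup expects:

```
MLFLOW_TRACKING_URI=http://localhost:5000
# connection details for the artifact store, e.g. Azurite (placeholder)
AZURE_STORAGE_CONNECTION_STRING=...
```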

Here you can find a complete tutorial on how to use MLflow and SKlearn together in different theoretical scenarios, since it is also a bit tricky later on.

I hope this gives you enough inspiration for how to use it.

heiko