My goal:

I have a built docker image and want to run all my Flows on that image.

Currently:

I have the following task, which runs on a Local Dask Executor. The server on which the agent runs has a different Python environment from the one needed to execute my_task, hence the need to run inside a pre-built image.

My question is: How do I run this Flow on a Dask Executor such that it runs on the docker image I provide (as environment)?

import prefect
from prefect import task, Flow
from prefect.engine.executors import LocalDaskExecutor
from prefect.environments import LocalEnvironment


@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello, Docker!")


with Flow("My Flow") as flow:
    results = hello_task()

flow.environment = LocalEnvironment(
    labels=[], executor=LocalDaskExecutor(scheduler="threads", num_workers=2),
)

I thought that I need to start the server and the agent on that docker image first (as discussed here), but I guess there can be a way to simply run the Flow on a provided image.

Update 1

Following this tutorial, I tried the following:

import prefect
from prefect import task, Flow
from prefect.engine.executors import LocalDaskExecutor
from prefect.environments import LocalEnvironment
from prefect.environments.storage import Docker


@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello, Docker!")


with Flow("My Flow") as flow:
    results = hello_task()

flow.storage = Docker(registry_url='registry.gitlab.com/my-repo/image-library')
flow.environment = LocalEnvironment(
    labels=[], executor=LocalDaskExecutor(scheduler="threads", num_workers=2),
)

flow.register(project_name="testing")

But this built a new image and uploaded it to the registry_url provided. Afterwards, when I tried to run the registered flow, it pulled the newly created image, and the run has now been stuck in the status Submitted for execution for several minutes.

I don't understand why it pushed an image and then pulled it. I already have an image built on this registry, and I'd like to specify that image as the one to use for task execution.

blong
Newskooler
  • The docs explain this in detail, e.g., [here is a short tutorial](https://docs.prefect.io/orchestration/tutorial/docker.html). If you are running Prefect Server yourself, you'll need to make sure that the Docker container has network access to your Server API. – chriswhite Oct 07 '20 at 15:51
  • I read this, but it mentions that the `registry_url` is for pushing to a registry (which I find confusing). "If you do specify a registry URL then the image will be pushed to a container registry upon flow registration." Is the `registry_url` the url of my image essentially (the one I would like to run)? – Newskooler Oct 07 '20 at 16:03
  • Docker images are typically stored in [Docker registries](https://docs.docker.com/registry/introduction/) - if you don't provide a `registry_url`, the built image will be kept locally on the machine on which it was built. – chriswhite Oct 07 '20 at 16:06
  • Yes, I have my images in a registry (GitLab in my case). Why do I need to provide a registry link and not the specific image link? – Newskooler Oct 07 '20 at 16:10
  • Because the image hasn't been built yet; if you want to specify both the image name and tag (instead of using Prefect's defaults), you can do so via the `image_name` and `image_tag` kwargs on `Docker` storage – chriswhite Oct 07 '20 at 16:17
  • I guess I am not following. I have a built image in a Docker registry, for example `registry.gitlab.com/my-repo/image-library:v1.3`, and I want to run my Flow inside this image. To do this I usually need to pull the image, run it, and then execute some command. I followed the steps in the link you suggested, and I don't understand why it creates a new image, adds it to my registry, and then pulls that new image (instead of the one I want). I will update the question now, for clarity. – Newskooler Oct 07 '20 at 17:25
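
To make the distinction in the comments concrete, here is a minimal sketch that splits a full image reference into the three pieces that `Docker` storage is said above to accept (`registry_url`, `image_name`, `image_tag`). The helper `split_image_ref` and the `ImageRef` type are hypothetical names introduced for illustration, not part of Prefect, and the split assumes the storage joins the pieces as `registry_url/image_name:image_tag`. Digest references (`@sha256:...`) are not handled.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    """Hypothetical container for the three parts of an image reference."""
    registry_url: str
    image_name: str
    image_tag: str


def split_image_ref(ref: str) -> ImageRef:
    """Split 'registry/path/name:tag' into (registry_url, image_name, image_tag).

    Assumption: everything before the last '/' is the registry URL.
    The tag is searched for only in the last path component, so a
    registry port such as 'localhost:5000' is not mistaken for a tag.
    """
    registry_url, _, name_and_tag = ref.rpartition("/")
    image_name, sep, image_tag = name_and_tag.partition(":")
    # Fall back to Docker's conventional default tag when none is given.
    return ImageRef(registry_url, image_name, image_tag if sep else "latest")


ref = split_image_ref("registry.gitlab.com/my-repo/image-library:v1.3")
print(ref.registry_url)  # registry.gitlab.com/my-repo
print(ref.image_name)    # image-library
print(ref.image_tag)     # v1.3
```

Passing these values only controls the name under which the storage builds and pushes its image. As the comments indicate, a build still happens at registration.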

1 Answer

The way I ended up achieving this is as follows:

  1. Run `prefect server start` directly on the server (i.e. not inside Docker). Apparently running docker-compose inside Docker is not a good idea.
  2. Run `prefect agent start` inside a container started from the Docker image.
  3. Make sure the flows are accessible from inside that container (for example, by mounting a volume shared between the container and the server).
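
Steps 2 and 3 could be sketched roughly as below: a small helper that assembles a `docker run` command which mounts a shared flow directory and starts the agent in the image. The helper name `build_agent_command` and the paths `/srv/flows` and `/flows` are hypothetical. The sketch only builds and prints the command, it does not run it, and it leaves out whatever network settings the container needs to reach the Server API.

```python
import shlex
from typing import List


def build_agent_command(image: str, host_flow_dir: str, container_flow_dir: str) -> List[str]:
    """Assemble a `docker run` argv that starts a Prefect agent in `image`.

    The paths are illustrative assumptions; adjust them to wherever the
    flows actually live on the host and inside the container.
    """
    return [
        "docker", "run", "--rm",
        # Step 3: share the flow directory between host and container.
        "-v", f"{host_flow_dir}:{container_flow_dir}",
        image,
        # Step 2: the agent command from the answer, run inside the container.
        "prefect", "agent", "start",
    ]


cmd = build_agent_command(
    "registry.gitlab.com/my-repo/image-library:v1.3",
    "/srv/flows",  # hypothetical host path
    "/flows",      # hypothetical container path
)
print(shlex.join(cmd))
```

Keeping the command as a list of arguments means it can be passed directly to `subprocess.run` without shell quoting once the paths and image are confirmed.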

You can see the source of my answer here.

Newskooler