6

For test purposes, I create and run an Azurite Docker image in a test pipeline. I would like the blob container to be created automatically after Azurite is started, as that would simplify things.

Is there any good way to achieve this?

For the Postgres image we use, we can specify an init.sql which is run on startup. If something similar is available for Azurite, that would be awesome.
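
For reference, with the official Postgres image that mechanism looks roughly like this (a simplified sketch):

services:
  postgres:
    image: postgres
    volumes:
      # SQL files in this directory are executed automatically on first startup
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql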

user3533716

2 Answers

3

I've solved the issue by creating a custom Docker image and executing the Azure CLI tools from a health check. There may well be better solutions, and I will update the accepted answer if someone posts one.

In more detail

A solution to create the required data on startup is to run my own script. I chose to trigger the script from a health check defined in docker-compose. The script uses the Azure CLI to create a container and then verify that it exists.
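
For illustration, the health check wiring in docker-compose could look roughly like this (the script path, timings, and retry count are examples, not the exact values from my setup):

services:
  azurite:
    build: .
    ports:
      - 10000:10000
    healthcheck:
      # The health check doubles as the init trigger: it creates the
      # container and reports healthy only once it exists.
      test: ["CMD", "sh", "/init_storage.sh"]
      interval: 5s
      timeout: 30s
      retries: 5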

The script:

#!/bin/sh
# Point the Azure CLI at the local Azurite emulator.
AZURE_STORAGE_CONNECTION_STRING="UseDevelopmentStorage=true"
export AZURE_STORAGE_CONNECTION_STRING

# Create the container (a no-op if it already exists), then verify it exists
# and propagate that status as the health check result.
az storage container create -n images
az storage container show -n images
exit $?

However, the Azurite image is based on Alpine, which doesn't have apt, so installing the Azure CLI there was a bit tricky. So I did it the other way around and based my image on mcr.microsoft.com/azure-cli:latest. With that done, I installed Azurite like this:

RUN apk add npm
RUN npm install -g azurite --silent

All that's left is to actually run Azurite; see the official Azurite Dockerfile for details.
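
Put together, the Dockerfile ends up looking roughly like this (the init script name is just an example, and the exact startup command should be taken from the official Dockerfile):

FROM mcr.microsoft.com/azure-cli:latest

# Azurite is a Node.js application, so install it via npm
RUN apk add npm && \
    npm install -g azurite --silent

# Script executed by the docker-compose health check (example name)
COPY init_storage.sh /init_storage.sh

# Start only the blob service on the default port
EXPOSE 10000
CMD ["azurite-blob", "--blobHost", "0.0.0.0", "--blobPort", "10000"]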

It is possible to do this without the Azure CLI and use curl instead (and thereby avoid basing the image on azure-cli). However, getting the authentication header right was a bit complicated, so using the Azure CLI was easier.

user3533716

3

You can use the following Dockerfile to install the azure-storage-blob Python package on the Alpine-based azurite image. The resulting image is ~400 MB, compared to ~1.2 GB for the azure-cli image.

ARG AZURITE_VERSION="3.17.0"
FROM mcr.microsoft.com/azure-storage/azurite:${AZURITE_VERSION}

# Install azure-storage-blob python package
RUN apk update && \
    apk --no-cache add py3-pip && \
    apk add --virtual=build gcc libffi-dev musl-dev python3-dev && \
    pip3 install --upgrade pip && \
    pip3 install azure-storage-blob==12.12.0

# Copy init_azurite.py script
COPY ./init_azurite.py init_azurite.py

# Copy local blobs to azurite
COPY ./init_containers init_containers

# Run the blob emulator and initialize the blob containers
CMD python3 init_azurite.py --directory=init_containers & \
    azurite-blob --blobHost 0.0.0.0 --blobPort 10000

The init_azurite.py script is a local Python script that uses the azure-storage-blob package to batch upload files and directories to the azurite blob storage emulator.

import argparse
import os
from time import sleep

from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient, ContainerClient


def upload_file(container_client: ContainerClient, source: str, dest: str) -> None:
    """
    Upload a single file to a path inside the container.
    """
    print(f"Uploading {source} to {dest}")
    with open(source, "rb") as data:
        try:
            container_client.upload_blob(name=dest, data=data)
        except ResourceExistsError:
            pass


def upload_dir(container_client: ContainerClient, source: str, dest: str) -> None:
    """
    Upload a directory to a path inside the container.
    """
    prefix = "" if dest == "" else dest + "/"
    prefix += os.path.basename(source) + "/"
    for root, dirs, files in os.walk(source):
        for name in files:
            dir_part = os.path.relpath(root, source)
            dir_part = "" if dir_part == "." else dir_part + "/"
            file_path = os.path.join(root, name)
            blob_path = prefix + dir_part + name
            upload_file(container_client, file_path, blob_path)


def init_containers(
    service_client: BlobServiceClient, containers_directory: str
) -> None:
    """
    Iterate on the containers directory and do the following:
    1- create the container.
    2- upload all folders and files to the container.
    """
    for container_name in os.listdir(containers_directory):
        container_path = os.path.join(containers_directory, container_name)
        if os.path.isdir(container_path):
            container_client = service_client.get_container_client(container_name)
            try:
                container_client.create_container()
            except ResourceExistsError:
                pass
            for blob in os.listdir(container_path):
                blob_path = os.path.join(container_path, blob)
                if os.path.isdir(blob_path):
                    upload_dir(container_client, blob_path, "")
                else:
                    upload_file(container_client, blob_path, blob)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Initialize azurite emulator containers."
    )
    parser.add_argument(
        "--directory",
        required=True,
        help="""
        Directory that contains subdirectories named after the
        containers that we should create. Each subdirectory will contain the
        files and directories of its container.
        """
    )

    args = parser.parse_args()

    # Connect to the localhost emulator (after 5 secs to make sure it's up).
    sleep(5)
    blob_service_client = BlobServiceClient(
        account_url="http://localhost:10000/devstoreaccount1",
        credential={
            "account_name": "devstoreaccount1",
            "account_key": (
                "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq"
                "/K1SZFPTOtr/KBHBeksoGMGw=="
            )
        }
    )

    # Only initialize if not already initialized.
    if next(blob_service_client.list_containers(), None):
        print("Emulator already has containers, will skip initialization.")
    else:
        init_containers(blob_service_client, args.directory)

This script is copied into the azurite image and populates the initial blob containers every time the azurite container starts, unless some containers were already persisted using Docker volumes; in that case, nothing happens.

Following is an example docker-compose.yml file:

services:
  azurite:
    build:
      context: ./
      dockerfile: Dockerfile
      args:
        AZURITE_VERSION: 3.17.0
    restart: on-failure
    ports:
      - 10000:10000
    volumes:
      - azurite-data:/opt/azurite

volumes:
  azurite-data:

Using such volumes will persist the emulator data until you destroy them (e.g. by using docker-compose down -v).
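
For example (typical commands, not specific to this setup):

# Rebuild the image and restart while keeping the persisted emulator data
docker-compose up -d --build

# Remove everything, including the named volume, so the containers are re-created on the next start
docker-compose down -v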

Finally, init_containers is a local directory that contains the containers and their folders/files. It will be copied to the azurite container when the image is built.

For example:

init_containers:
   container-name-1:
     dir-1:
       file.txt
       img.png
     dir-2:
       file.txt
   container-name-2:
     dir-1:
       file.txt
     img.png
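
To verify that the initialization worked, a small check like the following (a sketch reusing the well-known devstoreaccount1 credentials from the script above) can list the containers and blobs in the emulator:

from azure.storage.blob import BlobServiceClient

client = BlobServiceClient(
    account_url="http://localhost:10000/devstoreaccount1",
    credential={
        "account_name": "devstoreaccount1",
        "account_key": (
            "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq"
            "/K1SZFPTOtr/KBHBeksoGMGw=="
        ),
    },
)

# Print every container and the blobs it holds
for container in client.list_containers():
    print(container.name)
    for blob in client.get_container_client(container.name).list_blobs():
        print(" ", blob.name)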