I'm attempting a multi-stage container build to keep my image smaller. The offending package is numpy, which apparently doesn't play nicely with Alpine.

My error from numpy:

>>> import numpy 
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/numpy/core/__init__.py", line 22, in <module>
    from . import multiarray
  File "/opt/venv/lib/python3.8/site-packages/numpy/core/multiarray.py", line 12, in <module>
    from . import overrides
  File "/opt/venv/lib/python3.8/site-packages/numpy/core/overrides.py", line 7, in <module>
    from numpy.core._multiarray_umath import (
ImportError: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /opt/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/venv/lib/python3.8/site-packages/numpy/__init__.py", line 145, in <module>
    from . import core
  File "/opt/venv/lib/python3.8/site-packages/numpy/core/__init__.py", line 48, in <module>
    raise ImportError(msg)
ImportError: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.8 from "/opt/venv/bin/python"
  * The NumPy version is: "1.20.3"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /opt/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so)

Here is my Dockerfile:

FROM python:3.8 AS builder

RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential gcc

COPY requirements.txt .

ENV LANG=C.UTF-8 \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    VIRTUAL_ENV=/opt/venv \
    PATH="/opt/venv/bin:$PATH" \
    PIP_DISABLE_PIP_VERSION_CHECK=1

RUN python3 -m venv $VIRTUAL_ENV
RUN pip3 install --requirement requirements.txt


FROM python:3.8-alpine AS Production

RUN apk update && \
    apk add --no-cache libc6-compat libexecinfo-dev musl-dev g++ gfortran linux-headers && \
    python3 -m venv /opt/venv && \
    adduser -D worker

USER worker
WORKDIR /home/worker

ENV LANG=C.UTF-8 \
  PYTHONDONTWRITEBYTECODE=1 \
  PYTHONUNBUFFERED=1 \
  VIRTUAL_ENV=/opt/venv \
  PATH="/opt/venv/bin:$PATH" 

COPY --chown=worker:worker --from=builder /opt/venv /opt/venv
COPY --chown=worker:worker ./src /home/worker

CMD ["sleep", "100000"]
#  ENTRYPOINT ["gunicorn"]
#  CMD ["--bind", "0.0.0.0:8080", "--workers", "2", "myapp.__main__:app"]

I tried adding: apk add --no-cache libc6-compat libexecinfo-dev musl-dev g++ gfortran linux-headers, which I saw on a related SO question. They installed numpy directly in their Alpine image, but I'm copying numpy over from a build container, so it doesn't seem to help.

If I use python:3.8-slim instead of python:3.8-alpine it seems to work but the image is not as small. ld-linux-x86-64.so.2 is missing from the Alpine container but I cannot figure out how to get it or why it is not copied from the build image.

requirements.txt:

numpy==1.20.3
scipy==1.6.3
lwileczek

2 Answers

Looking at your Dockerfile, I would advise against Alpine and a multi-stage build here, and instead install pre-compiled wheels in a Debian-based Python image. The Dockerfile below installs your requirements without compiling anything, so the build is fast and the image is relatively small. The -slim image does not include build tools, which keeps it smaller than the full python:3.8 image.

FROM python:3.8-slim
ENV VIRTUAL_ENV="/opt/venv"
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
RUN python3 -m venv $VIRTUAL_ENV \
    && pip3 install --no-cache-dir \
        numpy==1.20.3 \
        scipy==1.6.3
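
To sanity-check an image built from a Dockerfile like the one above, a small smoke test can be run inside the container. This is only a sketch: the helper name check_imports is hypothetical, and it assumes the packages of interest are numpy and scipy, as in the question's requirements.txt.

```python
import importlib


def check_imports(names):
    """Try to import each module; map its name to None on success or to the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            # A missing shared library (as in the question) surfaces here too.
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Assumed package list, taken from the question's requirements.txt.
    for name, error in check_imports(["numpy", "scipy"]).items():
        status = "ok" if error is None else "FAILED: " + error
        print(name, status)
```

Saved under any filename and run with the image's python3, it prints one line per package, which makes a failed C-extension import visible without reading a full traceback.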

If you do insist on using Alpine, read on: the next paragraph covers building a multi-stage Alpine image. Be advised that the size advantage is modest. The Debian-based image above is 284 MB and the Alpine-based image below is 211 MB.

The problem is that you install numpy in a Debian-based image and then copy it into an Alpine image. Alpine uses musl libc, whereas Debian and most other Linux distributions use glibc, and binaries built against one are not compatible with the other. These versions of numpy and scipy do not ship pre-compiled wheels for musl, so if you want to use Alpine, you will have to compile them. I have included a minimal Dockerfile that shows how. The image can take over 20 minutes to build, because numpy and scipy must be compiled from source.

# Define these variables once and use throughout the dockerfile.
# This reduces chance of bugs...
ARG BASE_IMAGE="python:3.8-alpine"
ARG VIRTUAL_ENV="/opt/venv"

FROM $BASE_IMAGE AS builder
ARG VIRTUAL_ENV
ENV VIRTUAL_ENV=$VIRTUAL_ENV \
    PATH="$VIRTUAL_ENV/bin:$PATH"
RUN apk add --no-cache \
        build-base \
        gcc \
        gfortran \
        openblas-dev \
    && python3 -m venv $VIRTUAL_ENV \
    && pip3 install --no-cache-dir \
        numpy==1.20.3 \
        scipy==1.6.3

FROM $BASE_IMAGE AS production
ARG VIRTUAL_ENV
COPY --from=builder $VIRTUAL_ENV $VIRTUAL_ENV
ENV VIRTUAL_ENV=$VIRTUAL_ENV \
    PATH="$VIRTUAL_ENV/bin:$PATH"
RUN apk add --no-cache openblas
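
The glibc/musl split described above is also visible in wheel filenames: the manylinux platform tags target glibc, while the musllinux tags target musl. As a rough sketch, the hypothetical helper below classifies a wheel by its platform tag. The filenames in the usage lines are illustrative, and the function is not part of pip or any packaging library.

```python
def wheel_libc(filename: str) -> str:
    """Return 'glibc', 'musl', 'any', or 'other' based on a wheel's platform tag."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # Wheel names end with {python tag}-{abi tag}-{platform tag}.whl
    platform_tag = filename[: -len(".whl")].split("-")[-1]
    # A compressed tag set joins several platform tags with dots.
    tags = platform_tag.split(".")
    if any(tag.startswith("manylinux") for tag in tags):
        return "glibc"
    if any(tag.startswith("musllinux") for tag in tags):
        return "musl"
    if tags == ["any"]:
        return "any"
    return "other"


if __name__ == "__main__":
    # Illustrative filenames only.
    print(wheel_libc("numpy-1.20.3-cp38-cp38-manylinux2010_x86_64.whl"))
    print(wheel_libc("example_pkg-1.0-cp38-cp38-musllinux_1_1_x86_64.whl"))
```

A wheel classified as glibc is the kind that ends up in /opt/venv when pip runs in a Debian-based builder, which is why copying that venv into an Alpine stage fails at import time.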
jkr
  • Someone else going further on the troubles of transferring from Debian to Alpine: https://stackoverflow.com/questions/66963068/docker-alpine-executable-binary-not-found-even-if-in-path#answer-66974607 – lwileczek May 25 '21 at 21:48

This might be a linking problem, which often shows up as 'not found' in Alpine Linux when something is wrong with symlinks. When you build your numpy dependencies in a Debian-based distro, they are linked against glibc at a specific path. Similar paths usually exist in Alpine as well. I'm not sure how this works with venv.

Since you managed (I think) to install numpy directly in an Alpine-based container, I would suggest using Alpine for the builder stage as well. Otherwise you might need to change some linked paths manually.

For example, if you look in /lib64 in Alpine after installing the libc6-compat package, ld-linux-x86-64.so.2 should be there, so it is not totally missing.
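
One way to check this from inside the container is a short script that looks for the loader named in the error message. This is a minimal sketch: the helper name find_loader is hypothetical, and the candidate paths are assumptions for a typical x86-64 image. Finding the file does not by itself guarantee that extensions built against glibc will load correctly.

```python
import os

# Candidate locations for the glibc loader named in the error message.
# These paths are assumptions for a typical x86-64 image.
GLIBC_LOADER_PATHS = (
    "/lib64/ld-linux-x86-64.so.2",
    "/lib/ld-linux-x86-64.so.2",
)


def find_loader(paths=GLIBC_LOADER_PATHS):
    """Return the first path that exists (symlinks are followed), or None."""
    for path in paths:
        if os.path.exists(path):
            return path
    return None


if __name__ == "__main__":
    found = find_loader()
    print(found if found else "glibc loader not found")
```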

In general, though, Alpine is not the best choice for Python: the interpreter is written in C, and musl is not a perfect substitute for glibc. Alpine can also be much slower for Python.

Niklas