
I was really happy when I first discovered Docker. But I keep running into an issue: an image builds successfully at first, but when I try to rebuild it a few months later the build fails, even though I made no changes to the Dockerfile or docker-compose.yml.

I suspect that external dependencies (in this case, a mariadb package needed by cron) become inaccessible over time, thereby breaking the Docker build. Has anyone else encountered this problem? Is this a common problem with Docker? How do I get around it?

Here's the error message:

E: Failed to fetch http://deb.debian.org/debian/pool/main/m/mariadb-10.3/mariadb-common_10.3.29-0+deb10u1_all.deb  404  Not Found [IP: 151.101.54.132 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
Fetched 17.6 MB in 0s (48.6 MB/s)
ERROR: Service 'python_etls' failed to build: The command '/bin/sh -c apt-get install cron -y' returned a non-zero code: 100

This is my docker-compose.yml:

## this docker-compose file is for development/local environment
version: '3'

services:
  python_etls:
    container_name: python_etls
    restart: always
    build: 
        context: ./python_etls
        dockerfile: Dockerfile

This is my Dockerfile:

#FROM python:3.8-slim-buster  # debian gnutls28 fetch doesn't work in EC2 machine
FROM python:3.9-slim-buster

RUN apt-get update


## install nano
RUN apt-get install nano -y

## define working directory
ENV CONTAINER_HOME=/home/projects/python_etls

## create working directory
RUN mkdir -p $CONTAINER_HOME

## create directory to save logs
RUN mkdir -p $CONTAINER_HOME/logs

## set working directory
WORKDIR $CONTAINER_HOME

## copy source code into the container
COPY . $CONTAINER_HOME



## install python modules through pip
RUN pip install snowflake snowflake-sqlalchemy sqlalchemy pymongo dnspython numpy pandas python-dotenv xmltodict appstoreconnect boto3
# pip here is /usr/local/bin/pip, as seen when running 'which pip' in command line, and installs python packages for /usr/local/bin/python
# https://stackoverflow.com/questions/45513879/trouble-running-python-script-cron-import-error-no-module-named-tweepy

## changing timezone
ENV TZ=America/Los_Angeles
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone



## scheduling with cron
## reference: https://stackoverflow.com/questions/37458287/how-to-run-a-cron-job-inside-a-docker-container

# install cron
RUN apt-get install cron -y

# Copy crontable file to the cron.d directory
COPY crontable /etc/cron.d/crontable

# Give execution rights on the cron job
RUN chmod 0644 /etc/cron.d/crontable

# Apply cron job
RUN crontab /etc/cron.d/crontable

# Create the log file to be able to run tail
#RUN touch /var/log/cron.log

# Run the command on container startup
#CMD cron && tail -f /var/log/cron.log
CMD ["cron", "-f"]

EDIT: Changing the Docker base image from python:3.9-slim-buster to python:3.10-slim-buster in the Dockerfile allowed a successful build. However, I'm still stumped as to why the build broke in the first place, and what I should do to mitigate this problem in the future.

Howard S
    You must `RUN apt-get update && apt-get install ...` in a single command; otherwise Docker's image caching will cache the (old) `update` results and fail to install packages whose versions have changed. [In Docker, apt-get install fails with "Failed to fetch http://archive.ubuntu.com/ ... 404 Not Found" errors. Why? How can we get past it?](https://stackoverflow.com/questions/37706635/in-docker-apt-get-install-fails-with-failed-to-fetch-http-archive-ubuntu-com) describes this a little more. – David Maze Dec 21 '21 at 02:38
  • @DavidMaze Wow, that helped a lot! It totally works with `RUN apt-get update && apt-get install ...`. Thank you so much! – Howard S Dec 21 '21 at 05:56
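For reference, the fix described in the comment looks like this, with both commands in a single `RUN` so Docker can never cache an old `apt-get update` layer separately from the install (package names taken from the Dockerfile above; the list cleanup at the end is an optional image-size optimisation, not part of the fix):

```dockerfile
# One layer: the package index and the install are always cached (or
# rebuilt) together, so the index can never be stale relative to the
# packages being fetched.
RUN apt-get update && \
    apt-get install -y nano cron && \
    rm -rf /var/lib/apt/lists/*
```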

1 Answer


You may have the misconception that Docker will generate the same image every time you build from the same Dockerfile. It doesn't, and reproducibility is not a problem Docker tries to solve.

When you build an image, you most likely reach out to external resources: packages, repositories, registries, etc. All of these could be reachable today but missing tomorrow. A security update may be released, and some old packages deleted as a consequence. Package managers may request the newest version of a specific library, so the release of an update will affect the reproducibility of your image.

You can minimise this effect by pinning the versions of as many dependencies as you can. There are also tools, like Nix, that let you achieve full reproducibility. If that's important to you, I'd point you in that direction.
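As a sketch of what pinning looks like with apt (the version string here is illustrative, not a real current version; a pinned version that disappears from the mirror will fail loudly and immediately, rather than months later):

```dockerfile
# Pin an exact package version: apt installs precisely this build of cron
# or fails with a clear "version not found" error.
RUN apt-get update && \
    apt-get install -y cron=3.0pl1-134+deb10u1
```

The same idea applies to the base image: `FROM python:3.9-slim-buster@sha256:...` (with the real digest) pins the exact image instead of whatever the tag currently points to.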

Federkun