I need to create a batch-processing, data-intensive application. I have created synthetic data, which is saved in the "data" folder, and after running my script in "data_ingestion" I also have the database in "db". The application runs with Flask. I have written microservice Python scripts for "data_ingestion", "data_processing" and "data_aggregation". I then created a Dockerfile for data_aggregation, and that image was built without any issues. When I do the same for data_ingestion, I get a "file not found" error.
Dockerfile:
# Use an official Python runtime as the base image
FROM python:3.9
# Set the working directory in the container
WORKDIR /data2
# Copy the CSV file and the database file to the container
COPY ./data/financial_data.csv ./data/
COPY ./db/financial_data.db ./db/
COPY ./data_ingestion.py .
# Expose port 5000 (or any other port your Flask app is listening on)
EXPOSE 5000
# Run the Flask app when the container launches
CMD ["python", "data_ingestion.py"]`
Error:
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.1s
=> => transferring dockerfile: 511B 0.1s
=> [internal] load metadata for docker.io/library/python:3.9 1.4s
=> [1/5] FROM docker.io/library/python:3.9@sha256:9ba 0.4s
=> => resolve docker.io/library/python:3.9@sha256 0.4s
=> [internal] load build context 0.1s
=> => transferring context: 39B 0.0s
=> CACHED [2/5] WORKDIR /data2 0.0s
=> CACHED [3/5] COPY ./data/financial_data.csv ./data/ 0.0s
=> ERROR [4/5] COPY ./db/financial_data.db ./db/ 0.0s
------
> [4/5] COPY ./db/financial_data.db ./db/:
------
Dockerfile:9
--------------------
7 | # Copy the CSV file and the database file to the container
8 | COPY ./data/financial_data.csv ./data/
9 | >>> COPY ./db/financial_data.db ./db/
10 | COPY ./data_ingestion.py .
11 |
--------------------
ERROR: failed to solve: failed to compute cache key: failed to calculate checksum of ref moby::mvomn3m1y0lfugm6sk6fdvdiw: "/db/financial_data.db": not found
My folder structure:
C:.
│ .gitignore
│ bashlog4docker.txt
│ Contributing.md
│ docker-compose.yml
│ License.md
│ README.md
│ redundant_file.txt
│
├───app
│ │ app.py
│ │ Readme.txt
│ │
│ ├───static
│ ├───templates
│ │ data_aggregation_results.html
│ │ data_ingestion_results.html
│ │ error.html
│ │ index.html
│ │ processed_data_results.html
│ │ success.html
│ │
│ ├───tests
│ │ │ test_app.py
│ │ │ __init__.py
│ │ │
│ │ ├───.pytest_cache
│ │ │ │ .gitignore
│ │ │ │ CACHEDIR.TAG
│ │ │ │ README.md
│ │ │ │
│ │ │ └───v
│ │ │ └───cache
│ │ │ lastfailed
│ │ │ nodeids
│ │ │ stepwise
│ │ │
│ │ └───__pycache__
│ │ test_app.cpython-39-pytest-6.2.5.pyc
│ │ __init__.cpython-39.pyc
│ │
│ └───__pycache__
│ app.cpython-39.pyc
│
├───data
│ financial_data.csv
│
├───data_aggregation
│ │ data_aggregation.log
│ │ data_aggregation.py
│ │ Dockerfile
│ │ Readme.txt
│ │ requirements.txt
│ │ __init__.py
│ │
│ ├───tests
│ │ │ test_data_aggregation.py
│ │ │ __init__.py
│ │ │
│ │ └───__pycache__
│ │ test_data_aggregation.cpython-310.pyc
│ │ __init__.cpython-310.pyc
│ │
│ └───__pycache__
│ data_aggregation.cpython-310.pyc
│ data_aggregation.cpython-39.pyc
│ data_aggregation_abs_path.cpython-39.pyc
│ __init__.cpython-39.pyc
│
├───data_ingestion
│ │ data_ingestion.log
│ │ data_ingestion.py
│ │ Dockerfile
│ │ Readme.txt
│ │ requirements.txt
│ │ __init__.py
│ │
│ ├───tests
│ │ │ test_data_ingestion.py
│ │ │ __init__.py
│ │ │
│ │ └───__pycache__
│ │ test_data_ingestion.cpython-310.pyc
│ │
│ └───__pycache__
│ data_ingestion.cpython-310.pyc
│ data_ingestion.cpython-39.pyc
│ data_ingestion_abs_path.cpython-39.pyc
│ __init__.cpython-39.pyc
│
├───data_processing
│ │ data_processing.log
│ │ data_processing.py
│ │ Dockerfile
│ │ Readme.txt
│ │ requirements.txt
│ │ __init__.py
│ │
│ ├───tests
│ │ │ test_data_processing.py
│ │ │ __init__.py
│ │ │
│ │ └───__pycache__
│ │ test_data_processing.cpython-310.pyc
│ │ __init__.cpython-39.pyc
│ │
│ └───__pycache__
│ data_preprocessing_abs_path.cpython-39.pyc
│ data_processing.cpython-310.pyc
│ data_processing.cpython-39.pyc
│ __init__.cpython-39.pyc
│
├───db
│ Dockerfile
│ financial_data.db
│
└───__pycache__
1_data_ingestion.cpython-39.pyc
2_data_preprocessing.cpython-39.pyc
3_data_aggregation.cpython-39.pyc
6_data_storage_retrieval.cpython-39.pyc
__1_data_ingestion.cpython-39.pyc
__2_data_preprocessing.cpython-39.pyc
__3_data_aggregation.cpython-39.pyc
__4_data_validation.cpython-39.pyc
__5_data_analysis.cpython-39.pyc
I know my structure / folders are not conventional; this is my first assignment from university. What I need to know: why can't I build the image with this Dockerfile? Why does it not throw the error for the csv but only for the db? Is Docker not able to copy files from other folders? When I copy the files manually into "data_ingestion", the image is created, although when I then run that image, it stops after a few seconds and I don't know why.
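For completeness, this is roughly how I run the resulting image; I am writing the command from memory, so the tag and port mapping may not be exactly what I used:

docker run -p 5000:5000 data_ingestion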
I'd be happy about any constructive suggestion or solution. Thank you in advance!
I have tried changing the path structure, but nothing really worked. I have googled my problem, but with "dockerfile" as a search term the results are not precise enough. The Docker documentation suggests using "COPY --from", but that did not work for me either.
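As I understand that part of the documentation, COPY --from copies files out of an earlier build stage (or another image), not out of folders on my host, so a minimal sketch like the one below builds fine but does not seem to solve my problem; the stage name "builder" and the example file are my own, not from my real project:

FROM python:3.9 AS builder
WORKDIR /build
# create a file inside the builder stage so there is something to copy
RUN echo "example" > example.txt

FROM python:3.9
WORKDIR /data2
# copies example.txt from the "builder" stage above, not from my host machine
COPY --from=builder /build/example.txt .

Maybe I misunderstood how this is supposed to be used.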