Summarize the problem:
The Python package opens the PDFs in a batch folder, reads the first page of each PDF, matches keywords, and places the matching PDFs in the source folder for the OCR scripts to pick up. The first script to process all the PDFs is MainBankClass.py. I am trying to use a docker-compose file to run all of these Python scripts on the same network and volume, so that each OCR script starts scanning bank statements once the pre-processing is done. This link is the closest I have found so far to what I want, but it seems I have missed some part of it. The OCR scripts are invoked with runpy.run_path(path_name='ChaseOCR.py'), so these scripts sit in the same directory as __init__.py. Here is the filesystem structure:
BankStatements
┣ BankofAmericaOCR
┃ ┣ BancAmericaOCR.py
┃ ┗ Dockerfile.bankofamerica
┣ ChaseBankStatementOCR
┃ ┣ ChaseOCR.py
┃ ┗ Dockerfile.chase
┣ WellsFargoStatementOCR
┃ ┣ Dockerfile.wellsfargo
┃ ┗ WellsFargoOCR.py
┣ BancAmericaOCR.py
┣ ChaseOCR.py
┣ Dockerfile
┣ WellsFargoOCR.py
┣ __init__.py
┗ docker-compose.yml
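The dispatch step described above can be sketched roughly as follows. This is a minimal illustration, not the actual package code: the keyword lists and the helper names `pick_script` and `dispatch` are hypothetical, and only the script file names and the use of runpy.run_path come from the description.

```python
import runpy
from typing import Optional

# Hypothetical keyword lists; the real package's keywords are not shown here.
KEYWORDS = {
    "ChaseOCR.py": ("chase", "jpmorgan"),
    "BancAmericaOCR.py": ("bank of america",),
    "WellsFargoOCR.py": ("wells fargo",),
}


def pick_script(first_page_text: str) -> Optional[str]:
    """Return the OCR script whose keywords appear in the first-page text."""
    text = first_page_text.lower()
    for script, words in KEYWORDS.items():
        if any(word in text for word in words):
            return script
    return None


def dispatch(first_page_text: str) -> None:
    """Run the matching OCR script, if any, in the current process."""
    script = pick_script(first_page_text)
    if script is not None:
        # A bare file name like this is looked up relative to the
        # current working directory of the running process.
        runpy.run_path(path_name=script)
```

Note that runpy.run_path executes the script inside the calling process, so the file has to be reachable from wherever MainBankClass.py is running.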
What I've tried so far:
In docker-compose.yml:
version: '3'
services:
  mainbankclass_container:
    build:
      context: '.'
      dockerfile: Dockerfile
    volumes:
      - /Users:/Users
    #links:
    #  - "chase_container"
    #  - "wellsfargo_container"
    #  - "bankofamerica_container"
  chase_container:
    build: .
    working_dir: /app/ChaseBankStatementOCR
    command: ./ChaseOCR.py
    volumes:
      - /Users:/Users
  bankofamerica_container:
    build: .
    working_dir: /app/BankofAmericaOCR
    command: ./BancAmericaOCR.py
    volumes:
      - /Users:/Users
  wellsfargo_container:
    build: .
    working_dir: /app/WellsFargoStatementOCR
    command: ./WellsFargoOCR.py
    volumes:
      - /Users:/Users
The Dockerfile in each bank folder is the same except for CMD, which changes accordingly. For example, in the ChaseBankStatementOCR folder:
FROM python:3.7-stretch
WORKDIR /app
COPY . /app
# CMD changes accordingly for the other two bank scripts
CMD ["python3", "ChaseOCR.py"]
The last piece is the top-level Dockerfile, outside the bank folders:
FROM python:3.7-stretch
WORKDIR /app
COPY ./requirements.txt ./
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
RUN pip3 install --upgrade PyMuPDF
COPY . /app
COPY ./ChaseOCR.py /app
COPY ./BancAmericaOCR.py /app
COPY ./WellsFargoOCR.py /app
EXPOSE 8080
CMD ["python3", "MainBankClass.py"]
After running docker-compose build, the containers and network are built successfully. The error occurs when I run docker run -v /Users:/Users: python3 python3 ~/BankStatementsDemoOCR/BankStatements/MainBankClass.py, and the error message is FileNotFoundError: [Errno 2] No such file or directory: 'BancAmericaOCR.py'.
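One thing that may be relevant: as far as I understand, runpy.run_path with a bare file name looks the file up relative to the current working directory, not relative to the directory of the calling script. The sketch below is only an illustration of that behaviour. The temporary directories, the placeholder script content, and the `run_from` helper are hypothetical stand-ins, not my actual layout.

```python
import os
import runpy
import tempfile
from pathlib import Path


def run_from(cwd: str, script: str) -> str:
    """Try to run `script` via runpy with `cwd` as the working directory."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        runpy.run_path(path_name=script)
        return "ran"
    except FileNotFoundError:
        return "not found"
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


with tempfile.TemporaryDirectory() as pkg_dir, \
        tempfile.TemporaryDirectory() as other_dir:
    # Placeholder standing in for the real OCR script.
    Path(pkg_dir, "BancAmericaOCR.py").write_text("RESULT = 'ok'\n")

    # Same bare file name, different working directories.
    same_dir = run_from(pkg_dir, "BancAmericaOCR.py")
    elsewhere = run_from(other_dir, "BancAmericaOCR.py")

    # An absolute path does not depend on the working directory.
    absolute = run_from(other_dir, str(Path(pkg_dir, "BancAmericaOCR.py")))
```

If this applies to my case, the script might be present on disk and still not be found, depending on the directory the process is started from.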
I assume the container doesn't have BancAmericaOCR.py, but I have put every .py file on the same network with Compose, and I don't think links is good practice, since Docker recommends using networks here. What am I missing? Any help is much appreciated. Thanks in advance.