
Summarize the problem

I am working on a project that runs a local Python program inside Docker. The program takes PDFs from a local directory as input. I think a volume is the right way to share storage between the host and the container, but I have not been able to run the program successfully after mounting the filesystem.

Question: How do I run a Python file in Docker after mounting the local filesystem?

Current solution:

This is what I've tried so far:

local file directory:

docker_test_app
 ┣ .vscode
 ┃ ┣ .ropeproject
 ┃ ┃ ┣ config.py
 ┃ ┃ ┗ objectdb
 ┃ ┗ settings.json
 ┣ src
 ┃ ┣ .DS_Store
 ┃ ┣ BankofAmerica_Statement_9.pdf
 ┃ ┣ Chase_Statement_5.pdf
 ┃ ┣ Wells_Statement_7.pdf
 ┃ ┣ requirements.txt
 ┃ ┗ test.py
 ┣ .DS_Store
 ┗ Dockerfile

Dockerfile:

FROM python:3
WORKDIR /app
COPY src/requirements.txt ./
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
COPY src /app
EXPOSE 8080
CMD ["python","test.py"]

A simple Python script, test.py, that takes PDFs as input:

import fitz
import glob

bank_names_list = ['Chase Bank','bankofamerica','Your Business and Wells Fargo'] # match docs 
chase_match = bank_names_list[0] 
boa_match = bank_names_list[1] 
wells_match = bank_names_list[2] 


fileFolder = '/Users/xyz/Desktop/docker_test_app/src/' # host filesystem 
subFiles = glob.glob(fileFolder + r'*.pdf') 

for i in subFiles:
    doc = fitz.open(i)
    page = doc.loadPage(0)
    pageTextblocks = page.getText('blocks')
    pageTextblocks.sort(key=lambda block: block[3])
    for block in pageTextblocks:
        targetBlock = block[4]
        if chase_match in targetBlock:
            print('hello Chase')

Docker commands that I ran in the VS Code terminal:

docker volume create TestVolume
docker build -t pythontest .
docker run -it -v /Users/xyz/Desktop/docker_test_app/src:/projects bash

After running the last command, I was taken directly to a bash terminal, where I ran ls projects and could see all three PDFs. After exiting the bash terminal, I ran docker start reverent_curran and docker exec -it reverent_curran bash, but the Python program returned nothing. I expected 'hello Chase' to be printed.

I feel like I am missing some components needed to get this working. What am I missing? Docs used as references: the Docker docs and another similar question on SO. Thanks for any help.

EDIT:

I have tried modifying the Python script as suggested, but the program still returns nothing. What other options are there to achieve the goal?

EDIT 2:

Here is the error reproduced by running docker exec -it nifty_keller bash projects/test.py:

projects/test.py: line 3: import: command not found
projects/test.py: line 4: import: command not found
projects/test.py: line 5: import: command not found
projects/test.py: line 6: import: command not found
projects/test.py: line 8: bank_names_list: command not found
projects/test.py: line 9: chase_match: command not found
projects/test.py: line 10: boa_match: command not found
projects/test.py: line 11: wells_match: command not found
projects/test.py: line 13: fileFolder: command not found
projects/test.py: line 15: syntax error near unexpected token `('
projects/test.py: line 15: `subFiles = glob.glob(fileFolder + r'*.pdf') '
liamsuma
  • If your application's main goal is interacting with files on the host system, a Python virtual environment might be a better isolation mechanism than a Docker container. This will give you an isolated path to install local Python library dependencies, but it can also directly use normal host filesystem paths, and it doesn't require administrator permissions to run. – David Maze Jul 24 '20 at 15:19
  • @DavidMaze Thanks for your valuable input. As of now, we use Docker primarily for the purpose of implementing with Azure Service Bus at a later stage. I will definitely look into this option if it's reasonable to do so. – liamsuma Jul 24 '20 at 15:28

2 Answers


The issue is that your script is looking in /Users/xyz, but you mounted the volume as /projects.
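
This also explains why nothing is printed rather than an error being raised: glob.glob returns an empty list when the folder does not exist, so the for loop body never runs. A minimal sketch of that behaviour, assuming a made-up path (missing_folder is hypothetical and stands in for the host path as seen from inside the container):

```python
import glob
import os

# Hypothetical path: stands in for a host folder that was never
# mounted at this location inside the container.
missing_folder = "/no/such/folder/on/this/machine/"

# Same pattern-building style as in test.py.
matches = glob.glob(missing_folder + "*.pdf")

print(os.path.isdir(missing_folder))  # False: the folder is not there
print(matches)                        # []: no error, just no files to loop over
```

Printing the result of glob.glob (or checking os.path.isdir) at the top of the script is a quick way to confirm which path the container actually sees.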

I suggest modifying the script to take a command-line argument, the path to the directory containing the PDFs, and then setting the Dockerfile command to:

CMD ["python","test.py", "/projects"]
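
On the Python side, the change could look roughly like the sketch below. The helper names resolve_pdf_folder and list_pdfs, and the /projects fallback, are assumptions made for illustration and are not part of the original script. Falling back to a default also avoids an IndexError when the script is started without an argument.

```python
import glob
import os
import sys


def resolve_pdf_folder(argv, default="/projects"):
    """Return the folder to scan: the first CLI argument if given, else a default.

    The default of /projects is an assumption matching the mount point used
    in the question's docker run command.
    """
    return argv[1] if len(argv) > 1 else default


def list_pdfs(folder):
    """Return the sorted paths of the PDF files directly inside `folder`."""
    return sorted(glob.glob(os.path.join(folder, "*.pdf")))


if __name__ == "__main__":
    folder = resolve_pdf_folder(sys.argv)
    for path in list_pdfs(folder):
        # The original script would call fitz.open(path) here.
        print(path)
```

With this in place, python test.py /projects scans the mounted folder inside the container, and the same script can be pointed at the host folder when run outside Docker.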
Itamar Turner-Trauring
  • Thanks for your response. I tried to change `CMD` as per your suggestion but still couldn't have `hello Chase` printed. Is `bash` the right command to use? Is there a way to use `TestVolume` that I created? – liamsuma Jul 24 '20 at 15:12
  • You need to change the Python script too. This line "fileFolder = '/Users/xyz/Desktop/docker_test_app/src/' # host filesystem " should be "fileFolder = sys.argv[1]" (and you'll need to import sys too). – Itamar Turner-Trauring Jul 24 '20 at 15:17
  • Are we running the same command when mounting the volume then? It raises `IndexError: list index out of range` with `fileFolder = sys.argv[1]` – liamsuma Jul 24 '20 at 15:45
  • `sys.argv` only has 1 string in the list - `['/Users/xyz/Desktop/docker_test_app/src/test.py']` – liamsuma Jul 24 '20 at 18:00
  • Did you also change to `CMD ["python", "test.py", "/projects"]`? And pass the argument when you run it manually? – Itamar Turner-Trauring Jul 25 '20 at 20:21
  • Yes. I changed to `CMD ["python", "test.py", "/projects"]` as you suggested with modifications in python script. – liamsuma Jul 27 '20 at 13:21
  • Besides the changes that you recommended, I also added the shebang `#!/usr/bin/env python3` at the top of the script to make it executable – liamsuma Jul 27 '20 at 13:38
  • I have edited the OP with error reproduced from running `docker exec -it nifty_keller bash projects/test.py` manually. – liamsuma Jul 27 '20 at 16:34

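Here is the solution that currently works for my case. Instead of bash, python3 is sufficient: bash was trying to interpret test.py as a shell script, which is why every line in the error above failed with "command not found".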

Dockerfile:

FROM python:3.7-stretch
WORKDIR /app
COPY src/requirements.txt ./
RUN pip3 install --upgrade pip
# install dependencies
# has to be pip3 install instead of pip install
RUN pip3 install -r requirements.txt
RUN pip3 install PyMuPDF
# need to upgrade PyMuPDF to bypass fitz module issue
RUN pip3 install --upgrade PyMuPDF
# bundle app source
COPY src /app
EXPOSE 8080
CMD ["python3", "test.py"]

No changes are needed in test.py.

Docker commands:

docker run -v /Users:/Users python3 python3 ~/Desktop/docker_test_app/src/test.py

After running the command, a container is created from the python3 image, and from there we can run docker start <container name> etc.

liamsuma