
I wrote a Dockerfile to build a Docker image that can run on AWS Batch. The image pulls in multiple layers and copies the files to '/opt', which I set as the WORKDIR.

I need to run a program called 'BLAST', a single executable that takes several parameters, including the location of the database (DB).

When I run the image, it fails because BLAST cannot find the DB at the mounted location. The full error message is b'BLAST Database error: No alias or index file found for nucleotide database [/mnt/fsx/ntdb/nt] in search path [/opt:/fsx/ntdb:]\n', where /mnt/fsx/ntdb/nt is the DB path.

My only guess is that, because I set WORKDIR in my Dockerfile, the default working directory became '/opt', which is why it shows up in the search path.

How should I fix this? By removing WORKDIR, or something else?

My Dockerfile looks like this:

# Set Work dir
ARG FUNCTION_DIR="/opt"

# Get layers
FROM (aws-account).dkr.ecr.(aws-region).amazonaws.com/uclust AS layer_1
FROM (aws-account).dkr.ecr.(aws-region).amazonaws.com/blast AS layer_2
FROM public.ecr.aws/lambda/python:3.9

# Copy arg and set work dir
ARG FUNCTION_DIR
COPY . ${FUNCTION_DIR}
WORKDIR ${FUNCTION_DIR}

# Copy from layers
COPY --from=layer_1 /opt/ .
RUN true
COPY --from=layer_2 /opt/ .
RUN true
COPY . ${FUNCTION_DIR}/
RUN true

# Copy and Install required libraries
COPY requirements.txt .
RUN true
RUN pip3 install -r requirements.txt
# To run lambda handler
RUN pip install \
        --target "${FUNCTION_DIR}" \
        awslambdaric
# To run blast
RUN yum -y install libgomp

# See files inside image
RUN dir -s

# Get permissions for files
RUN chmod +x /opt/main.py
RUN chmod +x /opt/mode/submit/main.py

# Set Entrypoint and CMD
ENTRYPOINT [ "python3" ]
CMD [ "-m", "awslambdaric", "main.lambda_handler" ]

Edit: Looking at the error more closely, the BLAST program is searching for the DB in the path /opt:/fsx/ntdb:, which is a combination of the WORKDIR set in the Dockerfile and the BLASTDB path set via os.environ['BLASTDB'] (os.environ['BLASTDB'] description).
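To see where the two components of that search path come from, a minimal sketch like the one below could be logged at the start of the handler. The helper name `blast_search_dirs` is hypothetical, and the function only mirrors the path shown in the error message (working directory first, then the BLASTDB entries); it is not BLAST's actual lookup logic.

```python
import os


def blast_search_dirs(cwd=None, blastdb=None):
    """Return the directories seen in the error's search path:
    the working directory followed by the entries of BLASTDB.

    Hypothetical helper for debugging only; it does not call BLAST.
    """
    cwd = os.getcwd() if cwd is None else cwd
    blastdb = os.environ.get("BLASTDB", "") if blastdb is None else blastdb
    dirs = [cwd]
    # BLASTDB is treated here as a colon-separated list, as in the error message.
    dirs.extend(entry for entry in blastdb.split(":") if entry)
    return dirs


if __name__ == "__main__":
    # Inside the container this shows the WORKDIR and the BLASTDB value.
    print(blast_search_dirs())
```

Called with no arguments inside the container, it reports the current working directory and whatever BLASTDB is set to, which makes it easy to compare against the search path in the error.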

Ludacia
  • The error message seems to indicate a problem unrelated to Docker or Python. Where does the database come from and how is its index created? – tripleee Oct 12 '22 at 04:24
  • `WORKDIR` does exactly what it says; perhaps see also [What exactly is current working directory?](https://stackoverflow.com/questions/45591428/what-exactly-is-current-working-directory/66860904) You can easily override it at runtime with `docker run -w` – tripleee Oct 12 '22 at 04:25
  • @tripleee The database is located on AWS's file system (Lustre), and it's mounted through the AWS Batch job definition. So whenever a job is created, an EC2 instance with the database mounted is created. I am not sure about the indexes for the DB since I didn't create it. I only downloaded what NCBI provides and pushed it to the file system. – Ludacia Oct 12 '22 at 04:28
  • @tripleee I hadn't thought about overriding the workdir. I do not need to send a `docker run` command since submitting an AWS Batch job does it for me. But I can look for a way to override it. Thanks – Ludacia Oct 12 '22 at 04:33
  • 1
    Is the mount read-only? I'm guessing the tool you are using will want to create an index when it runs, but this is obviously speculative, and not really a programming question anyway. – tripleee Oct 12 '22 at 05:54

1 Answer

I figured out the problem after many debugging attempts. The problem was neither WORKDIR nor os.environ['BLASTDB']. The paths were defined correctly, and BLAST searching [/opt:/fsx/ntdb:] was the expected behavior according to the documented search order:

  1. Current working directory (*)

  2. User's HOME directory (*)

  3. Directory specified by the NCBI environment variable

  4. The standard system directory (“/etc” on Unix-like systems, and given by the environment variable SYSTEMROOT on Windows)

The actual solution was to check whether the file system was mounted correctly and to check the permissions of the files inside it. I initially assumed the file system was mounted correctly, since I had already tested it many times from other Batch jobs, but only the mount folder had been created; the files themselves did not exist. So even though the program looked for the index file, there was none to find, which produced the error.
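A check along these lines could be run before calling BLAST to catch an empty or unreadable mount early. This is only a sketch: the function name `check_blast_db` is hypothetical, and it assumes the nucleotide alias and index files use the `.nal` and `.nin` extensions (for example `nt.nal` or `nt.00.nin`).

```python
import glob
import os


def check_blast_db(db_path):
    """Return a list of problems for a nucleotide DB prefix such as
    /mnt/fsx/ntdb/nt; an empty list means the basic checks passed."""
    db_dir = os.path.dirname(db_path)
    if not os.path.isdir(db_dir):
        return [f"directory does not exist: {db_dir}"]

    problems = []
    if not os.listdir(db_dir):
        # The mount point exists but nothing is inside it.
        problems.append(f"directory is empty: {db_dir}")

    # Assumption: alias files end in .nal and index files end in .nin.
    prefix = glob.escape(db_path)
    candidates = (
        glob.glob(prefix + ".nal")
        + glob.glob(prefix + ".nin")
        + glob.glob(prefix + ".*.nin")
    )
    if not candidates:
        problems.append(f"no alias or index file found for: {db_path}")

    for path in candidates:
        if not os.access(path, os.R_OK):
            problems.append(f"file is not readable: {path}")
    return problems
```

In the handler, something like `problems = check_blast_db("/mnt/fsx/ntdb/nt")` followed by logging any entries would show whether the files exist and are readable before the job reaches BLAST.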

Ludacia