Docker: issue when using entrypoint+cmd but not when using just cmd

Question

I want to preface this by saying this is not a general question about the difference between cmd and entrypoint+cmd. I thought I understood the general difference and how to use them but I encountered possibly a more nuanced issue with entrypoint+cmd.

I was trying to write a simple image (call this image2) that pulls from another image (call this image1) which basically contains my environment. The purpose of this was that the environment is pretty static but I might want to make nuanced changes to the container that runs the code. The image I was having issues with looks like this:

FROM image1

ENTRYPOINT [ "/opt/conda/bin/python" ]
CMD [ "/tmp/script.py" ]

I wanted to write it this way to restrict the purpose of this container (running a Python script). This, however, would throw an error when I ran it outside the container. It would start the script and run for a bit, but once it reached some PySpark code it failed with this:

java.io.IOException: Cannot run program "python3": error=2, No such file or directory

PySpark was suddenly trying to use python3, and I'm not sure why it started looking for that.

However, if I change the Dockerfile to the following:

FROM image1

CMD /opt/conda/bin/python /tmp/script.py

Then it runs fine without error. So I'm wondering if someone can explain why I'm able to run my script with CMD alone but not with ENTRYPOINT.

Not an answer to your question, but I don't feel like `ENTRYPOINT ["python"]` really makes sense. The `CMD` can still be any program on the system, but only if it's implemented in Python, and you still need to repeat the script name if you're overriding the command somewhere. I might [make the script executable](https://docs.python.org/3/tutorial/appendix.html#executable-python-scripts) so you don't need to explicitly say `python` at all, at least in the startup sequence. — David Maze, Apr 29 '23 at 10:44
Can you elaborate on "and you still need to repeat the script name if you're overriding the command somewhere"? I don't think I fully understand this. The idea was I wanted to limit the container to running python scripts. But I'm not very committed to this idea and might go with what you're saying — Ken Myers, Apr 29 '23 at 18:42
I think I got it working but I have to specify the full path like `docker run ... /tmp/script.py`. if I try to just pass it `script.py` it errors out with `/usr/local/bin/_entrypoint.sh: line 24: exec: script.py: not found`. Do you know if there's a way to just pass it the file name which would be located in /tmp/? — Ken Myers, Apr 29 '23 at 19:10

score 2 · Accepted Answer · answered Apr 29 '23 at 06:29

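Your Dockerfile is fine. There is, however, a difference between the shell form CMD arg1 arg2 and the exec form CMD ["arg1", "arg2"] (with brackets): the shell form runs the command through /bin/sh -c, while the exec form executes it directly without a shell. That would at least explain some of the difference.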
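To make the two forms concrete, here is a small side-by-side sketch using the paths from the question. The comments describe Docker's documented behaviour for each form; whether this alone accounts for the python3 error depends on what image1 sets up, which isn't shown in the question.

```dockerfile
# Shell form: Docker wraps the string in a shell, so the container's
# command is effectively: /bin/sh -c "/opt/conda/bin/python /tmp/script.py"
CMD /opt/conda/bin/python /tmp/script.py

# Exec form: the JSON array is executed directly, with no shell in between.
# (Only the last CMD in a Dockerfile takes effect; both are shown here
# purely for comparison.)
CMD [ "/opt/conda/bin/python", "/tmp/script.py" ]
```

Note also that a CMD on its own leaves any ENTRYPOINT inherited from image1 in place, whereas declaring a new ENTRYPOINT replaces it.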

> when it would get to some Pyspark code

You can set ENV PYSPARK_PYTHON=/opt/conda/bin/python to change the interpreter Spark uses.
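As a minimal sketch, the ENTRYPOINT + CMD Dockerfile from the question with that variable added might look like the following. It assumes /opt/conda/bin/python (the path used in the question) is the interpreter Spark should launch for its Python workers.

```dockerfile
FROM image1

# Tell PySpark which interpreter to start for its Python workers,
# so it does not have to find "python3" on PATH.
ENV PYSPARK_PYTHON=/opt/conda/bin/python

ENTRYPOINT [ "/opt/conda/bin/python" ]
CMD [ "/tmp/script.py" ]
```

With this layout, anything passed after the image name in docker run replaces the CMD and is handed to the Python interpreter as its argument, so a different script can be run by giving its full path.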
