Looking for some guidance from people with practical Google Cloud Run (GCR) experience. How do you get on with this? I run a Docker container (approx. 670 MB in size) on Google Cloud Run. Inside is my Python server based on Flask, and it is currently run by this command in the Dockerfile:
CMD exec gunicorn --bind 0.0.0.0:8080 --reload --workers=1 --threads 8 --timeout 0 "db_app.app:create_app()"
Say I need to serve about 300 requests per hour (on average, one request every 12 seconds).
How many workers and threads should I specify in my exec command to use GCR's capabilities most effectively?
For example, the basic configuration of a GCR server is something like 1 CPU and 1 GB of RAM.
So how should I set up Gunicorn there? Maybe I should also use --preload? Or specify --worker-connections?
As Dustin cited in his answer (see below), the official Google docs suggest writing this in the Dockerfile:
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 main:app
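Taken literally, the advice in that comment ("workers equal to the cores available") could be sketched as a Gunicorn config file like the one below. This is only my reading of the comment, not something from the Google docs: the helper name `pick_settings` is hypothetical, and I am assuming that `os.cpu_count()` inside the container reflects the CPU allocated by GCR, which I have not verified.

```python
# gunicorn.conf.py (hypothetical sketch), used as:
#   gunicorn -c gunicorn.conf.py "db_app.app:create_app()"
import os


def pick_settings(env, cpu_count):
    """Build Gunicorn settings: one worker per visible core, 8 threads each."""
    port = env.get("PORT", "8080")  # Cloud Run injects PORT; 8080 is a fallback
    return {
        "bind": f":{port}",
        "workers": max(1, cpu_count or 1),  # os.cpu_count() may return None
        "threads": 8,  # same thread count as the Google example
        "timeout": 0,  # 0 disables Gunicorn's worker timeout
    }


_settings = pick_settings(os.environ, os.cpu_count())

# Gunicorn picks up these module-level names as settings.
bind = _settings["bind"]
workers = _settings["workers"]
threads = _settings["threads"]
timeout = _settings["timeout"]
```

With a single visible core this gives the same values as the CMD line above (1 worker, 8 threads), so it does not answer whether those numbers are actually right for GCR.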
I have no idea how many cores are behind that "1 CPU" in the GCR configuration, so I doubt Google's example is very accurate; more likely it is there just to demonstrate how things work in general. So I (and everyone in my situation) would be very grateful if someone who has a working Gunicorn server packed into a container on Google Cloud Run could share how to configure it properly: basically, what should go into this Dockerfile CMD line instead of the generic example code? Something more real-life-proof.
I think this is a software problem, because we're talking about what to write in a Dockerfile (the question was closed and marked as "not SO scope question").