I have an ECS cluster on AWS with four services running in it. One of the services is of the replica type and uses the Fargate launch type, with a load balancer attached. The platform is Linux 1.4, and it runs 2 tasks without any auto scaling. The Docker image runs a gunicorn application that serves an API built on Falcon. The command used to start it is below.
["gunicorn","-b","0.0.0.0:80","src.app:run()","-k","gevent","--workers=5"]
For some reason the tasks are being stopped every few seconds. The stopped tasks show Exit Code 1, and some errors are logged in CloudWatch.
gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>
raise HaltServer(reason, self.WORKER_BOOT_ERROR)
[10] [ERROR] Exception in worker process
This service has been running for the past year without ever hitting this error, and then it suddenly stopped working. No new code was deployed and no development was done, so these errors are very unexpected. The service is configured to run two tasks: it starts two, they stop within about 2 seconds, and two more start once the previous ones have stopped. This cycle repeats indefinitely. I have tried redeploying the existing code base, but the error persists. I have also updated the service with new task definitions, but that did not fix it either.
Some additional errors from CloudWatch, though they do not help much.
[INFO] Starting gunicorn 19.9.0
[INFO] Listening at: http://0.0.0.0:80 (1)
[INFO] Using worker: gevent
/usr/local/lib/python3.10/os.py:1029: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, mode, buffering, encoding, *args, **kwargs)
[7] [INFO] Booting worker with pid: 7
[8] [INFO] Booting worker with pid: 8
[9] [INFO] Booting worker with pid: 9
[10] [INFO] Booting worker with pid: 10
[11] [INFO] Booting worker with pid: 11
[7] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
worker.init_process()
File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/ggevent.py", line 203, in init_process
super(GeventWorker, self).init_process()
I have tried running the same Docker image locally and get almost the same error. So at least the issue is now narrowed down to the code itself. But I still do not understand why it ran for so long and failed only now. The detailed error is below.
[2022-09-04 08:16:27 +0000] [1] [INFO] Starting gunicorn 19.9.0
[2022-09-04 08:16:27 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
[2022-09-04 08:16:27 +0000] [1] [INFO] Using worker: gevent
/usr/local/lib/python3.10/os.py:1029: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, mode, buffering, encoding, *args, **kwargs)
[2022-09-04 08:16:27 +0000] [7] [INFO] Booting worker with pid: 7
[2022-09-04 08:16:27 +0000] [8] [INFO] Booting worker with pid: 8
[2022-09-04 08:16:27 +0000] [9] [INFO] Booting worker with pid: 9
[2022-09-04 08:16:27 +0000] [7] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
worker.init_process()
File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/ggevent.py", line 203, in init_process
super(GeventWorker, self).init_process()
File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/base.py", line 129, in init_process
self.load_wsgi()
File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
self.wsgi = self.app.wsgi()
File "/usr/local/lib/python3.10/site-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/usr/local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
return self.load_wsgiapp()
File "/usr/local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
return util.import_app(self.app_uri)
File "/usr/local/lib/python3.10/site-packages/gunicorn/util.py", line 350, in import_app
__import__(module)
File "/src/app.py", line 1, in <module>
import falcon
File "/usr/local/lib/python3.10/site-packages/falcon/__init__.py", line 30, in <module>
from falcon.api import API # NOQA
File "/usr/local/lib/python3.10/site-packages/falcon/api.py", line 21, in <module>
from falcon import api_helpers as helpers, DEFAULT_MEDIA_TYPE, routing
File "/usr/local/lib/python3.10/site-packages/falcon/api_helpers.py", line 21, in <module>
from falcon import util
File "/usr/local/lib/python3.10/site-packages/falcon/util/__init__.py", line 29, in <module>
from falcon.util import structures
File "/usr/local/lib/python3.10/site-packages/falcon/util/structures.py", line 35, in <module>
class CaseInsensitiveDict(collections.MutableMapping): # pragma: no cover
AttributeError: module 'collections' has no attribute 'MutableMapping'
[2022-09-04 08:16:27 +0000] [7] [INFO] Worker exiting (pid: 7)
[2022-09-04 08:16:27 +0000] [10] [INFO] Booting worker with pid: 10
[2022-09-04 08:16:27 +0000] [8] [ERROR] Exception in worker process
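
The last frame of the traceback can be checked on its own, without gunicorn or Falcon. The sketch below is a minimal check to run inside the container. It relies on the fact that the `collections.MutableMapping` alias was removed in Python 3.10, while the class itself remains available as `collections.abc.MutableMapping`. The function name `mutable_mapping_location` is hypothetical and used only for illustration. Whether the image picked up Python 3.10 through a rebuild against an unpinned base tag is an assumption that the logs do not confirm.

```python
import collections
import collections.abc
import sys


def mutable_mapping_location():
    """Report where MutableMapping can be found on this interpreter.

    Hypothetical helper for illustration; not part of the service code.
    """
    return {
        "python": "%d.%d" % sys.version_info[:2],
        # The alias that falcon/util/structures.py uses in the traceback.
        "collections.MutableMapping": hasattr(collections, "MutableMapping"),
        # The supported location since Python 3.3.
        "collections.abc.MutableMapping": hasattr(collections.abc, "MutableMapping"),
    }


if __name__ == "__main__":
    print(mutable_mapping_location())
```

If the first flag is False while the second is True, the interpreter in the image is newer than the installed Falcon release expects. In that case, pinning the Python version of the base image or upgrading Falcon would be the two directions to look into.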