How do I prevent containers that use Selenium from 'freezing' after getting OOM errors and seeing Failed to start thread - pthread_create failed (EAGAIN)? What is the root cause and how do I fix it? Further, how can I test the solution locally and how can I implement the solution on AWS?

1 Answer
The following is a guide to diagnosing and solving OOM issues caused by long-lived Selenium instances. Specifically, on AWS you will likely see Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached. and the dreaded OOM error, should your Selenium process run long enough. This guide assumes that you are using Docker and can run your image via docker run <some args if you wish> your_image
Diagnosis
Start your app with docker run ... and, in a separate tab, run docker ps, find your Container ID, and run docker container stats CONTAINER_ID. The key is to observe the PIDS column. Now trigger the Selenium process to run, ideally many times (you may want to write a simple for loop just for this test). You will notice that the PIDS count grows without bound. This is because Selenium leaves zombie processes behind (reference: Selenium leaves behind running processes?).
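To confirm that the extra PIDs really are zombies, one option is to count processes in state Z from inside the container. The following is a minimal sketch, assuming a Linux container with /proc mounted and Python 3 available in the image; the function names parse_state and count_zombies are illustrative and not part of Selenium or Docker.

```python
import os


def parse_state(stat_text):
    """Return the one-letter process state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain spaces
    or parentheses, so split on the LAST ')' before reading the state field.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_zombies(proc_root="/proc"):
    """Count processes whose state is 'Z' (zombie) under proc_root."""
    zombies = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                if parse_state(f.read()) == "Z":
                    zombies += 1
        except OSError:
            continue  # the process exited mid-scan or is unreadable
    return zombies


if __name__ == "__main__":
    print("zombie processes:", count_zombies())
```

Run it inside the container before and after triggering Selenium, for example with docker exec CONTAINER_ID python3 followed by wherever you copied the script. If the zombie count climbs together with the PIDS column, unreaped children are the cause.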
Solution
The solution is to cull the zombie processes. Per Selenium leaves behind running processes?, Docker has an --init flag, and it is the key to the solution: run docker run --init ... (note the --init). Per https://docs.docker.com/engine/reference/run/: "You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container." To be confident that the solution will work for you, run your image with docker run --init ... and re-trigger the calls to Selenium. This time the PIDS count may grow, but not without bound (for me it never passed 200).
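To illustrate what "reaping" means in the Docker documentation quoted above, here is a small sketch, assuming Linux with /proc and Python 3. It starts a child process, lets it exit without waiting on it (which leaves a zombie), and then waits on it, which removes the entry from the process table. An init process running as PID 1 does the equivalent for orphaned processes that are re-parented to it. The helper names process_state and zombie_then_reap are hypothetical.

```python
import subprocess
import sys
import time


def process_state(pid):
    """Return the one-letter state of pid from /proc, or None if it is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            return f.read().rsplit(")", 1)[1].split()[0]
    except (FileNotFoundError, ProcessLookupError):
        return None


def zombie_then_reap(timeout=5.0):
    """Create a zombie child, then reap it; return its state before and after."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    deadline = time.monotonic() + timeout
    # Poll /proc rather than child.poll(), because poll() would itself reap.
    while process_state(child.pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.05)
    before = process_state(child.pid)
    child.wait()  # collecting the exit status is what reaps the zombie
    after = process_state(child.pid)
    return before, after


if __name__ == "__main__":
    print(zombie_then_reap())
```

The first returned value is the child's state while nobody has waited on it, and the second is its state after the wait. Without --init, nothing plays this role for the processes that Selenium leaves behind, so their entries accumulate until the container hits its PID or thread limit.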
Solution - AWS
References are https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-taskdefinition-linuxparameters.html and https://www.ernestchiang.com/en/posts/2021/using-amazon-ecs-exec/
Under Task Definitions (search for 'ECS' -> click on Task Definitions), select your task definition. Then scroll to the bottom and click on Configure via JSON. Next, find linuxParameters and, if it is null, replace null with the value:
{
"initProcessEnabled": true
}
If you already have a JSON value for linuxParameters, just add "initProcessEnabled": true to it as a JSON property. Next, create your task definition and deploy!
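The same edit can also be expressed programmatically. The sketch below is an illustration under stated assumptions: it operates on a task definition that has already been loaded into a Python dict (for example, from the JSON shown in the console), and the function name enable_init_process and the sample values are hypothetical. It sets initProcessEnabled in the linuxParameters of each container definition and leaves any existing linuxParameters keys in place.

```python
import copy
import json


def enable_init_process(task_definition):
    """Return a copy of task_definition with initProcessEnabled set to true.

    linuxParameters lives on each entry of containerDefinitions; a missing
    or null value is replaced, an existing object is extended.
    """
    updated = copy.deepcopy(task_definition)
    for container in updated.get("containerDefinitions", []):
        params = container.get("linuxParameters") or {}
        params["initProcessEnabled"] = True
        container["linuxParameters"] = params
    return updated


if __name__ == "__main__":
    # Hypothetical, heavily trimmed task definition for illustration only.
    example = {
        "family": "selenium-worker",
        "containerDefinitions": [
            {"name": "app", "image": "your_image", "linuxParameters": None},
        ],
    }
    print(json.dumps(enable_init_process(example), indent=2))
```

The function returns a new dict rather than mutating its input, so the original definition stays available for comparison before you register the new revision.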
Solution - Other
I have not used Google Cloud or Microsoft offerings, so I do not know how to add the --init flag there. If someone with such experience could tell me how to do that, I would be happy to update the guide.
