We have a Django application that periodically consumes a third-party API to fetch a large amount of data for a set of users. The tasks perform fine and fulfill their purpose, but after a while we start getting too-many-connections errors from Postgres:

FATAL: sorry, too many clients already

Info

The project is dockerized and all components run in separate containers, including the app and the database (Postgres). The periodic tasks are executed by dramatiq and scheduled with periodiq. We also use Redis as the message broker.

I have tried various workarounds to make it stop, including several solutions proposed here on SO, but none of them worked.

Attempt 1

I called connection.close() before and after each task execution to make sure no ghost connections are left open by the workers.
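
For reference, the cleanup was wired in roughly like this; a minimal sketch using dramatiq's middleware hooks (the middleware class, its registration, and the Redis URL are illustrative, not our exact code):

import dramatiq
from django.db import close_old_connections
from dramatiq.brokers.redis import RedisBroker
from dramatiq.middleware import Middleware

class DbConnectionCleanupMiddleware(Middleware):
    # Close stale Django DB connections before and after every message.
    def before_process_message(self, broker, message):
        close_old_connections()

    def after_process_message(self, broker, message, *, result=None, exception=None):
        close_old_connections()

broker = RedisBroker(url="redis://redis:6379/0")  # placeholder URL
broker.add_middleware(DbConnectionCleanupMiddleware())
dramatiq.set_broker(broker)

If you use django_dramatiq, its DbConnectionsMiddleware is supposed to perform the same cleanup.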

Attempt 2

Add a task limiter to cap the number of active connections at any given time and prevent the database from being overwhelmed. While this solution does not even serve the actual scope of our implementation, since it obviously reduces task throughput, it did not help with the problem either. The limiter was along the lines of the sketch below.
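
A sketch using dramatiq's ConcurrentRateLimiter (the actor name, key, limit, and Redis parameters are illustrative):

import dramatiq
from dramatiq.rate_limits import ConcurrentRateLimiter
from dramatiq.rate_limits.backends import RedisBackend

backend = RedisBackend(host="redis")  # connection params are placeholders
# at most 8 tasks may hold a DB connection at the same time
DB_LIMITER = ConcurrentRateLimiter(backend, "db-tasks", limit=8)

@dramatiq.actor
def fetch_user_data(user_id):
    with DB_LIMITER.acquire():
        ...  # call the third-party API and persist the results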

Attempt 3

Increase the connection limit for Postgres. As proposed here, I added a custom configuration file to raise the available connections. This had an effect, but it only postponed the appearance of the error; as expected, it did not stop it from happening. I even reached the very high limit of 100K connections shown below (the Postgres default is 100). I post the configuration file here in case it helps.

Note: the app runs on premise on a server with 24 cores and 128 GB of RAM, and it uses no more than 1% of those resources while performing the tasks.

max_connections = 100000    # Postgres ships with a default of 100
shared_buffers = 64GB

Attempt 4

I inserted Pgpool-II into the project to queue requests to the db. This did prevent the db from being overwhelmed, but it was not a practical solution: connections ended up waiting in the queue forever, which made the db just as unusable. The relevant configuration is sketched below.
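
An illustrative excerpt of the Pgpool-II configuration (hostnames and values are placeholders); num_init_children caps the number of concurrent clients, and anything beyond it queues, which matches the waiting behaviour we saw:

# pgpool.conf (illustrative excerpt)
listen_addresses = '*'
port = 9999
backend_hostname0 = 'postgres'   # docker service name of the db container
backend_port0 = 5432
num_init_children = 32           # max concurrent client sessions; extras queue
max_pool = 4                     # cached backend connections per child process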

Attempt 5

Use the CONN_MAX_AGE = 0 setting to prevent Django from creating persistent connections. That had no effect either.
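
The setting lives in the Django database configuration; a minimal sketch (name, user, and host are placeholders):

# settings.py
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app_db",      # placeholder
        "USER": "app_user",    # placeholder
        "HOST": "postgres",    # docker service name
        "PORT": "5432",
        "CONN_MAX_AGE": 0,     # close the connection at the end of each request/task
    }
}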

Attempt 6

Make the tasks run inside an atomic transaction block. That did not seem to help either.
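
Concretely, the task body was wrapped roughly like this (the actor name and body are illustrative):

import dramatiq
from django.db import transaction

@dramatiq.actor
def sync_user(user_id):
    # run all of the task's DB work inside a single transaction
    with transaction.atomic():
        ...  # fetch from the third-party API and save the results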


I think that the way the tasks are executed in parallel threads on the dramatiq worker causes the connections to stay open but idle. I tried to close the connections manually from the dramatiq and periodiq containers, but this did little to no good at fixing the connection pool issue.
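
One way to confirm the idle-connection theory is to ask Postgres directly; a quick check that can be run from a Django shell (manage.py shell):

from django.db import connection

# Group the server-side sessions by state: a growing pile of "idle" rows
# while the workers run would support the ghost-connection theory.
with connection.cursor() as cursor:
    cursor.execute("SELECT state, count(*) FROM pg_stat_activity GROUP BY state;")
    for state, count in cursor.fetchall():
        print(state, count)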

I tried all the variants I found on SO:

# Command 1: close the default connection
from django.db import connection
connection.close()

# Command 2: close every open connection
from django.db import connections
for c in connections.all():
    c.close()

# Command 3: close connections that have outlived CONN_MAX_AGE
from django.db import close_old_connections
close_old_connections()
