9

I get the "FATAL: sorry, too many clients already" every now and then because I have a lot of idle connecions in Postgres, and I cannot understand where they are coming from or how to prevent them.

At first I tried the CONN_MAX_AGE setting in Django but it does not seem to have an effect.

I also set idle_in_transaction_session_timeout to 5min in Postgres, but I keep seeing a lot of idle transactions:

postgres=# select client_addr, state, count(*) from pg_stat_activity group by client_addr, state;
  client_addr  | state  | count 
---------------+--------+-------
               |        |     5
               | active |     1
               | idle   |     1
 172.30.12.148 | idle   |     2
 172.30.12.74  | idle   |    89
(5 rows)
postgres=# select client_addr, state, backend_start, query_start from pg_stat_activity order by query_start ;
  client_addr  | state  |         backend_start         |          query_start          
---------------+--------+-------------------------------+-------------------------------
               | idle   | 2020-03-24 20:03:16.060707+00 | 2020-03-24 20:55:17.020962+00
 172.30.12.74  | idle   | 2020-03-25 02:05:32.567976+00 | 2020-03-25 02:05:32.613112+00
 172.30.12.74  | idle   | 2020-03-25 02:05:34.926656+00 | 2020-03-25 02:05:34.945405+00
 172.30.12.74  | idle   | 2020-03-25 02:05:49.700201+00 | 2020-03-25 02:05:49.717165+00
[...]
 172.30.12.74  | idle   | 2020-03-25 04:00:51.019892+00 | 2020-03-25 04:01:22.627659+00
 172.30.12.74  | idle   | 2020-03-25 04:04:18.333413+00 | 2020-03-25 04:04:18.350539+00
 172.30.12.74  | idle   | 2020-03-25 04:04:35.157547+00 | 2020-03-25 04:05:16.746978+00
 172.30.12.74  | idle   | 2020-03-25 04:05:08.241291+00 | 2020-03-25 04:05:39.367247+00
 172.30.12.148 | idle   | 2020-03-25 04:07:02.717151+00 | 2020-03-25 04:07:02.726822+00
 172.30.12.74  | idle   | 2020-03-25 04:07:48.07922+00  | 2020-03-25 04:07:48.112819+00
               | active | 2020-03-25 04:00:10.608213+00 | 2020-03-25 04:07:57.336091+00
               |        | 2020-03-24 19:40:38.624442+00 | 
               |        | 2020-03-24 19:40:38.624876+00 | 
               |        | 2020-03-24 19:40:38.624003+00 | 
               |        | 2020-03-24 19:40:38.623479+00 | 
               |        | 2020-03-24 19:40:38.62598+00  | 
(99 rows)

I understand that Django maintains one connection per thread, but (if I can believe that snippet) I only have one:

root@omaha-server-public-565447b47c-c2nqh:/usr/src/app# python manage.py shell        
Python 3.7.1 (default, Nov 16 2018, 22:26:09) 
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import threading
>>> for thread in threading.enumerate(): print(thread.name)
... 
MainThread
>>> 

So why, when I list connections to my DB (running at 10.100.59.225) I see so many ESTABLISHED connections ?

root@omaha-server-public-565447b47c-c2nqh:/usr/src/app# netstat -natup | grep  10.100.59.225 | wc -l
89

I am new to Django, Python and Postgres, so I guess I have overlooked something obvious, but my searches have not brought anything useful yet so I'm trying here :-)

Version information:

  • Django 2.2
    • django-cacheops 4.1
    • psycopg2 2.7.3.2
  • Postgres 12.2
  • Python 3.7.1
glatapoui
  • 363
  • 4
  • 14
  • 3
    The `shell` management command is not able to give any information about the threading in the Django server since it is its own process with its own thread(s). – Klaus D. Mar 25 '20 at 05:00
  • 1
    Looking at pg_stat_activity.query (the last query they ran before going idle) might help you to know where they are coming from. – jjanes Mar 25 '20 at 14:23
  • Thanks @KlausD., I will retry with `gdb` and edit my post with that info. – glatapoui Mar 25 '20 at 17:18
  • Based on @jjanes's suggestion, I analyzed the queries and it is +/- always the same one. I guess I narrowed it down to [a cached QuerySet](https://github.com/dentalwings/omaha-server/blob/master/omaha_server/omaha/builder.py#L79), but I still don't see why the connection is maintained. If I understand properly [Django's auto close mechanism](https://github.com/django/django/blob/2.2/django/db/__init__.py#L60) this should not happen. From what I gathered there is no direct c onnection created to the DB in the app itself, it all relies on other libs to do so (cacheops, rest_framework,...) – glatapoui Mar 25 '20 at 20:27
  • I just ran into the same problem on a fast CentOS box, a Ruby gem direct into PostgreSQL 10. Multiple clients run into the same capacity. I tested it with 12 connections (4 cores), which was recommended by a PostgreSQL text on the subject of optimization (3x core count). My queries are dirt simple and are actually returned, then stay idle, then repeat the process. I have 5 queries asking for last login. The table is the size of a test table. This sounds like a PostgreSQL issue. – Rich_F Jun 23 '20 at 23:40

3 Answers3

7

This question already exists for some time, but if anyone ever gets here, this is the problem I faced.

The development server (when you run manage.py runserver) is multi-threaded by default, which means that every request was creating its own connection, and i had a server with a pooling endpoint. I don't know if this will help anyone, but remember to check this possibility, run the server passing --nothreading to the runsever command.

https://docs.djangoproject.com/en/2.1/ref/django-admin/#cmdoption-runserver-nothreading

  • When is a connection closed? It seems once created a connection opens forever that why I faced this error while I was the only user of the app. – Tien Do Aug 05 '22 at 08:08
  • Using this option just helped me to confirm that my Django project is keeping the connection alive, as the server is totally blocked after the first request. – David Dahan Oct 13 '22 at 11:52
3

Actually you have more idle sessions than idle transaction in idle sessions: this looks like a possible connection leak on application side. PostgreSQL does not have timeout for idle sessions that don't run any transaction. A possible workaround on PostgreSQL side is to schedule a job to kill these idle sessions: see Is there a timeout for idle PostgreSQL connections?

pifor
  • 7,419
  • 2
  • 8
  • 16
  • Thanks ! I misunderstood the difference between tx and session. I would have liked to solve the leak issue rather than introduce another dependency. It seems I will have to do that, I tried the trigger but couldn't make it work: ``` postgres=# CREATE OR REPLACE FUNCTION terminate_idle_sessions() RETURNS trigger LANGUAGE plpgsql AS $function$ BEGIN SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE pid <> pg_backend_pid() AND state = 'idle' AND state_change is not null AND current_timestamp - state_change > INTERVAL '5' MINUTE; END; $function$ ; ``` – glatapoui Mar 25 '20 at 21:57
1
  • manually close those idle pg connections with two methods, it's work for me.
# core codes
from django.db import close_old_connections
from django.db import connection
close_old_connections()
with connection.cursor() as cursor:
    sql = "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle'"
    print(sql)
    cursor.execute(sql)
    row = cursor.fetchall()
    print(row)
  • I use these codes in an api decorator, and if too many client in the error message, I will call these codes.
# pip install bddjango

# use the api_decorator for views that you often use. 
from bddjango import api_decorator

class Task(APIView):

    @api_decorator
    def get(self, request):
        pass

bode liang
  • 119
  • 3
  • 1
    This might be needed if it's not a full web app; Django closes old connections in normal use, but not always when you only use some pieces (eg only the ORM) – Jiří Baum Dec 28 '21 at 09:39