
What are the differences between the django apps (Django-PostgresPool, djorm-ext-pool, django-db-pool) and PG Bouncer or PG Pool?

Do the apps use one of the last two packages?

In this article, the author says that there is a patch starting with Django 1.6. Does that mean we no longer have to use any of these solutions — neither the apps, nor the PG Bouncer or PG Pool packages?

Michael

2 Answers


Postgres database connections are expensive (in resources) compared to MySQL connections. Django pooling apps will open many connections and keep them open.

PG Bouncer and PG Pool will open fewer connections to Postgres, while maintaining a large number of local connections (app to PG Bouncer/PG Pool) and reusing them.

For best performance you want both: persistent connections from Django to PG Pool / PG Bouncer.

In our case, switching to persistent connections reduced average response time by 10ms (over 20%) on AWS.
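As an illustration of the recommended setup, here is a hedged sketch of the Django side — assuming PgBouncer is listening locally on port 6432 (its usual default) and that the database name and credentials below are placeholders:

```python
# settings.py (sketch): Django holds a persistent connection to PgBouncer,
# while PgBouncer multiplexes a small pool of real Postgres connections.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mydb",          # hypothetical database name
        "USER": "myuser",        # hypothetical credentials
        "PASSWORD": "secret",
        "HOST": "127.0.0.1",     # PgBouncer's address, not Postgres itself
        "PORT": "6432",          # PgBouncer's usual listen port
        "CONN_MAX_AGE": 600,     # Django >= 1.6: reuse the connection for 10 minutes
    }
}
```

With this in place each Django process keeps its socket to PgBouncer alive between requests, and PgBouncer decides how many real Postgres connections to keep behind it.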

c2h5oh
  • Are the apps doing more than "persistent connections" like it is now done with django > 1.5? I mean is there still a need of using the django apps with django > 1.5 given the patch that appeared in 1.6? Thanks – Michael Oct 20 '14 at 16:29
  • All 3 apps are trying to get the same result as PGBouncer, just in Django. That means that at best you've got pool limits per webserver, at worst per process. – c2h5oh Oct 20 '14 at 21:04
  • Ok I see. So if I have 3 gunicorn workers and 3 queue workers on the same machine, I'll have 6 connections using the apps, while I can have 1 connection using PGBouncer or PGPool, correct? What if a process dies and a new one is created — will there be 6 or 7 connections? Are the connections closed automatically by PostgreSQL at some point, or is there a risk of getting "zombie" connections over time? – Michael Oct 20 '14 at 22:16
  • Connections time out, don't worry. If you had processes rapidly dying and being restarted it could possibly be a problem, but that's very unlikely to happen. – c2h5oh Oct 21 '14 at 00:59
  • Ok so even when using a Connection pool, the connections are not kept alive forever. Is the timeout setting set on the PostgreSQL Server or via my app? – Michael Oct 21 '14 at 13:50
  • To be honest I'm not sure. My guess is both - you can set up TCP keep-alives server side and Django will recycle connection after a predefined time (at least if you use persistent connections, I don't know if the apps you've listed change that behaviour) – c2h5oh Oct 27 '14 at 06:12

@c2h5oh has a great answer above. I would like to add one thing concerning the Django 1.6 update. I believe what you and the article's author are referring to is the CONN_MAX_AGE setting.

I found this question because I was searching for the same thing myself, so I'm not sure about the following, but allow me to hypothesize:

You should be able to use all three tools together:

  1. CONN_MAX_AGE (django persistent connections)
  2. django-postgrespool (pooled connections to PgBouncer)
  3. PgBouncer (pooled connections to db)
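A sketch of how those three layers could be stacked in settings (the names, host, and port are hypothetical; the `django_postgrespool` engine string and the `DATABASE_POOL_ARGS` setting are as described in the package's documentation of that era, but treat them as assumptions to verify):

```python
# settings.py (sketch): all three layers stacked.
DATABASES = {
    "default": {
        "ENGINE": "django_postgrespool",  # layer 2: SQLAlchemy-based pool in the app
        "NAME": "mydb",
        "USER": "myuser",
        "PASSWORD": "secret",
        "HOST": "127.0.0.1",  # layer 3: points at PgBouncer, which pools to Postgres
        "PORT": "6432",
        "CONN_MAX_AGE": 0,    # layer 1: likely redundant here, as discussed below
    }
}

# django-postgrespool reads its pool options from this setting (hedged):
DATABASE_POOL_ARGS = {
    "max_overflow": 10,  # extra connections allowed beyond pool_size
    "pool_size": 5,      # connections kept open per process
    "recycle": 300,      # recycle a connection after 300 seconds
}
```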

I know that #2 and #3 play nicely as evidenced by Heroku's article on connection pooling, but I'm not sure about how #1 and #2 interact.

I'm guessing the savings of using #1 and #2 together are pretty slim. Django-postgrespool is essentially designed to save connection time, but your requests still have to check connections out of that pool, so CONN_MAX_AGE would only be saving you an aggregate of very small connection times. In addition, if you're using Heroku, CONN_MAX_AGE could possibly interfere with automatic dyno restarts (just a guess).

Note that if you're using a web server like Gunicorn, you may need to make your workers synchronous in order to prevent a connection leak.
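For example, assuming a project named `myproject` (the project name and worker count are illustrative), a sync-worker Gunicorn invocation would look like:

```shell
# Sync workers: each worker handles one request at a time, so it holds at most
# one checked-out DB connection, avoiding the leak seen with async worker classes.
gunicorn myproject.wsgi:application --workers 3 --worker-class sync
```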

grokpot