43

I'd like to better understand Connection Pooling when using Npgsql for Postgres. (http://www.npgsql.org/)

When I use the connection string:

UserID=root;Password=myPassword;Host=localhost;Port=5432;Database=myDataBase;Pooling=true;Minimum Pool Size=0;Maximum Pool Size=100;

Where is the "Pooling" going to take place? On my application server or on the database?

When I call connection.Open(), what happens? Is a connection taken from the pool if one exists and if not, a pool is created?

Any other general info around Connection Pooling would be appreciated.

Thanks.

Chris
  • 3,113
  • 5
  • 24
  • 46
  • Min and Max Pool Size are now named: "Minimum Pool Size" and "Maximum Pool Size" http://www.npgsql.org/doc/connection-string-parameters.html#pooling – mpiliszcz Feb 26 '21 at 09:49

1 Answers1

65

Npgsql connection pooling is implemented inside your application process - it has nothing to do with PostgreSQL, which is completely unaware of it.

The mechanism is very simple. When you close a pooled connection, instead of physically closing the connection to PostgreSQL the physical connection is kept around idle in memory (in a "pool"). The next time you open a new connection, if its connection string matches a physical connection already present in the pool, that physical connection is reused instead of opening a new physical connection.

Since opening/closing physical connections is an expensive process, this considerably speeds up your application.

Shay Rojansky
  • 15,357
  • 2
  • 40
  • 69
  • When using pooling in a web application, when the session is closed, does the pool still stay open? – Daniel Gee Oct 03 '18 at 12:07
  • Database connection pooling has nothing to do with web sessions. When you close/dispose NpgsqlConnection, the physical connection gets returned to the pool and can be reused when someone else calls Open(). – Shay Rojansky Oct 03 '18 at 12:35
  • @ShayRojansky, I've set Pooling=True in my connection string and I still see hundreds of instances of postgres.exe Running and Terminated every second on the server. Should the pooling not stop that from happening? – Daniel Gee Oct 03 '18 at 13:10
  • That's right, it should not be happening - assuming your application isn't restarting or itself spawning processes which connect (the pool is internal to you're process). Make sure that's the case. – Shay Rojansky Oct 03 '18 at 13:13
  • Could you please give me some advice on how to look for that? I have a web application in which I use one class to work with the database. For every query I open the connection, run the query and then close the connection. – Daniel Gee Oct 03 '18 at 13:20
  • @ShayRojansky, also, once I switch on pooling, within 20 minutes or so, I get this error hundreds of times a minute: The connection pool has been exhausted, either raise MaxPoolSize (currently 100) or Timeout (currently 15 seconds) – Daniel Gee Oct 03 '18 at 13:25
  • Seems default is set to True 1/100 so there's no need to add those parameters right? https://www.npgsql.org/doc/connection-string-parameters.html – Dan Parker Mar 12 '19 at 18:36
  • Could you tell what happens if all the connections in the pool are occupied by long-running queries? Is it possible to open one more connection immediately, or we wait for a released one? Will it be added to the pool? – Ilya Chumakov Apr 01 '19 at 10:40
  • Well, once you reach the maximum pool size, and all connections are busy, then attempting to open another connection will block until one of them is released (that's the point of having a maximum pool size). – Shay Rojansky Apr 01 '19 at 13:19
  • @ShayRojansky I think that when you refer to "physical connection" you are making a reference to a network connection (like a TCP socket connected from client to PostgreSQL Server). They are not really "physical connections" like a USB, Serial or Parallel connection that a computer may have with a device (a USB camera for example). – Ignacio Jun 01 '19 at 23:36
  • 3
    @ignacio yes, it's understood within the context of this conversation that "physical connection" refers to a TCP socket connection (which is very expensive to create) as opposed to a pooled connection (which is very lightweight). – Shay Rojansky Jun 02 '19 at 03:04
  • 4
    @ShayRojansky thank you for your reply and please, let me congratulate you because of the great job you do with Npgsql. If anyone want to go deeper in how a connection is build can check this link https://www.postgresql.org/docs/11/protocol-flow.html . That link describes the process from the PostgreSQL server point of view but we can get an idea of why opening a connection is expensive. We have to also add the tasks the operating systems have to do to open, maintain and close the TCP connection (through sockets). – Ignacio Jun 02 '19 at 13:10
  • @ShayRojansky Say, we have one web server connects to multiple DB servers and we are setting the max pool size to 100. Does it mean that, connection to each DB server can have maximum 100 connections in the pool? Total number of pool connection the web server can go up to 100 * Number of DBs ? – Kurubaran Oct 07 '19 at 08:04
  • 2
    That's right. Each connection string has its own pool, and the MaxPoolSize is part of that connection string, so also part of that pool's setting. Different pools (and connection strings) can have different maximum pool sizes. So you may have DB1 with MaxPoolSize=5, and DB2 with MaxPoolSize=100, coming up to a total have 105 physical connections in total. – Shay Rojansky Oct 07 '19 at 10:18
  • 3
    What happens to "set" variables https://www.postgresql.org/docs/current/sql-set.html? Are they unique or re-used from the pool? Say I open a connection connection1 and set a var to "val1". Then I open 2 other connections, both of them re-use connection1. Then I set "val2" and "val3" for the variable in different instances. Is it safe or there might be a race-condition? – Serg046 Feb 17 '21 at 20:03
  • 6
    @Serg046 by default, when a physical connection is returned to the pool, a `DISCARD ALL` command is enqueued which will be executed the next time that connection is used; this resets all parameters (and other connection state) to their defaults, as if a new physical connection was opened. This prevents state leakage through the pool. – Shay Rojansky Feb 18 '21 at 08:28
  • Thank you @ShayRojansky, I guess if I try to re-use conn1 by new conn2 and conn3, then I actually re-use it only by conn2 and conn3 will create a new connection as conn1 is already held by conn2. – Serg046 Feb 18 '21 at 15:07
  • I'm not sure what you mean by conn1 being held by conn2... Every time you call NpgsqlConnection.Open an idle physical connection will be taken out of the pool, and when you call Close/Dispose it will be returned, after making sure its state is reset when next used. – Shay Rojansky Feb 18 '21 at 16:16
  • 1
    @ShayRojansky Is `DISCARD ALL` still being done in current versions of npgsql? I'm seeing advisory locks being held by connections in the pool, so apparently `DISCARD ALL` is not done. I'm using the connection through EF Core DbContext database facade, not directly. The connection is made persistent for the lifetime of the DbContext by explicitly calling `OpenConnection()`. – Boris B. Jan 11 '22 at 12:24
  • Yes, DISCARD ALL is done unless you have `No Reset On Close=true` in your connection string. If you're seeing something else, please open an issue with a minimal code sample. – Shay Rojansky Jan 12 '22 at 10:30
  • That's the nice answer. But what is the relationship between `Appsetings` -> `Maximum Pool Size=200;` with `PostgreSQL` --> `SHOW max_connections;`?. btw +1 from my respect. @ShayRojansky – Nguyễn Văn Phong May 04 '22 at 02:05
  • PG max_connections is the max number of physical connection PG will allow - server-side. Npgsql's Max Pool Size is a client-side setting for the pool, determining how many physical connections it may attempt to hold at any given point (attempts to open more would block until other connections are released). – Shay Rojansky May 04 '22 at 10:42
  • Concretely, your PG max_connections should be at least the sum of all Npgsql Max Pool Size settings across *all* clients. In other words, two app servers with Max Pool Size=50 may require up to 100 total connections on the PG side. – Shay Rojansky May 04 '22 at 10:42
  • Btw, Would I add your useful comment on [my post](https://stackoverflow.com/a/71948075/9071943)? Thanks @ShayRojansky – Nguyễn Văn Phong May 13 '22 at 04:10
  • @ShayRojansky If each app server has say 4 DbContexts registered, and max pool size is 300 in connection string passed to each dbcontext, would this mean each app server can pool 1200 connections or just 300 connections. – adnan kamili May 26 '22 at 19:07
  • @ShayRojansky provided connection strings only differ in DB name, DB server is same – adnan kamili May 27 '22 at 05:06
  • Assuming each DbContext has a distinct connection string (e.g. since DB name is different), then yeah - each app server can use up to 1200 connections (which seems like a lot). Each connection string gets its own pool. – Shay Rojansky May 28 '22 at 18:09
  • @ShayRojansky When you refer to the process of "opening a connection", do you mean 'var conn = new NpgsqlConnection(s); conn.Open();' or just 'conn.Open()'? In other words, is the act of constructing a NpgsqlConnection an expensive or cheap operation? – unique_ptr Sep 20 '22 at 08:17
  • 2
    Constructing an NpgsqlConnection is extremely cheap, it's a trivial object allocation. conn.Open is what actually opens it; that's also extremely cheap if pooling is on and an idle connection is available. It's only expensive it you need to open a new physical connection, which should be a generally rare thing. – Shay Rojansky Sep 20 '22 at 19:57