6

I have been trying to configure PgPool to accept about 150 connection requests. The Postgres server is configured to accept only 100 connections; anything beyond 100 needs to be pooled by PgPool, but I don't seem to get that. I only require PgPool to queue the requests, and my current configuration does not do that. In my JMeter test, when I try to open connections beyond 100, Postgres gives me an error saying: sorry, too many clients already.

I have only configured PgPool with the following parameters:

listen_addresses = 'localhost'
port = 9999
backend_hostname0 = 'localhost'
backend_port0 = 5432
num_init_children = 100
max_pool = 4
child_life_time = 120
child_max_connections = 0
connection_life_time = 120
client_idle_limit = 0

Since I only require PgPool to queue the extra connection requests, is the above configuration correct? Please advise on the proper configuration.

Limnic

2 Answers

5

The 'child_max_connections' in pgpool is NOT the maximum allowed connections to the DB. It is the number of client connections a pgpool child process serves before it is terminated and restarted. It is there to recycle child processes and stop memory leaks.
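
A minimal sketch of what that parameter controls (the value here is arbitrary, not taken from the question):

child_max_connections = 100
# after serving 100 client connections, this pgpool child process
# is terminated and re-forked, releasing any memory it may have leaked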

The product max_pool x num_init_children is the maximum number of connections that pgpool will make to PostgreSQL. Obviously, this needs to be less than the 'max_connections' set in postgresql.conf, otherwise pgpool marks the DB as an unavailable backend. And if you have some DB connections reserved for admin use, you need to reduce the number of pgpool connections further.

So, what I am saying is that the 'max_connections' in the formula is the parameter set in postgresql.conf. Setting 'child_max_connections' to 100, as in the comment below, just means that the pgpool connection is closed and reopened after every 100 uses.
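
For illustration, a hypothetical example (these numbers are not from the question): suppose postgresql.conf has max_connections = 100 and superuser_reserved_connections = 3. Then the pgpool side must satisfy max_pool x num_init_children <= 97, for instance:

num_init_children = 24
max_pool = 4
# 24 * 4 = 96 <= 100 - 3, so pgpool can never exhaust the
# connection slots PostgreSQL makes available to ordinary users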

user2606721
2

The first thing is to figure out what you want as your maximum pool size. PostgreSQL performance (both in terms of throughput and latency) is usually best when the maximum number of active connections is somewhere around ((2 * number-of-cores) + effective-spindle-count). The effective spindle count can be tricky to figure -- if your active data set is fully cached, count it as zero, for example. Don't count any extra threads from hyperthreading as cores for this calculation. Also note that due to network latency issues, you may need a pool slightly larger than the calculated number to keep that number of connections active. You may need to do some benchmarks to find the sweet spot for your hardware and workload.

The setting you need to adjust is num_init_children: that is the number of clients pgpool serves concurrently, and any clients beyond it are queued until a slot frees up. Keep max_pool x num_init_children within PostgreSQL's max_connections.
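
As a sketch, assuming a 4-core machine with a fully cached data set (so the effective spindle count is zero), the target is about 2 * 4 + 0 = 8 active connections; a configuration along these lines queues the remaining clients rather than rejecting them:

num_init_children = 8
# only 8 clients are served at once; additional clients wait in pgpool's queue
max_pool = 1
# one cached backend connection per child (assuming a single
# user/database pair), so PostgreSQL sees at most 8 connections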

kgrittn
  • 2*number-of-cores + effective-spindle-count for the num_init_children parameter? Could you give an example? I am not sure I understood. – Murali_dharan_raju Apr 30 '12 at 06:25
  • Let's say you have a quad core processor with hyperthreading. That's four cores, so you can probably benefit from up to eight active database connections. Let's say you have a six-drive RAID 10 for data. If you are fully cached, you probably don't add anything, because the drives are not being used for a lot of random reads. If you are disk-bound, then you count that as six, getting you to 14 connections. If you are partially cached, effective cache size is probably somewhere in between. Start from there and test slightly larger and smaller pools with your workload. – kgrittn Apr 30 '12 at 12:21
  • Well, this statement from the PgPool wiki helped me: "In summary, max_pool, num_init_children, max_connections, superuser_reserved_connections must satisfy the following formula: max_pool*num_init_children <= (max_connections - superuser_reserved_connections) (no query canceling needed)." I changed my configuration to `listen_addresses = 'localhost' port = 9999 backend_hostname0 = 'localhost' backend_port0 = 5432 num_init_children = 25 max_pool = 4 child_life_time = 120 child_max_connections = 100 connection_life_time = 120 client_idle_limit = 25` – Murali_dharan_raju May 01 '12 at 06:14
  • That should prevent the errors. It is, however, more connections than PostgreSQL can efficiently use. You will be able to serve more clients with better throughput and faster response time if you reduce the number of database connections. Some people find that counter-intuitive; I recommend you test your actual mix with different sizes when you can. – kgrittn May 01 '12 at 11:52
  • I'm also looking into PgBouncer. Which one would you recommend? And how can I configure it for multiple databases? – Murali_dharan_raju May 01 '12 at 14:52
  • I haven't used either; both have their enthusiasts. I've used connection pools built in to or related to the client software. Where a good client-side pooler is available I think it is generally better than using a generalized tool. – kgrittn May 01 '12 at 18:06
  • Well now, it seems like most statistics online say PgBouncer is the best and can do pooling efficiently. I think I am going to go with PgBouncer! – Murali_dharan_raju May 07 '12 at 11:21