
The Problem

You have several node processes running in a cluster, one per core, and the cluster spans several VMs, so many processes are running in total. Each process is independent, and you're load balancing across them so their average workloads are similar.

You're running postgres, and you want to take full advantage of connection pooling, since the cost of establishing connections is otherwise quite high.

  1. You want to ensure you don't overload postgres with too many connections.
  2. You want to ensure the cluster is highly available even during periods of low load (i.e. you don't want a customer to incur the slowness of connection establishment if the service was previously idle).

Proposed Solutions

  1. Use a per-node/per-pod connection pool: node-pg-pool, or a sidecar PgBouncer setup (if you're using Docker, Kubernetes, or something similar).
  2. Use a connection pooling service that all node processes in the cluster connect to.

Is it okay to keep several connections idling across the cluster all the time to ensure high availability? Are there any particulars one should keep in mind so that this solution scales effectively?
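For concreteness, here is a minimal sketch of option 1 with node-postgres (assuming the `pg` package; the hostname, pool size, and warm-up count are illustrative, not recommendations):

```javascript
const { Pool } = require('pg');

// One pool per node process. All settings here are illustrative assumptions.
const pool = new Pool({
  host: 'db.example.internal',   // hypothetical database hostname
  max: 10,                       // cap per-process connections so the cluster
                                 // total stays under Postgres's connection limit
  idleTimeoutMillis: 0,          // 0 disables reaping of idle clients, so warm
                                 // connections stay around during quiet periods
  connectionTimeoutMillis: 5000, // fail fast if the server is unreachable
});

// Optionally pre-warm a few connections at startup so the first requests
// after a deploy or idle stretch don't pay the connection-establishment cost.
async function warmPool(n = 2) {
  const clients = await Promise.all(
    Array.from({ length: n }, () => pool.connect())
  );
  clients.forEach((c) => c.release());
}
```

Note that with this approach the budget is multiplicative: `max` × processes per VM × VMs must stay below Postgres's `max_connections` (minus any reserved superuser connections).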

SirensOfTitan
  • Does each VM have its own app server and own database server, or do all VMs point to one database? – jjanes Nov 11 '19 at 17:32
  • "(i.e. you don't want a customer to incur the slowness of connection establishment if use of the service was previously quite idle" Is that honestly a problem? How much latency does it add? In my hands it takes about 0.01 seconds to connect. If you do that once an hour, is that really a big deal? – jjanes Nov 11 '19 at 17:39
  • All VMs point to one database. We're seeing huge latency differences when trying to connect (like on a page load with several query executions like 1s loading differences, and we were pooling connections before locally), although to that question I haven't spent enough time asking why (My initial thought is perhaps DNS with node). – SirensOfTitan Nov 11 '19 at 22:58
  • Apparently Azure has *awful* connection performance: https://stackoverflow.com/questions/57094912/long-connection-time-with-azure-postgres – SirensOfTitan Nov 12 '19 at 15:31
  • If it is a DNS issue, then that seems to rule out option 2, as it would suffer the same problem, wouldn't it? What I've done in the past is just make a cron job run a DNS lookup every minute, to keep the DNS cache warm or whatever the issue was. (Better to tackle the underlying issue, but that would be someone else's job and they weren't getting it done) – jjanes Nov 12 '19 at 15:51

1 Answer


Keeping a few dozen truly idle (not 'idle in transaction') connections around all the time should not be a problem. A few hundred would be more concerning.

One thing that might be a problem is if idle connections get severed by some gateway or firewall: when the app goes to check out a connection, the pool would test it, find it invalid, and need to open a new one anyway. So if you can configure your pooler to periodically test idle connections (I don't know if/how node-pg-pool can do that), that might be a good idea.
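node-pg-pool doesn't have built-in idle-connection testing, but a lightweight periodic probe plus an error handler is a common workaround. A sketch assuming the `pg` package (the interval is a guess; tune it to whatever idle timeout your network gear enforces):

```javascript
const { Pool } = require('pg');

const pool = new Pool({
  keepAlive: true, // TCP keepalive on each socket helps stop gateways and
                   // firewalls from silently severing idle connections
});

// Severed idle clients surface as 'error' events on the pool; without a
// handler, an idle-client error crashes the node process.
pool.on('error', (err) => {
  console.error('idle client error', err.message);
});

// Periodically exercise the pool so dead connections are detected and
// replaced outside the request path, rather than on a customer's page load.
setInterval(() => {
  pool.query('SELECT 1').catch(() => {
    // A failed probe is fine: the pool discards the broken client and
    // will open a fresh connection on the next checkout.
  });
}, 60_000).unref(); // unref so the timer doesn't keep the process alive
```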

jjanes