
The background: I currently run some Kubernetes pods with a pgbouncer sidecar container. I've been running into annoying behavior with sidecars (which should be addressed in k8s 1.18); it has workarounds, but it has brought up a more basic question about how to run pgbouncer inside k8s.

Many folks recommend the sidecar approach for pgbouncer, but I wonder why running, say, one pgbouncer per machine in the k8s cluster wouldn't be better. I admit I don't have a deep enough understanding of either pgbouncer or k8s networking to understand the implications of either approach.


EDIT:

Adding context, as it seems like my question wasn't clear enough.

I'm trying to decide between two approaches of running pgbouncer in a kubernetes cluster. The PostgreSQL server is not running in this cluster. The two approaches are:

  1. Running pgbouncer as a sidecar container in all of my pods (a rough sketch of this is below). I have a number of pods: several replicas of a webserver deployment, an async job deployment, and a couple of cron jobs.
  2. Running pgbouncer as a separate deployment. I'd plan on running 1 pgbouncer instance per node on the k8s cluster.
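
For concreteness, this is roughly what I mean by (1). It is a sketch only; the names, image tags, port, and connection string are placeholders, not my actual manifests:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 5
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: registry.example.com/web:latest   # placeholder application image
            env:
            - name: DATABASE_URL
              # the app connects to the sidecar over localhost
              value: postgres://app@127.0.0.1:6432/appdb
          - name: pgbouncer
            image: pgbouncer:latest                   # placeholder pgbouncer image
            ports:
            - containerPort: 6432                     # pgbouncer's default listen port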

I worry that (1) will not scale well. If my PostgreSQL master has a max of 100 connections, and each pool has a max of 20 connections, I potentially risk saturating connections pretty early. Additionally, I risk saturating connections on the master during deploys, as new pgbouncer sidecars come up alongside the old pods that are still being terminated.
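To put hypothetical numbers on that worry (illustrative only, not my real settings): with 5 web replicas, 1 async worker, and 2 cron pods, that is 8 sidecars; at a default_pool_size of 20 each, that is up to 8 × 20 = 160 server connections against a max_connections of 100, and a rolling deploy can briefly roughly double the sidecar count. The relevant pgbouncer knobs, per sidecar, would look something like this:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: pgbouncer-config          # hypothetical name
    data:
      pgbouncer.ini: |
        [pgbouncer]
        pool_mode = transaction
        ; server connections per user/database pair, per sidecar
        default_pool_size = 20
        ; client connections this sidecar will accept
        max_client_conn = 100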

I, however, almost never see (2) recommended. It seems like everyone recommends (1), but the drawbacks seem quite obvious to me. Would the networking penalty I'd incur by connecting to pgbouncer outside of my pod be large enough to notice? Is pgbouncer perhaps smart enough to deal with many other pgbouncer instances that could potentially saturate connections?

SirensOfTitan
  • Kubernetes services don't have a concept of "local first" so it's harder to route traffic to a local service outside of the Pod. You get to choose between network latency or fault tolerance (and maybe some complexity to fix the fault tolerance). – Matt Feb 09 '20 at 01:52
  • If you could add some information about why and how your application uses PgBouncer and the apps deployment structure it might guide the answer a bit more. – Matt Feb 09 '20 at 01:52
  • @SirensOfTitan any update on this issue? – Will R.O.F. Feb 17 '20 at 14:02
  • @willrof: sorry, just updated! I realize I was incredibly unclear with my initial question. I'm not asking why I cannot run pgbouncer on a node, but instead why I would choose to run it as a sidecar on my pods over running a pgbouncer deployment with replicas >= the number of nodes in the cluster. I'm worried about connection saturation in the former case as the # of replicas grows. – SirensOfTitan Feb 18 '20 at 17:07

1 Answer


We run pgbouncer in production on Kubernetes. I expect the best way to do it is use-case dependent. We do not take the sidecar approach; instead we run pgbouncer as a separate "deployment", and the application accesses it via a "service". This is because, for our use case, we have one Postgres instance (i.e. one physical DB machine) and many copies of the same application accessing that same instance (but using different databases within that instance). Pgbouncer is used to manage the active-connections resource.

We pool connections independently for each application, because the nature of our application is to have many concurrent connections and not too many transactions. We currently run with 1 pod (no replicas), because that is acceptable for our use case as long as pgbouncer restarts quickly. Many applications each run their own pgbouncer, and each application has multiple components that need to access the DB (so each pgbouncer is pooling the connections of one instance of the application). It is done like this: https://github.com/astronomer/airflow-chart/tree/master/templates/pgbouncer
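A minimal sketch of that shape (this is not taken from the linked chart; the names, image, and ports are placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: pgbouncer
    spec:
      replicas: 1                     # a single pod is acceptable for our use case
      selector:
        matchLabels:
          app: pgbouncer
      template:
        metadata:
          labels:
            app: pgbouncer
        spec:
          containers:
          - name: pgbouncer
            image: pgbouncer:latest   # placeholder image
            ports:
            - containerPort: 6432
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: pgbouncer
    spec:
      selector:
        app: pgbouncer
      ports:
      - port: 6432
        targetPort: 6432

Applications then point their connection strings at the service's DNS name (e.g. pgbouncer.<namespace>.svc.cluster.local:6432) rather than at localhost.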

The above does not cover getting the credentials set up correctly for accessing the database; the linked template expects a secret to already exist. You will likely need to adapt the template to your use case, but it should help you get the idea.
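As a rough illustration only (the names and keys here are hypothetical; check the linked template for the exact secret it expects), the pre-existing secret could be something like a pgbouncer userlist:

    apiVersion: v1
    kind: Secret
    metadata:
      name: pgbouncer-credentials     # hypothetical name
    type: Opaque
    stringData:
      # pgbouncer userlist format: "username" "password"
      userlist.txt: |
        "app_user" "app_password"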

We have had some production concerns. Primarily, we still need to investigate how to replace or move pgbouncer without interrupting existing connections. We have found that the application's connection to pgbouncer is stateful (of course, because it's pooling the transactions), so if the pgbouncer container (pod) behind the service is swapped out for a new one, existing connections are dropped from the application's perspective. This should be fine, even when running pgbouncer replicas, if your application retries the occasional dropped connection and you make use of Kubernetes sticky sessions on the "service". Our organization still needs to investigate further to make this work perfectly.
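For the sticky-sessions part, sessionAffinity is a standard field on a Kubernetes Service; a sketch of what that could look like (whether it is enough for your case depends on how your clients reconnect):

    apiVersion: v1
    kind: Service
    metadata:
      name: pgbouncer
    spec:
      selector:
        app: pgbouncer
      sessionAffinity: ClientIP       # keep a given client IP on the same pgbouncer pod
      sessionAffinityConfig:
        clientIP:
          timeoutSeconds: 10800       # affinity window (this is the default value)
      ports:
      - port: 6432
        targetPort: 6432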

Steven Miller
  • One way to get by with running a single replica would be to put in a PodDisruptionBudget and set minAvailable to 1 (a sketch is shown after these comments). It's hacky, but it would prevent the pod from being evicted, thus protecting it from autoscaling operations. However, this also might mean that an unhealthy node causes the pod to not reschedule. – Derek Williams Jul 30 '21 at 15:48
  • Did you ever find a way to make this robust against pod restarts? We have a similar setup in our Kubernetes cluster, but we only use it for background processes and not for user-facing pods, like those handling web requests, because occasionally the node it's on will die and result in a slew of dropped connections. Not the end of the world, but it would be nice not to have thousands of errors from dropped connections come in, and it would be nice to make it robust enough to use for web requests as well. – mscrivo Mar 09 '22 at 19:02
  • We moved chart to OSS apache/airflow repo: https://github.com/apache/airflow/tree/main/chart/templates/pgbouncer – Andrii Soldatenko Jun 10 '22 at 15:11
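
A minimal sketch of the PodDisruptionBudget workaround mentioned in the comments above (the label selector is a placeholder matching the earlier sketches):

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: pgbouncer-pdb
    spec:
      minAvailable: 1
      selector:
        matchLabels:
          app: pgbouncer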