The Zalando Postgres Operator has a built in possibility to maintain a PgBoucer setup for your Postgres cluster. But what is PgBouncer and why should you consider using it?
PgBouncer is a so called connection pooler for Postgres databases. This means your application / client can connect to it, like if it is a Postgres database. PgBouncer will then create or manage the actual connection to the Postgres database. This brings a major benefit, creating a Postgres connection is a quite time consuming action and when your application does several hundret connects and disconnects per minute, it will cost you quite some time. Another upside when running PgBouncer on a Kubernetes cluster alongside a Zalando Operator managed Postgres cluster is, that you can use PgBouncer running behind and act itself as a loadbalancer (at least for your replica Postgres instances). One last nice thing is, that you can inject your signed SSL certificates not only to your Postgres Kubernetes pod, but also to the PgBouncer pod(s) that the operator will create. Meaning, PgBouncer will handle SSL connections for you and use your valid certificates.
How it works
Let dive a bit deeper in the work principals of PgBouncer, especially when using it with the Zalando Operator or the Zalando PgBouncer container image respectively. The Zalando PgBouncer image has some logic implemented on how some PgBouncer parameters are calculated or can be configured. If this hinders you, then you still can use another PgBouncer container image, but this is not covered in this article.
Let’s have a look into some of the parameters that deviate from PgBouncer default when the Zalando Operator manages your pooler pods.
pool_mode
: There are three pool modes in PgBouncer.session
,transaction
andstatement
. It controls how PgBouncer manages your connection pools. The Zalando default istransaction
which means, a database session is opened by PgBouncer (if none exist), a transaction will run on the database. When it’s finished up, the connection will be left intact to the Postgres database but will be marked as free and go back into the PgBouncer connection pool. Another client connection then can use the same (existing) database connection from the pool, to run another transaction and so on. Pools are managed as one pool per user / database pair.default_pool_size
: The Zalando Operator calculates this parameter value as follows. It takes the operator parameterconnection_pooler_max_db_connections
(default60
), divides it by the number of PgBouncer instances (default2
) and divides it again by 2. So if you leave everything by default, you will end up with adefault_pool_size
of15
. It controls how many server connections are allowed per user/database pair (pool).min_pool_size
: This parameter is explicitly commented out in thepgbouncer.ini
template of the container image. It seems as there was some fiddling with it in the past and there might be some more in the future. That’s why I mentioned it even if the PgBouncer default of0
applies now. It calculated asdefault_pool_size
divided by 2 by the Zalando Operator, but as mentioned, without effect.max_client_conn
: This value is hard coded with10000
. This means, a PgBouncer instance (so one pod) can not handle more than 10000 client connections. This is more or less to be seen as “unlimited”, since the resource sizing of the PgBouncer pods is quite limited and all sources I’ve checked recommend to “scale out” to a higher number of pods.- max_db_connections: The Operator calculates this as
connection_pooler_max_db_connections
(default60
), divides it by the number of PgBouncer instances (default2
). It controls the limit that PgBouncer will use as per database connection limit. So independent from the pool(s), there can’t be more Postgres connections opened.
A picture tells more than thousand words, so I drawed one for you which should show you, how PgBouncer works when running in transaction
pool mode:
Authentication
Since PgBouncer “intercepts” the connection between a client and the Postgres instance, it has to handle also the authentication. There are several parameters and configuration options you can use with PgBouncer by default. Zalando uses a database user called pooler
as auth_user
and a custom SECURITY DEFINER
function to check if the requested client connection has a valid user / password combination to logon to the database.
Configuration
This is really the easy part of the story. Assuming that you are fine with the Postgres Operator default configuration (you can configure them by the postgres-operator
configmap), you can start creating a postgresql
CRD which can look like this:
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
name: postgres-demo-cluster-pooler
namespace: postgres
spec:
teamId: "postgres"
volume:
size: 2Gi
numberOfInstances: 2
users:
demouser: # database owner
- superuser
- createdb
databases:
demo: demouser # dbname: owner
postgresql:
version: "14"
enableConnectionPooler: true
enableMasterPoolerLoadBalancer: true
allowedSourceRanges: # load balancers' source ranges for both master and replica services
- 192.168.0.0/16
connectionPooler:
numberOfInstances: 2
mode: "transaction"
schema: "pooler"
user: "pooler"
resources:
requests:
cpu: 300m
memory: 100Mi
limits:
cpu: "1"
memory: 100Mi
This will create us a Postgres cluster consisting of two replicas and also two PgBouncer pods which point to the primary Postgres cluster node. With the envableMasterPoolerLoadbalancer
parameter configured, a Kubernetes service of type Loadbalancer
will be created which gives you the accessibility to your Postgres / PgBouncer from outside the Kubernetes cluster.
See here for a complete example of the postgresql
configuration options.
Drawbacks
- Zalando has decided in its implementation of PgBouncer, to use a function within each database to handle authentication of PgBouncer. The Zalando Operator will take care that this function is created on every database which the Zalando Operator manages. But if you create or have created a database “outside” of the Zalando Operator (e.g. using
CREATE DATABASE...
), the function will not be created. You have to create it on your own. Without this function, the configuredauth_query
can not be executed, hence, no authentication over PgBouncer is possible. A better option in my opinion would have been, to create a dedicatedpooler
database with this respective function and using the PgBouncer config parameterauth_database
to always use this database for handling the authentication requests. - By default, Zalando configures PgBouncer to use the
transaction
pool mode. This leaves prepared statements not working. A client which relies on prepared statements (e.g. PGAdmin uses it) will not work correctly using the PgBouncer connection pool. You may alter the pool mode tostatement
, but then still, you cannot configure theserver_reset_query
parameter in PgBouncer, since this is no configureable option in the Zalando PgBouncer image. See here for more details.
Philip