Skip to content

Handle this – PostgreSQL PgBouncer setup with Zalando Postgres Operator

The Zalando Postgres Operator has a built in possibility to maintain a PgBoucer setup for your Postgres cluster. But what is PgBouncer and why should you consider using it?

PgBouncer is a so called connection pooler for Postgres databases. This means your application / client can connect to it, like if it is a Postgres database. PgBouncer will then create or manage the actual connection to the Postgres database. This brings a major benefit, creating a Postgres connection is a quite time consuming action and when your application does several hundret connects and disconnects per minute, it will cost you quite some time. Another upside when running PgBouncer on a Kubernetes cluster alongside a Zalando Operator managed Postgres cluster is, that you can use PgBouncer running behind and act itself as a loadbalancer (at least for your replica Postgres instances). One last nice thing is, that you can inject your signed SSL certificates not only to your Postgres Kubernetes pod, but also to the PgBouncer pod(s) that the operator will create. Meaning, PgBouncer will handle SSL connections for you and use your valid certificates.

How it works

Let dive a bit deeper in the work principals of PgBouncer, especially when using it with the Zalando Operator or the Zalando PgBouncer container image respectively. The Zalando PgBouncer image has some logic implemented on how some PgBouncer parameters are calculated or can be configured. If this hinders you, then you still can use another PgBouncer container image, but this is not covered in this article.

Let’s have a look into some of the parameters that deviate from PgBouncer default when the Zalando Operator manages your pooler pods.

  • pool_mode: There are three pool modes in PgBouncer. session, transaction and statement. It controls how PgBouncer manages your connection pools. The Zalando default is transaction which means, a database session is opened by PgBouncer (if none exist), a transaction will run on the database. When it’s finished up, the connection will be left intact to the Postgres database but will be marked as free and go back into the PgBouncer connection pool. Another client connection then can use the same (existing) database connection from the pool, to run another transaction and so on. Pools are managed as one pool per user / database pair.
  • default_pool_size: The Zalando Operator calculates this parameter value as follows. It takes the operator parameter connection_pooler_max_db_connections (default 60), divides it by the number of PgBouncer instances (default 2) and divides it again by 2. So if you leave everything by default, you will end up with a default_pool_size of 15. It controls how many server connections are allowed per user/database pair (pool).
  • min_pool_size: This parameter is explicitly commented out in the pgbouncer.ini template of the container image. It seems as there was some fiddling with it in the past and there might be some more in the future. That’s why I mentioned it even if the PgBouncer default of 0 applies now. It calculated as default_pool_size divided by 2 by the Zalando Operator, but as mentioned, without effect.
  • max_client_conn: This value is hard coded with 10000. This means, a PgBouncer instance (so one pod) can not handle more than 10000 client connections. This is more or less to be seen as “unlimited”, since the resource sizing of the PgBouncer pods is quite limited and all sources I’ve checked recommend to “scale out” to a higher number of pods.
  • max_db_connections: The Operator calculates this as connection_pooler_max_db_connections (default 60), divides it by the number of PgBouncer instances (default 2). It controls the limit that PgBouncer will use as per database connection limit. So independent from the pool(s), there can’t be more Postgres connections opened.

A picture tells more than thousand words, so I drawed one for you which should show you, how PgBouncer works when running in transaction pool mode:

Authentication

Since PgBouncer “intercepts” the connection between a client and the Postgres instance, it has to handle also the authentication. There are several parameters and configuration options you can use with PgBouncer by default. Zalando uses a database user called pooler as auth_user and a custom SECURITY DEFINER function to check if the requested client connection has a valid user / password combination to logon to the database.

Configuration

This is really the easy part of the story. Assuming that you are fine with the Postgres Operator default configuration (you can configure them by the postgres-operator configmap), you can start creating a postgresql CRD which can look like this:

postgresql.yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: postgres-demo-cluster-pooler
  namespace: postgres
spec:
  teamId: "postgres"
  volume:
    size: 2Gi
  numberOfInstances: 2
  users:
    demouser:  # database owner
    - superuser
    - createdb
  databases:
    demo: demouser  # dbname: owner
  postgresql:
    version: "14"
  enableConnectionPooler: true
  enableMasterPoolerLoadBalancer: true
  allowedSourceRanges:  # load balancers' source ranges for both master and replica services
  - 192.168.0.0/16
  connectionPooler:
    numberOfInstances: 2
    mode: "transaction"
    schema: "pooler"
    user: "pooler"
    resources:
      requests:
        cpu: 300m
        memory: 100Mi
      limits:
        cpu: "1"
        memory: 100Mi

This will create us a Postgres cluster consisting of two replicas and also two PgBouncer pods which point to the primary Postgres cluster node. With the envableMasterPoolerLoadbalancer parameter configured, a Kubernetes service of type Loadbalancer will be created which gives you the accessibility to your Postgres / PgBouncer from outside the Kubernetes cluster.

See here for a complete example of the postgresql configuration options.

Drawbacks

  • Zalando has decided in its implementation of PgBouncer, to use a function within each database to handle authentication of PgBouncer. The Zalando Operator will take care that this function is created on every database which the Zalando Operator manages. But if you create or have created a database “outside” of the Zalando Operator (e.g. using CREATE DATABASE...), the function will not be created. You have to create it on your own. Without this function, the configured auth_query can not be executed, hence, no authentication over PgBouncer is possible. A better option in my opinion would have been, to create a dedicated pooler database with this respective function and using the PgBouncer config parameter auth_database to always use this database for handling the authentication requests.
  • By default, Zalando configures PgBouncer to use the transaction pool mode. This leaves prepared statements not working. A client which relies on prepared statements (e.g. PGAdmin uses it) will not work correctly using the PgBouncer connection pool. You may alter the pool mode to statement, but then still, you cannot configure the server_reset_query parameter in PgBouncer, since this is no configureable option in the Zalando PgBouncer image. See here for more details.

Philip

Leave a Reply

Your email address will not be published. Required fields are marked *