WAL-G is a widely used backup tool for Postgres databases. One of its features is backing up to S3 storage out of the box. It can also encrypt those backups and WAL archives with several encryption methods (e.g. libsodium). Since WAL-G is included in the Zalando Spilo image and is therefore deployed by default with the Zalando Postgres Operator on Kubernetes, I took a deeper look at how to get encrypted backups working there, because this is not well documented.
WAL-G supports several encryption methods, which are described in its GitHub repository. Everything is controlled by environment variables. I decided to use libsodium to encrypt my backups since it looked like the easiest option to me. The following instructions build on my older guide on how to set up WAL-G backups with the Zalando Postgres Operator. If you haven't set up backups yet or are not familiar with the settings I made, it is best to read that article first.
As you can read in the WAL-G documentation, libsodium is set up through the environment variables WALG_LIBSODIUM_KEY / WALG_LIBSODIUM_KEY_PATH and WALG_LIBSODIUM_KEY_TRANSFORM. With the first two, you either pass the key used to encrypt and decrypt your backups directly (WALG_LIBSODIUM_KEY) or specify the path to a file containing the key (WALG_LIBSODIUM_KEY_PATH). The last variable configures the format in which the key is given (base64, hex or none). I use a base64-encoded key here, so generating a random key is as easy as running openssl rand -base64 32. This produces 32 random bytes, encoded as a 44-character base64 string, which we can provide to WAL-G.
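Before putting the key into a config, it can be worth checking that it really decodes to 32 bytes, which is the key size the WAL-G documentation names for libsodium. The following is a minimal sketch, assuming openssl and a base64 tool that accepts -d are installed; decoded_len is a hypothetical helper name, not part of WAL-G.

```shell
#!/bin/sh
# Hypothetical helper: print the number of bytes a base64 string decodes to.
decoded_len() {
    printf '%s' "$1" | base64 -d | wc -c | tr -d ' '
}

# Only generate a key if openssl is available on this machine.
if command -v openssl >/dev/null 2>&1; then
    # 32 random bytes, printed as a 44-character base64 string.
    key="$(openssl rand -base64 32)"

    if [ "$(decoded_len "$key")" -eq 32 ]; then
        echo "key decodes to 32 bytes"
    else
        echo "unexpected key length" >&2
    fi
fi
```

The same helper can be pointed at an existing key to confirm that it matches the base64 transform setting.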
As you might know, there are several ways to specify environment variables in an environment managed by the Zalando Postgres Operator. In the article mentioned above, I use a configmap to store all backup and restore related environment variables. The Zalando Operator attaches these variables to every pod it creates. Every WALG*-prefixed variable is added to an envdir, which in turn is used for every call of the wal-g binary. If you want a specific key for each Postgres cluster, you can also add the required environment variables to each postgresql CR under the spec.env section. To keep this article simple, I will stick with the configmap called pod-config to set my env vars.
apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-config
data:
  ...
  WALG_LIBSODIUM_KEY: kPnFWqPFSdTapmN1J36Y4HNDGfEvpDPbUxrM3c2Yfic= # generated by "openssl rand -base64 32"
  WALG_LIBSODIUM_KEY_TRANSFORM: base64
  ...
Apply your configmap to your cluster and wait for the Zalando Operator to resync the running clusters. When you shell into the leader pod, you should find the two specified env vars in the directory /run/etc/wal-e.d/env/. All backups created from this point on will be encrypted. You can double-check this by downloading any lz4-compressed file from your S3 bucket and trying to decompress it. You will end up with the following error:
Unrecognized header : file cannot be decoded
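A lighter variant of the same check is to look at the first four bytes of the downloaded file. An unencrypted lz4 frame starts with the magic number 0x184D2204 (stored little-endian as 04 22 4d 18), while an encrypted file should not. This is only a sketch: is_lz4_frame is a hypothetical helper name, and the file path is whatever you downloaded from your bucket.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the lz4 frame magic number.
is_lz4_frame() {
    # Read the first four bytes and print them as one lowercase hex string.
    magic="$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')"
    [ "$magic" = "04224d18" ]
}

# Usage: pass the path of a file downloaded from the bucket as first argument.
if [ -n "${1:-}" ]; then
    if is_lz4_frame "$1"; then
        echo "$1: plain lz4 frame, not encrypted"
    else
        echo "$1: no lz4 header (encrypted, or not an lz4 file)"
    fi
fi
```

A missing lz4 header does not prove that the file is encrypted with the right key; it only shows that the content is no longer a plain lz4 stream.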
So far so good. But you probably want to restore those encrypted backups using the Zalando Operator and WAL-G, right? As you can read in my article, most of the environment variables you specify to enable backups also need to be specified with a CLONE_* prefix to get a restore working. So add the following variables to the pod-config configmap as well:
apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-config
data:
  ...
  CLONE_WALG_LIBSODIUM_KEY: kPnFWqPFSdTapmN1J36Y4HNDGfEvpDPbUxrM3c2Yfic= # generated by "openssl rand -base64 32"
  CLONE_WALG_LIBSODIUM_KEY_TRANSFORM: base64
  ...
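Because the restore settings mirror the backup settings, the CLONE_ entries can be derived mechanically from the existing ones. Here is a minimal sketch, assuming the data entries are piped in as NAME: value lines; clone_vars is a hypothetical helper name and the sample values are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME: value" lines from stdin and print a
# CLONE_-prefixed copy of every WALG_LIBSODIUM_* entry, keeping indentation.
clone_vars() {
    sed -n 's/^\([[:space:]]*\)\(WALG_LIBSODIUM_[A-Z_]*:\)/\1CLONE_\2/p'
}

# Example input; the values are placeholders, not real settings.
clone_vars <<'EOF'
WALG_LIBSODIUM_KEY: <your-base64-key>
WALG_LIBSODIUM_KEY_TRANSFORM: base64
SOME_OTHER_SETTING: untouched
EOF
```

The example prints only the two libsodium entries with the CLONE_ prefix added; other lines are skipped, so the output can be pasted below the existing entries.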
Now you are ready to restore encrypted backups as a clone as well. Test it by following the article mentioned above. If you haven't specified the correct key, you will end up with an error message like wrong magic number in your Postgres pod logs.
Philip