Categories
Kubernetes

Migrating from Zalando’s postgres-operator to CloudNativePG

I recently migrated from Zalando’s postgres-operator to CloudNativePG. The process was complicated and poorly documented, enough so that I felt it was worth a post (my first post in nearly six years!).

Motivation

When I adopted postgres-operator in late 2023/early 2024, it was one of the most popular operators, and it met my requirements (whatever they were; I was not good at taking notes for personal projects). Things got a bit rocky in 2025, however. First, Immich dropped its Postgres subchart, and I had to figure out how to migrate the data and use postgres-operator. Then Immich migrated to VectorChord, which was another complicated thing I had to figure out how to do with postgres-operator. Additionally, release 1.15.0 shipped with missing images that took months to fix, and my backup process was not running during that time because of them. It was not clear if I could just downgrade the operator or not.

Given all the difficulties I was experiencing, I decided to reconsider my use of the operator. CloudNativePG has had a pretty impressive growth trajectory, and lots of folks seem to be using it to self-host Immich. In the discussions around both of the Immich issues I had to navigate, many of those folks were sharing fixes for CloudNativePG, which would have made my life a lot easier at the time. I pretty quickly identified that it met my needs and decided to try it out. Now I just had to figure out how to migrate some of my databases over.

Migration

I did this in a few more steps than I probably had to, but the piecemeal approach was easier for me to reason about (and let me migrate each database whenever I had time).

There were two resources I found to be incredibly useful while navigating this (in addition to CloudNativePG’s own documentation):

Custom Image

The postgres image that postgres-operator uses has two extensions included and enabled by default: pgaudit/set_user and powa-team/pg_stat_kcache. In order for CloudNativePG to be able to clone from my existing cluster, I needed those two extensions installed. I built an image based on CloudNativePG’s postgres-17 image with both extensions added; this image is what goes in spec.imageName in the Cluster resource.
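
A minimal sketch of such an image looks something like this (the PGDG package names are assumptions, and the resulting tag is just an example):

# Hypothetical sketch of the custom image build; package names are assumptions.
cat > Dockerfile <<'EOF'
FROM ghcr.io/cloudnative-pg/postgresql:17
USER root
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        postgresql-17-set-user \
        postgresql-17-pg-stat-kcache && \
    rm -rf /var/lib/apt/lists/*
# CloudNativePG images run as the postgres user (uid 26)
USER 26
EOF
docker build -t ghcr.io/example/zalando-cnpg-migration:17 .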

Cloned Cluster

For each migration, I started with a throw-away CloudNativePG cluster that would import from the existing database. Before I did anything, I scaled the workload down so that no data would change in the database.
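
As a rough example of that scale-down (assuming the application runs as a Deployment named immich-server in an immich namespace; both names are hypothetical):

# Stop the application so nothing writes to the database during the import.
kubectl -n immich scale deployment immich-server --replicas=0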

---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: clone-cluster
spec:
  bootstrap:
    initdb:
      database: imported-database
      import:
        type: microservice
        databases:
          - imported-database
        source:
          externalCluster: to-be-cloned
      owner: imported-database-owner
  enablePDB: false
  externalClusters:
    - connectionParameters:
        host: postgres-operator-postgresql
        user: imported-database_owner_user
        dbname: imported-database
      name: to-be-cloned
      password:
        key: password
        name: imported-database-owner-user.postgres-operator-postgresql.credentials.postgresql.acid.zalan.do
  imageName: ghcr.io/sdwilsh/zlando-cnpg-migration:17-latest@sha256:fa4c3afd2bb178791ea1c5b680be30a12a77370e36a9ccf94593ccb835061862
  instances: 1
  postgresql:
    shared_preload_libraries:
      - set_user
      - pg_stat_statements
      - pg_stat_kcache
  primaryUpdateStrategy: unsupervised
  storage:
    size: 10Gi
    storageClass: cnpg-data-encrypted-storage

When using this, I would replace the following:

  • imported-database with the name of the database I was migrating.
  • imported-database-owner with the username of the owner of the new database.
  • postgres-operator-postgresql with the name of the postgresql resource defining the cluster managed by postgres-operator.
  • imported-database_owner_user with the name of the owner user of the cluster managed by postgres-operator.

I would run watch kubectl cnpg status clone-cluster (with the cnpg kubectl plugin installed) until the status was reported as “Cluster in healthy state”. Then I would run kubectl exec -it pod/clone-cluster-1 -- psql -U postgres -d imported-database and execute the following SQL to remove the use of the two extensions:

DROP VIEW IF EXISTS metric_helpers.table_bloat;
DROP VIEW IF EXISTS metric_helpers.pg_stat_statements;
DROP VIEW IF EXISTS metric_helpers.index_bloat;
DROP VIEW IF EXISTS metric_helpers.nearly_exhausted_sequences;
DROP FUNCTION IF EXISTS user_management.terminate_backend(pid integer);
DROP FUNCTION IF EXISTS user_management.revoke_admin(username text) ;
DROP FUNCTION IF EXISTS user_management.random_password(length integer);
DROP FUNCTION IF EXISTS user_management.drop_user(username text);
DROP FUNCTION IF EXISTS user_management.drop_role(username text);
DROP FUNCTION IF EXISTS user_management.create_user(username text);
DROP FUNCTION IF EXISTS user_management.create_role(rolename text);
DROP FUNCTION IF EXISTS user_management.create_application_user_or_change_password(username text, password text);
DROP FUNCTION IF EXISTS user_management.create_application_user(username text);
DROP FUNCTION IF EXISTS metric_helpers.pg_stat_statements(showtext boolean);
DROP FUNCTION IF EXISTS metric_helpers.get_nearly_exhausted_sequences(double precision);
DROP FUNCTION IF EXISTS metric_helpers.get_table_bloat_approx(OUT t_database name, OUT t_schema_name name, OUT t_table_name name, OUT t_real_size numeric, OUT t_extra_size double precision, OUT t_extra_ratio double precision, OUT t_fill_factor integer, OUT t_bloat_size double precision, OUT t_bloat_ratio double precision, OUT t_is_na boolean) ;
DROP FUNCTION IF EXISTS metric_helpers.get_btree_bloat_approx(OUT i_database name, OUT i_schema_name name, OUT i_table_name name, OUT i_index_name name, OUT i_real_size numeric, OUT i_extra_size numeric, OUT i_extra_ratio double precision, OUT i_fill_factor integer, OUT i_bloat_size double precision, OUT i_bloat_ratio double precision, OUT i_is_na boolean);
DROP EXTENSION IF EXISTS set_user;
DROP EXTENSION IF EXISTS pg_stat_kcache;
DROP EXTENSION IF EXISTS pg_stat_statements;
DROP SCHEMA IF EXISTS user_management;
DROP SCHEMA IF EXISTS metric_helpers;

Destination Cluster

This cluster is meant to be the final resting place of the data. It is very likely I could have just changed the imageName in the cloned cluster, but I found the extra step marginally useful enough to keep doing it (especially for the times I forgot to drop the extensions because the import took a while).

The important part here is adding spec.bootstrap.initdb.postInitApplicationSQL. Most of my applications did not specify which schema to use, and they seemed to end up writing all their data to the data schema. That schema is not part of the default search path in the CloudNativePG cluster, so at first glance it looked like all the data had been dropped (scary!). Since the use of the extensions was already removed, this cluster is just a clone of the previously created clone-cluster using the postgres-17 image from CloudNativePG.

---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: destination-pg-cluster
spec:
  backup:
    target: primary
  bootstrap:
    initdb:
      database: imported-database
      import:
        type: microservice
        databases:
          - imported-database
        source:
          externalCluster: clone-cluster
      owner: imported-database-owner
      postInitApplicationSQL:
        - ALTER USER imported-database-owner SET search_path TO "$user", data, public
  externalClusters:
    - connectionParameters:
        host: clone-cluster-rw
        user: imported-database-owner
        dbname: imported-database
      name: clone-cluster
      password:
        key: password
        name: clone-cluster-app
  imageName: ghcr.io/cloudnative-pg/postgresql:17@sha256:b473a9c10debd74827c2966632012c84e335ccacff1d87da4ad4facf96a62e21
  instances: 1
  storage:
    size: 10Gi
    storageClass: cnpg-data-encrypted-storage

When using this, I would replace the following (not repeating the things stated in the “Cloned Cluster” section):

  • destination-pg-cluster with the name of the cluster we want to use.

Once this cluster is healthy, you can safely drop the spec.bootstrap.initdb.import and spec.externalClusters sections and adjust spec.instances to whatever size you want. It is also now safe to delete the clone-cluster and point your application at the new database. The Secret provided by CloudNativePG contains quite a bit of information, so things like the URI, database name, and service name can be injected from it instead of being hard-coded.
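
For example, the generated secret should be named <cluster-name>-app (so destination-pg-cluster-app here) and should contain keys such as username, password, dbname, host, port, and uri:

# List the keys and values available in the generated secret.
kubectl get secret destination-pg-cluster-app -o jsonpath='{.data}'
# Decode the full connection URI rather than hard-coding it in the application.
kubectl get secret destination-pg-cluster-app -o jsonpath='{.data.uri}' | base64 -d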

Categories
Technology

Network Booting a Raspberry Pi 4 with an iSCSI Root via FreeNAS

If you are interested in booting your Raspberry Pi 4 without local storage, this guide can help you accomplish it. While most tutorials cover how to do this with NFS, this one uses iSCSI. Additionally, with the latest firmware update for the Pi 4 (as of 2020-04-16), setting up network booting is much simpler: older guides required modifying the local DHCP server or spinning up a proxy DHCP server to get this working.

Context

I decided that I wanted to get the Raspberry Pi systems I have scattered about into a more maintainable state. I have read countless tales of SD card failures with Pis, and if I’m being honest, I don’t have a good backup strategy for them. I do, however, have a FreeNAS machine, with plenty of space. That got me thinking about how I could utilize that to solve the problem.

Thanks

This post by XLAB helped point me in the right direction, but I found the Pi 4 is sufficiently different from the Pi 3 that it did not work verbatim. My friend Sharon W. helped out a bunch by copy-editing this post.

Requirements

  • Raspberry Pi 4
  • SD card for the Pi
  • FreeNAS (screenshots from 11.3, but older versions probably work)
  • Linux machine (I spun up an Ubuntu Server 20.04 VM for this on Hyper-V)

I found it valuable to have a monitor attached to the Pi when debugging why something was not working correctly. You can get through this guide without one, however.

Assumptions

This guide has some assumptions baked in. These things are true in my environment, but may not be in yours. These steps may work without these assumptions, but I have not tested them. Most of these are easy to work around by changing netboot-pi-config.json, which will be introduced below.

  • You will use a wired connection (via eth0) to connect to the Pi
  • You will statically assign an IP address to the Pi
  • You have IPv6 set up on the local network
  • Your local timezone is US/Pacific
  • You have a local NTP server
  • You are comfortable with public key authentication set up for sshing into the root account on the Pi

Setting up FreeNAS

TFTP Server

I created a new user and group, tftp, and set the permissions on the directory I planned to share accordingly (owned by root, with the group being tftp). The FreeNAS documentation may be useful when setting this up. Here is how I configured this:

NFS Server

In order to share the TFTP folder for the Pi, which contains the contents of the /boot folder, NFS sharing will also need to be set up. Using NFS like this allows the Pi to mount the folder under /boot, so that any updates made there are reflected on the TFTP server for the next boot.

This step does require that the MAC address of the Pi is known. If it is not already known, skip this for now and come back to it when it is collected off of the Pi later in the guide.

The FreeNAS documentation may be useful, especially if you plan to deviate from this guide. Here is how I configured this:


Note: after everything is set up, you can add the hostname to restrict access to this folder to just the Pi. While doing the setup, however, the Ubuntu server will also need access to the NFS share.

iSCSI Server

FreeNAS now has a wizard that makes this easy to set up. I entered this information into the wizard, and it set everything up correctly:

Building a Custom Image of Raspberry Pi OS (32-bit)

I plan to do this with a bunch of Pis, so I spent the time setting up Packer, along with a plugin to support arm images to make generating an image much easier and repeatable. In my Ubuntu Server VM, I created this config file:

Then, I installed the required packages, Packer, and the arm image builder plugin.
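
A rough sketch of that setup, assuming the mkaczanowski/packer-builder-arm plugin and a Packer release from around that time (package names and the version are examples, not the exact ones I used):

# Build-host prerequisites (package names are assumptions).
sudo apt-get update
sudo apt-get install -y git golang qemu-user-static kpartx unzip
# Install Packer from the official release zip (version is just an example).
curl -fsSLo packer.zip https://releases.hashicorp.com/packer/1.5.6/packer_1.5.6_linux_amd64.zip
sudo unzip packer.zip -d /usr/local/bin
# Build the arm image builder plugin.
git clone https://github.com/mkaczanowski/packer-builder-arm.git
(cd packer-builder-arm && go mod download && go build)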

There are many environment variables that will be used throughout this guide on the Ubuntu Server. Be sure to update them to reflect the local environment.
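
Here is an example of what those variables might look like; every value below is a placeholder for my environment, and the later snippets in this guide assume these names:

# Placeholder values; adjust for the local environment.
export FREENAS_IP="192.168.1.10"                       # FreeNAS host serving TFTP/NFS/iSCSI
export TFTP_ROOT="/mnt/tank/tftp"                      # dataset path shared over TFTP and NFS
export PI_MAC="dc-a6-32-00-00-00"                      # collected from the Pi later in the guide
export ISCSI_TARGET="iqn.2005-10.org.freenas.ctl:pi4"  # target created by the iSCSI wizard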

Setting up the Pi 4

Installing

I grabbed the image (located in output-arm-image/image) with WinSCP, flashed it to an SD card with BalenaEtcher, placed the card in the Pi, and powered it up. If a monitor is attached, ssh in once the Pi enables sshd. Otherwise, just wait a few minutes and then connect.

Before moving on, collect the MAC address from the Pi.
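
One way to grab it, run on the Pi itself:

# Print the MAC address of the wired interface.
cat /sys/class/net/eth0/address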

This information will be needed back on the imaging server. If the NFS step was skipped above because it was not yet known, go back and set the NFS share up as well.

Updating the EEPROM

The Pi 4 uses an SPI-attached EEPROM to boot the system instead of the bootcode.bin file that older Pi models used (more can be read about it in the EEPROM documentation). The firmware and configuration on the Pi will need to be updated to set up network booting. The documentation for these bootloader settings covers many more options than are used here. To see the current configuration, run something like the following on the Pi:
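
# Show the bootloader release currently installed and whether an update is available.
sudo rpi-eeprom-update
# Show the configuration embedded in the current bootloader.
vcgencmd bootloader_config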

When I wrote this, pieeprom-2020-04-16.bin was the current stable release. Be sure to check for newer stable releases, and then reference the EEPROM documentation to see if any additional settings should be set. This guide relies on features that only became available in pieeprom-2020-04-16.bin.
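
A sketch of applying a network-boot configuration using the rpi-eeprom tooling of that era; the firmware path and the BOOT_ORDER/TFTP_IP values are examples to adapt, not the exact settings I used:

# Copy the stable release and dump its embedded configuration.
cp /lib/firmware/raspberrypi/bootloader/stable/pieeprom-2020-04-16.bin pieeprom.bin
rpi-eeprom-config pieeprom.bin > bootconf.txt
# Example settings: try the SD card first, then the network, and point TFTP at FreeNAS.
echo "BOOT_ORDER=0x21" >> bootconf.txt
echo "TFTP_IP=192.168.1.10" >> bootconf.txt
# Build a new image with the updated config and flash it; reboot for it to take effect.
rpi-eeprom-config --out pieeprom-netboot.bin --config bootconf.txt pieeprom.bin
sudo rpi-eeprom-update -d -f ./pieeprom-netboot.bin
sudo reboot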

Setting up the TFTP Environment

The initramfs has to be created, and then the entire /boot folder copied over to the TFTP server. Be sure to update the TFTP_ROOT environment variable to be the path the TFTP server serves files from on the FreeNAS machine.
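
A rough sketch of that step, run on the Pi; the NFS mount point and the config.txt initramfs line are my assumptions about how the pieces fit together:

# Build an initramfs for the running kernel and tell the firmware to load it.
# (This assumes open-iscsi is installed and /etc/iscsi/iscsi.initramfs exists,
# so that iSCSI support is baked into the initramfs.)
sudo update-initramfs -v -c -k "$(uname -r)"
echo "initramfs initrd.img-$(uname -r) followkernel" | sudo tee -a /boot/config.txt
# Mount the FreeNAS share that backs the TFTP root and copy /boot into it.
sudo mkdir -p /nfs/boot
sudo mount -t nfs "${FREENAS_IP}:${TFTP_ROOT}" /nfs/boot
sudo rsync -av /boot/ /nfs/boot/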

It is worth noting that the call to update-initramfs is tied to the currently running kernel version. As a result, future updates to the kernel will not be reflected in the initramfs without running this command again. I have not had to handle that yet, and it appears that this will get easier to manage in the future. It is an exercise for the reader to tackle this problem, and this Stack Exchange thread has some solutions.

Power down the Pi, and remove the SD card.

Setting up the iSCSI Device

Back on the Ubuntu Server where the image was built for the Pi, the iSCSI device the Pi will use as its root device can now be set up.

Connecting to the iSCSI Device

This code relies on environment variables that were set in previous steps on this machine, so if this is a new shell, be sure to copy and paste those in as well.
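
A sketch of the discovery and login with open-iscsi (the initiator tooling is an assumption; FREENAS_IP and ISCSI_TARGET come from the environment variables above):

# Install the initiator tools, discover the target on FreeNAS, and log in.
sudo apt-get install -y open-iscsi
sudo iscsiadm --mode discovery --type sendtargets --portal "${FREENAS_IP}"
sudo iscsiadm --mode node --targetname "${ISCSI_TARGET}" --portal "${FREENAS_IP}" --login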

I am unaware of a programmatic way to determine what device maps to the iSCSI connection that was just made. It might be /dev/sdb or something under /dev/mapper/. Use lsblk --output NAME,KNAME,TYPE,SIZE,MOUNTPOINT to help figure out which device actually represents the iSCSI device.

Creating and Populating the Root Partition

Creating the new partition for the Pi is fairly straightforward. This will create a single partition taking up the entire device. Be sure to update the ISCSI_DEVICE environment variable with the proper device from the previous step.
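
Something along these lines (ISCSI_DEVICE is whatever device lsblk identified above, for example /dev/sdb):

# Create a single primary partition spanning the whole iSCSI device.
sudo parted --script "${ISCSI_DEVICE}" mklabel msdos mkpart primary ext4 0% 100%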

A new device will be created for the partition. Depending on the original device, it could be /dev/sdb1 or something like /dev/mapper/mpathb-part1. Be sure to update the ISCSI_ROOT_PARTITION environment variable with the proper device for the partition.
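
A sketch of formatting and populating it; where the Packer-built root filesystem is mounted (here /mnt/image-root) is an assumption:

# Format the new partition and copy the root filesystem from the built image.
sudo mkfs.ext4 "${ISCSI_ROOT_PARTITION}"
sudo mkdir -p /mnt/iscsi-root /mnt/image-root
sudo mount "${ISCSI_ROOT_PARTITION}" /mnt/iscsi-root
# (Loop-mount the second partition of the Packer-built image at /mnt/image-root first.)
sudo rsync -aHAX /mnt/image-root/ /mnt/iscsi-root/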

Updating /etc/fstab

/etc/fstab needs to be updated to properly mount /boot and / when using the iSCSI device. Be sure to update the PI_MAC environment variable (that was taken from the Pi earlier) as well as the TFTP_ROOT environment variable.
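
As a heavily hypothetical sketch of what the entries end up looking like (the NFS path layout and the PARTUUID are placeholders that have to match the local setup):

# Append the new mounts to the fstab on the iSCSI root filesystem.
cat <<EOF | sudo tee -a /mnt/iscsi-root/etc/fstab
${FREENAS_IP}:${TFTP_ROOT}/${PI_MAC}  /boot  nfs   defaults,vers=3   0  2
PARTUUID=xxxxxxxx-01                  /      ext4  defaults,noatime  0  1
EOF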

Update PARTUUID on the TFTP Server

The last step is updating the PARTUUID that is in the cmdline.txt file that the Pi boots with (from the TFTP server) to match the one in the new partition that was created on the iSCSI device.
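
A sketch of that, assuming the TFTP share is also mounted at /nfs/boot on this machine:

# Read the PARTUUID of the new root partition and drop it into cmdline.txt.
NEW_PARTUUID=$(sudo blkid -s PARTUUID -o value "${ISCSI_ROOT_PARTITION}")
sudo sed -i "s/PARTUUID=[^ ]*/PARTUUID=${NEW_PARTUUID}/" /nfs/boot/cmdline.txt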

Booting

After making sure the SD card has been removed from the Pi, it should boot from the network once power is turned on!

Debugging

If things are not working, I strongly suggest hooking up a monitor to the Pi to verify configuration files or to see which step of the boot process is failing.

TFTP Connections

To see if the Pi is even talking to the TFTP server, check the request logs by running tail -f /var/log/xferlog on the FreeNAS machine. To see the raw traffic to the TFTP server, run tcpdump -vv -i igb0 port 69 (updating igb0 with the network interface used by the TFTP server).

iSCSI Connections

To see if the Pi is even talking to the iSCSI server, check the raw network traffic on the port by running tcpdump -vv -i igb0 port 3260. If there is more than one device connecting to the server, add a host filter.

Categories
Mozilla

Building Chef DK on FreeBSD 11.0

With the help of tBunnyMan’s post, I managed to get the Chef DK running inside a jail on my FreeBSD box.

After you’ve done your initial setup in the jail, you’ll want to also set up sudo in your jail and allow anybody in the wheel group to have password-less sudo (you can modify the file by hand if you want to see what it’s doing):

# pkg install sudo
# sed -ie 's/#\(%wheel ALL=\)/\1/' /usr/local/etc/sudoers

Now, create the user that will run the setup:

# adduser
Username: chef
Full name: chef
Uid (Leave empty for default):
Login group [chef]:
Login group is chef. Invite chef into other groups? []: wheel
Login class [default]:
Shell (sh csh tcsh nologin) [sh]:
Home directory [/home/chef]:
Home directory permissions (Leave empty for default):
Use password-based authentication? [yes]:
Use an empty password? (yes/no) [no]:
Use a random password? (yes/no) [no]: yes
Lock out the account after creation? [no]:
Username   : chef
Password   : 
Full Name  : chef
Uid        : 1001
Class      :
Groups     : chef wheel
Home       : /home/chef
Home Mode  :
Shell      : /bin/sh
Locked     : no
OK? (yes/no): yes

We need an older version of devel/gecode, so now we have to downgrade it. This step will take a while if you have a CPU that isn’t very fast.


# su chef
# cd ~
# sudo pkg install portdowngrade
# sudo portdowngrade devel/gecode r345033
# cd gecode
# sudo make deinstall install clean

We are not yet done with gecode, however. A pull request to dep_selector added a dependency on GECODE_VERSION_NUMBER, which isn’t properly defined in /usr/local/include/gecode/support/config.hpp, so we have to fix it.

# sudo sed -ie 's/\(#define GECODE_VERSION_NUMBER\)\s*/\1 300703/' /usr/local/include/gecode/support/config.hpp

Almost there! Now we can install our other dependencies and check out the git repo.

# cd ~
# sudo pkg install ruby rubygem-bundler git
# git clone https://github.com/chef/chef-dk.git
# cd chef-dk
# USE_SYSTEM_GECODE=1 bundle install --without development

This will at least let you build the Chef DK. As I go further down this rabbit hole, I may end up putting up more posts on how I got Chef set up on FreeBSD.

Categories
Technology

UniFi Controller on DreamHost VPS

I recently purchased a UniFi UAP-PRO for my home wireless. I chose it because it is commercial-grade hardware with good management software for a (comparatively) low price. It then occurred to me that I could take advantage of my DreamHost VPS that I barely use to host the controller software so I don’t need to bother having it on any of my local computers. The EdgeRouter Lite makes it trivial to automatically point your access points to a place in the cloud with a given IP address, so the hardest part was going to be getting the software running on my VPS.

Once I got on a newer version of DreamHost’s VPS offering (I was on something running Debian 5 before I switched to one running Ubuntu 12.04), I had a bit of a rocky start. Some instructions I found online were outdated and had me install a very old version of the controller software. I was trying to import the settings I had done on my local controller so I didn’t have to set everything up again, and that import process wasn’t going to work out with that old controller software. I’ve got it working now, so I wanted to share the steps that worked for me so hopefully nobody else has to go through the pains I did.

Step One: Get a newer version of MongoDB

We’ll want to get a newer version than what is installed by default, so simply follow the instructions from MongoDB (version 2.4).
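
For reference, the 10gen-era apt instructions looked roughly like this (repository details here are from memory and worth double-checking against MongoDB’s own docs):

# Add the 10gen/MongoDB apt repository and install the 2.4-series server.
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | \
    sudo tee /etc/apt/sources.list.d/mongodb.list
sudo apt-get update
sudo apt-get install -y mongodb-10gen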

Step Two: Follow the release instructions to install the controller

As of this writing, 4.6.6 is the latest version. In the announcement thread for that version, search for “UniFi Controller APT howto”, and follow those instructions (skipping step two since we did that in step one from this blog post).

Step Three: Load our controller and import our config

I exported my local controller’s config (Settings -> Maintenance -> Download Backup Settings) before doing this next step. When we navigate to our server’s address (over https on port 8443), we’re given the option to import a config. Once we’ve imported it, the service will restart, and then we’ll be able to point our access points to our controller. Note: we can also create a completely new config.

Step Four: Set the Controller Hostname/IP

The last step is to open the Settings pane, click the Controller tab, and enter the hostname or IP address of our controller.

Categories
Mozilla Personal

Social Plugins’ Memory Usage

Dietrich recently posted about the memory usage of social plugins, and I found the results rather surprising because, at least in the case of Facebook, I didn’t think it ever loaded enough code to consume 20+MB of memory.

When I first learned about social plugins, I thought they were a really cool idea with a lot of potential. If they use a ton of memory, though, that feels like a bit of a deal breaker to using them. So, being the curious engineer that I am, I decided to test this out myself. I conducted these tests in a new Firefox profile, and I was not signed into Facebook (to try to replicate the experience Dietrich had).

One Like Button

For my first test, I had a very simple page for the default like social plugin pointing to my site.

like page result

One like button doesn’t seem to add much, which is good!

Two Like Buttons

The next test I tried was duplicating the like button so it showed up twice. This code is a bit naive, since it duplicates a <div> element with the same id and includes the JavaScript twice when it does not need to. However, it shows what someone who just copies and pastes will end up with, which I think is valuable.

like page (two button) result

As you can see, memory usage nearly doubled. This is a bit surprising, since the exact same JavaScript is included. I would not expect any additional shapes, but that number nearly doubles. scripts and mjit-code also double, and I would expect at least the latter not to.

A more interesting version of this test is to include the JavaScript only once, and just add one additional <fb:like> button that points at a different URL.

two like button test results

Interestingly, memory usage did not change significantly from the duplicate resource case! So, what exactly is going on here? This page ends up loading four additional resources:

File              HTTP Status  Size   MIME Type
all.js            304          143KB  application/x-javascript
login_status.php  200          58B    text/html
like.php          200          33KB   text/html
like.php          200          33KB   text/html

That is 209KB of HTML and JavaScript being sent for two like buttons. Something tells me that part of the problem here is that Facebook is sending more than it needs to (I did not look into exactly what was being sent). The good news is that the 143KB all.js comes from the browser’s cache.

Send Button

The last test I did was the send button pointing to my website.

send test results

Given that the like button test includes a send button as well, I’m not surprised to see that this used even less memory.

Summary

I think there are two problems here:

  1. Firefox should create fewer shapes and do a better job of not duplicating the same JavaScript code in a given compartment.
  2. Facebook needs to send less data down for its social plugins. I have a hard time believing that that much JavaScript is needed in order to display a like button, a share button, and the faces of your friends who have liked a page.

It’d be interesting to see how these numbers change when you are logged in, but I don’t have time to do that analysis. I’ve provided all the code and steps I used to get these results, so it shouldn’t be too hard for someone else to come along and do that if they are interested. Another interesting test would be to see how the Twitter and Google+ integrations break down too (but I leave that as an exercise for the reader).