turingpi/TODO-cloudnativepg-migration.md
gilgamezh d6ee993a60 docs: add CloudNativePG migration TODO for postgresql
Plan to move both Bitnami postgres instances (pgsql PG16 in default,
gitea-postgresql PG17 bundled in gitea) to CloudNativePG, since Bitnami
images are frozen (bitnamilegacy). Not executed -- planning doc only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 11:15:54 +02:00


TODO: Migrate PostgreSQL to CloudNativePG

Status: planned, not started
Target: CloudNativePG (CNPG operator)
Created: 2026-05-31

Why

Both Postgres instances run Bitnami images, which Bitnami froze in Aug 2025 (free images moved to the bitnamilegacy archive and stopped getting updates / security patches):

| Instance | Namespace | Version | Image | How deployed | Consumers |
| --- | --- | --- | --- | --- | --- |
| pgsql | default | PG 16.2 | docker.io/bitnami/postgresql:16.2.0-debian-12-r8 (non-legacy, tag effectively gone from the registry) | standalone helm release pgsql (chart postgresql-15.1.2) | lidarr, radarr, sonarr, prowlarr, alhfmf |
| gitea-postgresql | gitea | PG 17.6 | bitnamilegacy/postgresql:17.6.0-debian-12-r4 | bundled subchart inside the gitea helm release | gitea only |

CloudNativePG is the chosen replacement: actively maintained, operator-managed HA/backups, first-class pg_dump-based import for migration, and not tied to Bitnami.

⚠️ pgsql is on the non-legacy bitnami/ path, whose 16.2.0 tag is no longer pullable. If that pod is ever rescheduled to a node without the image cached, it will fail to start. Treat pgsql (Instance A below) as the higher priority.
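
To gauge how exposed pgsql is to a reschedule, one could list which nodes still report the image in their local cache. The sketch below is a hypothetical helper (nodes_with_image is not part of any tool) and assumes kubectl access to the cluster. It reads the image list in each node's status, which the kubelet truncates to a configurable maximum, so an empty result is a hint rather than proof that the image is gone.

```shell
# Hypothetical helper: print the names of nodes whose reported image list
# contains the given image reference (plain substring match).
nodes_with_image() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.images[*].names[*]}{"\n"}{end}' |
    awk -F '\t' -v img="$1" 'index($2, img) { print $1 }'
}

# Example (not run here):
#   nodes_with_image bitnami/postgresql:16.2.0-debian-12-r8
#   kubectl get pod pgsql-postgresql-0 -n default -o wide   # node it runs on now
```

If the only node listed is the one the pod already runs on, any reschedule is likely to fail.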

Databases to migrate

  • pgsql (PG16): alhfmf, lidarr_db, lidarr_db_log, radarr_db, radarr_db_log, sonarr_db, sonarr_db_log, prowlarr_db, prowlarr_db_log (confirm full list at cutover with \l).
  • gitea (PG17): gitea (owner role gitea).

0. Prerequisites

  • Install the CNPG operator (pin a recent stable version):
    helm repo add cnpg https://cloudnative-pg.github.io/charts
    helm repo update cnpg
    helm upgrade --install cnpg cnpg/cloudnative-pg \
      -n cnpg-system --create-namespace --wait
    kubectl get deploy -n cnpg-system   # operator Running
    
  • Decide on storage: reuse nfs-client StorageClass, or prefer local storage for the DB PVCs (Postgres on NFS is workable but not ideal; CNPG defaults to RWO). Pick a storageClass + size per cluster below.
  • Decide CNPG topology: a homelab can run instances: 1 (no HA) to keep it light on the Pi nodes; bump to 2 or 3 later if desired.
  • Take a manual backup of both instances first (safety net):
    kubectl exec -n default pgsql-postgresql-0 -- \
      bash -c 'PGPASSWORD=$POSTGRES_PASSWORD pg_dumpall -U postgres' > pgsql-all.sql
    kubectl exec -n gitea gitea-postgresql-0 -- \
      bash -c 'PGPASSWORD=$POSTGRES_PASSWORD pg_dumpall -U postgres' > gitea-all.sql
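
Because the dumps above are written through a kubectl exec redirect, a dropped connection can leave a truncated file that still looks plausible. A minimal sketch of a sanity check, assuming the dumps are plain-text pg_dumpall output (which ends with a "PostgreSQL database cluster dump complete" comment); dump_looks_complete is a hypothetical helper name:

```shell
# Return success only if the file is non-empty and its last lines contain
# the trailer comment that pg_dumpall writes after the final database.
dump_looks_complete() {
  [ -s "$1" ] && tail -n 5 "$1" | grep -q 'PostgreSQL database cluster dump complete'
}

# Example (not run here):
#   for f in pgsql-all.sql gitea-all.sql; do
#     if dump_looks_complete "$f"; then echo "$f ok"; else echo "$f looks truncated"; fi
#   done
```

This only checks that the dump ran to the end; it does not validate the contents.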
    

Instance A — pgsql (media *arr + alhfmf), PG16

CNPG can pull data straight from the old server at bootstrap via initdb.import (it runs pg_dump/pg_restore for you). Use monolith type to copy all listed databases + roles in one shot.

  • Keep the old pgsql release running during migration (read source).
  • Create a secret with the old server's superuser creds for the importer:
    # password: kubectl get secret pgsql-postgresql -n default -o jsonpath='{.data.postgres-password}' | base64 -d
    kubectl create secret generic pg16-source-superuser -n default \
      --from-literal=username=postgres --from-literal=password='<OLD_PW>'
    
  • Apply a CNPG Cluster with import bootstrap (sketch — tune names/size):
    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: pg16
      namespace: default
    spec:
      instances: 1
      imageName: ghcr.io/cloudnative-pg/postgresql:16   # match major 16
      storage:
        size: 10Gi
        storageClass: nfs-client            # or local SC — see prereqs
      bootstrap:
        initdb:
          import:
            type: monolith
            databases: ["alhfmf","lidarr_db","lidarr_db_log","radarr_db","radarr_db_log","sonarr_db","sonarr_db_log","prowlarr_db","prowlarr_db_log"]
            roles: ["*"]
            source:
              externalCluster: old-pgsql
      externalClusters:
        - name: old-pgsql
          connectionParameters:
            host: pgsql-postgresql.default.svc.cluster.local
            user: postgres
            dbname: postgres
          password:
            name: pg16-source-superuser
            key: password
    
  • Wait for import to finish: kubectl get cluster pg16 -n default -w (phase → Cluster in healthy state). Check logs of the bootstrap job.
  • Verify row counts / \dt in a couple of DBs vs. the old server.
  • Cutover: the new service is pg16-rw.default.svc.cluster.local:5432. Update each consumer's DB host:
    • lidarr: custom_helm_charts/lidarr / helm-values/lidarr_values.yaml
    • radarr: helm-values/radarr_values.yaml
    • sonarr: helm-values/sonarr_values.yaml
    • prowlarr: custom_helm_charts/prowlarr / helm-values/prowlarr_values.yml
    • alhfmf: alhfmf release (find its DB env/secret)
    (The *arr apps store DB host/creds in their config.xml/Postgres env; confirm where each one is configured before flipping.)
    Commit + push so Argo redeploys; verify each app reconnects.
  • Soak for a few days.
  • Decommission old: helm uninstall pgsql -n default, delete its PVC (data-pgsql-postgresql-0) and resources/pgsql_persistent_volume.yml, remove non_argo_values/pgsql_values.yaml, drop pg16-source-superuser.
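
The "verify row counts" step above can be scripted rather than eyeballed. The sketch below is one possible approach, not part of CNPG: it assumes both servers have XML support compiled in (needed for query_to_xml), that psql inside the CNPG pod can connect locally as postgres without a password, and that the first CNPG pod is named pg16-1. The helper names table_counts and compare_counts are hypothetical.

```shell
# SQL printing "schema.table|exact_row_count" for every user table in the
# current database. query_to_xml runs count(*) per table: exact, but slow
# on large tables.
COUNT_SQL="SELECT table_schema || '.' || table_name,
       (xpath('/row/c/text()',
              query_to_xml(format('SELECT count(*) AS c FROM %I.%I',
                                  table_schema, table_name),
                           false, true, '')))[1]::text::bigint
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY 1;"

# Feed COUNT_SQL on stdin to whatever psql invocation is passed as arguments.
table_counts() {
  printf '%s\n' "$COUNT_SQL" | "$@"
}

# Succeed only when both count files are identical; otherwise print the diff.
compare_counts() {
  diff "$1" "$2" && echo "row counts match"
}

# Example for one database (not run here):
#   table_counts kubectl exec -i -n default pgsql-postgresql-0 -- \
#     bash -c 'PGPASSWORD=$POSTGRES_PASSWORD psql -U postgres -At -d lidarr_db' > old.txt
#   table_counts kubectl exec -i -n default pg16-1 -- psql -At -d lidarr_db > new.txt
#   compare_counts old.txt new.txt
```

Run it while the consumers are stopped or idle; otherwise counts on the old server can drift between the two queries.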

Instance B — gitea DB, PG17

Trickier because Postgres is a subchart of the gitea release, so it must be externalized from the gitea chart.

  • Stand up a CNPG Cluster pg17 in the gitea ns, PG major 17 (imageName: ghcr.io/cloudnative-pg/postgresql:17), same import approach: databases: ["gitea"], roles: ["*"], source = gitea-postgresql.gitea.svc.cluster.local, superuser secret from kubectl get secret gitea-postgresql -n gitea -o jsonpath='{.data.postgres-password}'.
  • Quiesce gitea during final cutover (scale gitea deploy to 0) so the source DB is static, then run/refresh the import (or do a final pg_dump of gitea and restore into pg17) to avoid losing writes.
  • Edit non_argo_values/gitea_values.yaml:
    • set postgresql.enabled: false (drop the bundled subchart)
    • point gitea.config.database at the CNPG service:
      gitea:
        config:
          database:
            DB_TYPE: postgres
            HOST: pg17-rw.gitea.svc.cluster.local:5432
            NAME: gitea
            USER: gitea
            PASSWD: <from CNPG-managed secret pg17-app>
      
      (CNPG creates an app user/secret; either reuse the migrated gitea role or wire gitea to the pg17-app secret.)
  • helm upgrade gitea ... (Recreate strategy already set). Bring gitea back up; verify login + that the GitOps repo (admin/turingpi.git) is intact — Argo depends on this, so validate before moving on.
  • Decommission: once postgresql.enabled: false is live, the old gitea-postgresql StatefulSet is removed by the chart; delete leftover PVC data-gitea-postgresql-0 after confirming the new DB is good.
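
Both instances' steps involve reading passwords out of Kubernetes Secrets (the old Bitnami superuser password, and later the CNPG-managed pg17-app credentials). A small wrapper can avoid pasting values by hand. This is a sketch with a hypothetical helper name, secret_field; the username and password key names in pg17-app are assumed from CNPG's usual app secret layout and should be confirmed with kubectl describe secret. Keys containing dots would need different jsonpath quoting.

```shell
# Hypothetical helper: print one decoded field of a Kubernetes Secret.
# $1 = namespace, $2 = secret name, $3 = key under .data
secret_field() {
  kubectl get secret "$2" -n "$1" -o "jsonpath={.data.$3}" | base64 -d
}

# Example (not run here):
#   secret_field gitea gitea-postgresql postgres-password   # old superuser password
#   secret_field gitea pg17-app username                    # CNPG app user (key name assumed)
#   secret_field gitea pg17-app password
```

The same helper works for Instance A, for example to fill the password literal when creating pg16-source-superuser.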

Rollback

  • Old instances stay intact until the explicit decommission steps — to roll back, just point the consumer/gitea config back at the original *-postgresql service. Keep the pg_dumpall files until both soaks pass.

Open questions / decisions

  • NFS vs local storage for DB PVCs (perf + RWO behavior on node loss).
  • HA (instances: 1 vs 3) given Pi resource limits.
  • Backups: configure CNPG scheduled backups (Barman to NFS or object store) as part of this — replaces whatever (if any) backup the Bitnami charts did.
  • Confirm exact per-app DB connection config for the 5 consumers of pgsql before cutover (where each *arr stores its Postgres host/creds).