From d6ee993a60f1befccb5e43eed92841ef7daae594 Mon Sep 17 00:00:00 2001 From: gilgamezh Date: Sun, 31 May 2026 11:15:54 +0200 Subject: [PATCH] docs: add CloudNativePG migration TODO for postgresql Plan to move both Bitnami postgres instances (pgsql PG16 in default, gitea-postgresql PG17 bundled in gitea) to CloudNativePG, since Bitnami images are frozen (bitnamilegacy). Not executed -- planning doc only. Co-Authored-By: Claude Opus 4.8 --- TODO-cloudnativepg-migration.md | 174 ++++++++++++++++++++++++++++++++ 1 file changed, 174 insertions(+) create mode 100644 TODO-cloudnativepg-migration.md diff --git a/TODO-cloudnativepg-migration.md b/TODO-cloudnativepg-migration.md new file mode 100644 index 0000000..ed34923 --- /dev/null +++ b/TODO-cloudnativepg-migration.md @@ -0,0 +1,174 @@ +# TODO: Migrate PostgreSQL to CloudNativePG + +**Status:** planned, not started +**Target:** [CloudNativePG](https://cloudnative-pg.io/) (CNPG operator) +**Created:** 2026-05-31 + +## Why + +Both Postgres instances run **Bitnami** images, which Bitnami froze in +Aug 2025 (free images moved to the `bitnamilegacy` archive and stopped +getting updates / security patches): + +| Instance | Namespace | Version | Image | How deployed | Consumers | +|---|---|---|---|---|---| +| `pgsql` | `default` | PG **16.2** | `docker.io/bitnami/postgresql:16.2.0-debian-12-r8` (non-legacy, tag effectively gone from the registry) | standalone helm release `pgsql` (chart `postgresql-15.1.2`) | lidarr, radarr, sonarr, prowlarr, alhfmf | +| `gitea-postgresql` | `gitea` | PG **17.6** | `bitnamilegacy/postgresql:17.6.0-debian-12-r4` | **bundled subchart** inside the `gitea` helm release | gitea only | + +CloudNativePG is the chosen replacement: actively maintained, operator-managed +HA/backups, first-class `pg_dump`-based import for migration, and not tied to +Bitnami. + +> ⚠️ `pgsql` is on the **non-legacy** `bitnami/` path whose `16.2.0` tag is no +> longer pullable — if that pod is ever rescheduled to a node without the image +> cached, it will **fail to start**. Treat instance A as the higher priority. + +## Databases to migrate + +- **pgsql (PG16):** `alhfmf`, `lidarr_db`, `lidarr_db_log`, `radarr_db`, + `radarr_db_log`, `sonarr_db`, `sonarr_db_log`, `prowlarr_db`, `prowlarr_db_log` + (confirm full list at cutover with `\l`). +- **gitea (PG17):** `gitea` (owner role `gitea`). + +--- + +## 0. Prerequisites + +- [ ] Install the CNPG operator (pin a recent stable version): + ```bash + helm repo add cnpg https://cloudnative-pg.github.io/charts + helm repo update cnpg + helm upgrade --install cnpg cnpg/cloudnative-pg \ + -n cnpg-system --create-namespace --wait + kubectl get deploy -n cnpg-system # operator Running + ``` +- [ ] Decide on storage: reuse `nfs-client` StorageClass, **or** prefer local + storage for the DB PVCs (Postgres on NFS is workable but not ideal; + CNPG defaults to RWO). Pick a `storageClass` + size per cluster below. +- [ ] Decide CNPG topology: homelab can run `instances: 1` (no HA) to keep it + light on the Pi nodes; bump to 2–3 later if desired. +- [ ] Take a manual backup of both instances first (safety net): + ```bash + kubectl exec -n default pgsql-postgresql-0 -- \ + bash -c 'PGPASSWORD=$POSTGRES_PASSWORD pg_dumpall -U postgres' > pgsql-all.sql + kubectl exec -n gitea gitea-postgresql-0 -- \ + bash -c 'PGPASSWORD=$POSTGRES_PASSWORD pg_dumpall -U postgres' > gitea-all.sql + ``` + +--- + +## Instance A — `pgsql` (media *arr + alhfmf), PG16 + +CNPG can pull data straight from the old server at bootstrap via +`initdb.import` (it runs `pg_dump`/`pg_restore` for you). Use **monolith** type +to copy all listed databases + roles in one shot. + +- [ ] Keep the old `pgsql` release **running** during migration (read source). +- [ ] Create a secret with the old server's superuser creds for the importer: + ```bash + # password: kubectl get secret pgsql-postgresql -n default -o jsonpath='{.data.postgres-password}' | base64 -d + kubectl create secret generic pg16-source-superuser -n default \ + --from-literal=username=postgres --from-literal=password='' + ``` +- [ ] Apply a CNPG `Cluster` with import bootstrap (sketch — tune names/size): + ```yaml + apiVersion: postgresql.cnpg.io/v1 + kind: Cluster + metadata: + name: pg16 + namespace: default + spec: + instances: 1 + imageName: ghcr.io/cloudnative-pg/postgresql:16 # match major 16 + storage: + size: 10Gi + storageClass: nfs-client # or local SC — see prereqs + bootstrap: + initdb: + import: + type: monolith + databases: ["alhfmf","lidarr_db","lidarr_db_log","radarr_db","radarr_db_log","sonarr_db","sonarr_db_log","prowlarr_db","prowlarr_db_log"] + roles: ["*"] + source: + externalCluster: old-pgsql + externalClusters: + - name: old-pgsql + connectionParameters: + host: pgsql-postgresql.default.svc.cluster.local + user: postgres + dbname: postgres + password: + name: pg16-source-superuser + key: password + ``` +- [ ] Wait for import to finish: `kubectl get cluster pg16 -n default -w` + (phase → `Cluster in healthy state`). Check logs of the bootstrap job. +- [ ] Verify row counts / `\dt` in a couple of DBs vs. the old server. +- [ ] **Cutover:** the new service is `pg16-rw.default.svc.cluster.local:5432`. + Update each consumer's DB host: + - [ ] lidarr — `custom_helm_charts/lidarr` / `helm-values/lidarr_values.yaml` + - [ ] radarr — `helm-values/radarr_values.yaml` + - [ ] sonarr — `helm-values/sonarr_values.yaml` + - [ ] prowlarr — `custom_helm_charts/prowlarr` / `helm-values/prowlarr_values.yml` + - [ ] alhfmf — `alhfmf` release (find its DB env/secret) + (the *arr apps store DB host/creds in their `config.xml`/Postgres env — + confirm where each one is configured before flipping.) + Commit + push so Argo redeploys; verify each app reconnects. +- [ ] Soak for a few days. +- [ ] Decommission old: `helm uninstall pgsql -n default`, delete its PVC + (`data-pgsql-postgresql-0`) and `resources/pgsql_persistent_volume.yml`, + remove `non_argo_values/pgsql_values.yaml`, drop `pg16-source-superuser`. + +--- + +## Instance B — `gitea` DB, PG17 + +Trickier because Postgres is a **subchart of the gitea release**, so it must be +externalized from the gitea chart. + +- [ ] Stand up a CNPG `Cluster` `pg17` in the `gitea` ns, PG major **17** + (`imageName: ghcr.io/cloudnative-pg/postgresql:17`), same import approach: + `databases: ["gitea"]`, `roles: ["*"]`, source = + `gitea-postgresql.gitea.svc.cluster.local`, superuser secret from + `kubectl get secret gitea-postgresql -n gitea -o jsonpath='{.data.postgres-password}'`. +- [ ] **Quiesce gitea during final cutover** (scale gitea deploy to 0) so the + source DB is static, then run/refresh the import (or do a final `pg_dump` + of `gitea` and restore into `pg17`) to avoid losing writes. +- [ ] Edit `non_argo_values/gitea_values.yaml`: + - set `postgresql.enabled: false` (drop the bundled subchart) + - point `gitea.config.database` at the CNPG service: + ```yaml + gitea: + config: + database: + DB_TYPE: postgres + HOST: pg17-rw.gitea.svc.cluster.local:5432 + NAME: gitea + USER: gitea + PASSWD: + ``` + (CNPG creates an app user/secret; either reuse the migrated `gitea` role or + wire gitea to the `pg17-app` secret.) +- [ ] `helm upgrade gitea ...` (Recreate strategy already set). Bring gitea back + up; verify login + that the GitOps repo (`admin/turingpi.git`) is intact — + **Argo depends on this**, so validate before moving on. +- [ ] Decommission: once `postgresql.enabled: false` is live, the old + `gitea-postgresql` StatefulSet is removed by the chart; delete leftover + PVC `data-gitea-postgresql-0` after confirming the new DB is good. + +--- + +## Rollback + +- Old instances stay intact until the explicit decommission steps — to roll + back, just point the consumer/gitea config back at the original + `*-postgresql` service. Keep the `pg_dumpall` files until both soaks pass. + +## Open questions / decisions + +- [ ] NFS vs local storage for DB PVCs (perf + RWO behavior on node loss). +- [ ] HA (`instances: 1` vs `3`) given Pi resource limits. +- [ ] Backups: configure CNPG scheduled backups (Barman to NFS or object store) + as part of this — replaces whatever (if any) backup the Bitnami charts did. +- [ ] Confirm exact per-app DB connection config for the 5 consumers of `pgsql` + before cutover (where each *arr stores its Postgres host/creds).