Managed database without AWS: HA, failover and upgrades
TL;DR. “Managed database” (RDS, Azure Database, Cloud SQL) bundles far more than the nightly backup: automatic failover (Multi-AZ: a standby copy in a second data center that takes over automatically if the first fails), version upgrades and patching, read replicas, monitoring, connection management and storage autoscaling. Outside the hyperscalers, most of these are easy: read replicas, monitoring and pooling are solved with boring open-source tools. Two are genuinely hard: automatic failover and near-zero-downtime major upgrades. That’s where “managed” earns its money. So the honest answer for most small and scale-up teams isn’t “self-host everything”. It’s often a flat-rate managed Postgres (OVH, Scaleway, Aiven, Crunchy Bridge): you escape AWS’s metered pricing while keeping a fully managed database. You’re not leaving “the cloud” (OVH and Hetzner are cloud too); you’re leaving the pricing model of the hyperscalers (AWS, Azure, Google: the big three), and you don’t have to self-host the database to do it.
“Managed” is a bundle: unbundle it before you price it
Backups and point-in-time recovery are only one thing RDS does for you. The full bundle, and what each part takes to run yourself:
| Managed feature | Open-source / owned equivalent | How hard, honestly |
|---|---|---|
| Automated backups + PITR | pgBackRest (see Part 1) | 🟢 Easy — fully solved |
| Read replicas | native streaming replication | 🟢 Easy |
| Monitoring / Performance Insights | Prometheus postgres_exporter + Grafana, pgwatch, PgHero | 🟢 Easy |
| Connection management | PgBouncer / PgCat | 🟢 Easy |
| Parameter groups | postgresql.conf + config management | 🟢 Easy |
| Storage autoscaling | provision generously, grow ZFS/LVM, alert at 75% | 🟡 Moderate |
| Automatic failover (Multi-AZ high availability) | Patroni + etcd + HAProxy | 🔴 Real work — the hard part |
| Version upgrades / patching | pg_upgrade, logical-replication cutover, your own window | 🔴 Ongoing labour |
The green rows are an afternoon each. The red rows are the reason to think twice — so let’s be honest about them.
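The one yellow row, "alert at 75%", can start life as a small check run from cron. A minimal sketch, assuming the Postgres data directory sits on its own filesystem; the function names and the example path are illustrative, not taken from any tool in the table:

```python
import shutil

ALERT_THRESHOLD = 0.75  # the 75% line from the storage row in the table


def used_fraction(total_bytes: int, used_bytes: int) -> float:
    """Fraction of the filesystem in use, from 0.0 to 1.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def needs_alert(total_bytes: int, used_bytes: int,
                threshold: float = ALERT_THRESHOLD) -> bool:
    """True once usage reaches the threshold."""
    return used_fraction(total_bytes, used_bytes) >= threshold


def check_path(path: str) -> bool:
    """Check the filesystem holding `path` (for example your data directory)."""
    usage = shutil.disk_usage(path)
    return needs_alert(usage.total, usage.used)
```

Something like `check_path("/var/lib/postgresql")` (a hypothetical path) would then feed whatever alerting you already have. In practice the same threshold usually ends up as a Prometheus alert rule, but the logic is no more than this.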
The easy parts (don’t overthink these)
- Read replicas: Postgres has built-in streaming replication. Point a standby at the primary, it stays current, you read from it. Logical replication handles selective/cross-version cases. This is a config block, not a project.
- Monitoring: postgres_exporter → Prometheus → Grafana gives you the dashboards RDS Performance Insights does (slow queries, locks, cache hit rate, replication lag), and you own the data. PgHero is a one-container quick win; pgwatch is a fuller option.
- Connection pooling: PgBouncer in front of Postgres absorbs connection storms the way RDS Proxy does, on one small process.
- Parameters: RDS parameter groups are just postgresql.conf with a UI. Put the file in your config management and you have the same thing, version-controlled.
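Replication lag, the number you will watch most on a replica, is simple arithmetic on WAL positions. Postgres reports those positions as LSNs in the text form `high/low` (two hexadecimal numbers), for example from `pg_current_wal_lsn()` on the primary and `pg_last_wal_replay_lsn()` on the standby. A minimal sketch of the subtraction, with made-up sample values; the helper names are illustrative, and the server can do the same calculation itself with `pg_wal_lsn_diff()`:

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the upper 32 bits, the part after the lower 32.
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL the standby still has to replay (0 if it has caught up)."""
    return max(0, parse_lsn(primary_lsn) - parse_lsn(replay_lsn))


# Sample values, invented for illustration only.
lag = replication_lag_bytes("0/3000060", "0/3000000")
```

A lag that stays near zero means the replica is current; a lag that keeps growing is the thing to alert on.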
None of these is why people fear leaving RDS. The next two are.
The hard part 1: automatic failover (Multi-AZ HA)
RDS Multi-AZ keeps a synchronous standby in another availability zone and fails over automatically in ~60–120s if the primary dies. Reproducing that — not just having a replica, but promoting it safely and automatically without split-brain — is the real engineering.
The standard open-source answer is Patroni:
- Patroni runs alongside each Postgres node and manages replication + automatic failover, using a distributed consensus store (etcd, Consul, or ZooKeeper) to elect the primary and prevent split-brain.
- HAProxy (or a PgBouncer that is repointed on failover, for example by a Patroni callback script) sits in front and always routes writes to the current primary; HAProxy finds it by polling Patroni’s health endpoint.
- Spread the nodes across different datacentres/regions (OVH regions, Hetzner’s multiple DCs) to get true “Multi-AZ” resilience.
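To make the routing step concrete, here is roughly the question the proxy asks each node. A minimal sketch, assuming Patroni's REST API listens on port 8008 (the port used in Patroni's sample configs) and that its `/primary` endpoint answers 200 only on the current leader. The node names and the `find_primary` helper are hypothetical; in production HAProxy's own HTTP health check does this job:

```python
from typing import Callable, Iterable, Optional
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def http_status(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status for `url`, or None if the node is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # Non-2xx answers (such as 503 from a replica) still carry a status code.
        return err.code
    except (URLError, OSError):
        return None


def find_primary(nodes: Iterable[str],
                 probe: Callable[[str], Optional[int]] = http_status
                 ) -> Optional[str]:
    """Return the single node whose /primary check answers 200.

    Returns None when no node, or more than one node, claims to be primary:
    refusing to route writes is safer than guessing.
    """
    claiming = [n for n in nodes if probe(f"http://{n}:8008/primary") == 200]
    return claiming[0] if len(claiming) == 1 else None
```

The `probe` parameter exists so the logic can be exercised without a live cluster, which is also a cheap way to rehearse what your tooling should do when two nodes disagree.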
Lighter options if full Patroni is more than you need: pg_auto_failover (simpler, a monitor + two nodes) or repmgr (replication + assisted failover, less hands-off). For MySQL/MariaDB the equivalents are Galera (synchronous multi-master), Orchestrator + ProxySQL, or InnoDB Cluster (Group Replication + MySQL Router).
The honest caveats: a consensus store you must run and keep quorum on; synchronous replication adds write latency; and a misconfigured failover can cause split-brain or data loss. This is doable and well-documented — it is not a config block. Budget real setup and a tested failover drill, or use a provider that runs it for you.
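"Keep quorum" in the first caveat is plain majority arithmetic, and it explains why consensus stores such as etcd are normally run with an odd number of members. A small sketch (the function names are illustrative):

```python
def quorum(members: int) -> int:
    """Votes needed for a majority in a cluster of `members` voters."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority still exists."""
    return members - quorum(members)
```

Three members tolerate one failure, and so do four, so the fourth node buys no extra resilience. Two members tolerate none, which is why a two-node consensus store is no safer than a single node.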
The hard part 2: version upgrades and patching
RDS applies minor patches in your maintenance window and offers (near) one-click major upgrades. Self-hosted:
- Minor upgrades (e.g. 16.3 → 16.4): a package update + restart. Easy, but you schedule and own the window.
- Major upgrades (e.g. 16 → 17): pg_upgrade does an in-place upgrade with brief downtime; a near-zero-downtime path uses logical replication to a new-version node, then a cutover. That logical-replication dance is real work and needs rehearsal.
- OS patching, security CVEs: also yours now. Unattended-upgrades + a reboot policy covers most of it.
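Which of those paths a given upgrade needs can be read straight off the version numbers. A minimal sketch, assuming the two-part numbering PostgreSQL has used since version 10 (first number is the major release, second is the minor); the helper names are illustrative, and older three-part versions such as 9.6.1 are deliberately not handled:

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Split '16.4' into (16, 4); a bare '17' is read as (17, 0)."""
    major, _, minor = version.partition(".")
    return int(major), int(minor or 0)


def upgrade_kind(current: str, target: str) -> str:
    """Classify an upgrade as 'minor', 'major', or 'none'."""
    cur, tgt = parse_version(current), parse_version(target)
    if tgt <= cur:
        return "none"  # same version or a downgrade: nothing to plan
    # Same major number: package update + restart.
    # Different major number: pg_upgrade or a logical-replication cutover.
    return "minor" if tgt[0] == cur[0] else "major"
```

A check like this is only useful as a guard in automation: a "major" result should stop an unattended package update and route the change into a rehearsed upgrade window.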
This isn’t hard so much as never finished — it’s standing operational responsibility. If nobody on the team will own a quarterly patch + upgrade rhythm, that’s a strong signal to stay managed.
The honest middle path: flat-rate managed Postgres
Here’s the move most “leave AWS” pitches miss. You can escape AWS’s metered pricing without self-hosting your database — by moving it to a flat-rate managed Postgres from a non-hyperscaler (still managed, still cloud, just not billed the hyperscaler way):
- OVHcloud Managed Databases, Scaleway Managed Database — EU, predictable pricing, HA included.
- Aiven (EU-based), Crunchy Bridge, Timescale Cloud — managed Postgres specialists, run it anywhere.
- Neon / Supabase — managed Postgres with modern DX if that fits.
They handle automatic failover, upgrades and backups, at flat, predictable rates. Two clarifications, because the cost story is easy to get wrong:
- Your database is probably not your egress problem. Inside one AWS region, app↔database traffic is not billed at the $0.09/GB internet rate — so moving the DB saves little internet egress. The big egress line lives in your serving traffic (downloads, media, API responses, object storage) going out to users; that’s what you move to owned/flat infrastructure for the egress win.
- The DB savings come from elsewhere: flat, predictable instance + HA pricing instead of RDS’s premium and Multi-AZ instance doubling, no surprise storage/IOPS scaling, and sidestepping the cross-AZ / cross-region transfer fees AWS charges when your app and database (or replicas) span zones or regions.
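The comparison in the second point is a few lines of arithmetic once you pull the inputs from your own bill. A rough sketch: every number you feed it is yours to supply, the class and field names are made up for illustration, and it models only the items named above (instance doubling under Multi-AZ, cross-AZ transfer, storage/IOPS):

```python
from dataclasses import dataclass


@dataclass
class RdsLine:
    instance_monthly: float  # single-AZ instance price per month
    multi_az: bool           # Multi-AZ doubles the instance line
    cross_az_gb: float       # app/replica traffic crossing zones, GB per month
    cross_az_rate: float     # price per GB for that traffic, from your bill
    storage_monthly: float   # storage + IOPS per month

    def total(self) -> float:
        """Monthly database cost under the simplified model above."""
        instance = self.instance_monthly * (2 if self.multi_az else 1)
        transfer = self.cross_az_gb * self.cross_az_rate
        return instance + transfer + self.storage_monthly


def monthly_saving(rds: RdsLine, flat_rate_monthly: float) -> float:
    """Positive when the flat-rate offer is cheaper than the RDS line."""
    return rds.total() - flat_rate_monthly
```

Note what is absent: there is no internet-egress term, because that cost belongs to your serving traffic and is priced separately.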
So solve each problem where it’s cheapest: owned/flat infrastructure for the egress-heavy traffic, a flat-rate managed database for predictable DB cost — without self-running HA. This is usually the highest-ROI answer for a small team: it captures most of the savings (which live in egress and serving, not the DB) without taking on the one operational burden — HA failover — that bites hardest when you’re understaffed.
Decision guide
| Your situation | Best call |
|---|---|
| Real ops skill, want full control, DB big enough to justify | Self-host with Patroni (+ pgBackRest) |
| Want to escape AWS egress/metering, but not run HA yourself | Flat-rate managed Postgres (OVH/Scaleway/Aiven/Crunchy) ← most teams |
| Tiny team, zero ops capacity, DB deeply wired into other AWS services | Keep RDS — the labour saved is worth the premium, for now |
Notice the egress math is the same in all three: the database is seldom where your cloud bill bleeds. So you can move the bleeding parts off AWS and keep a managed database — the two decisions are independent.
This applies to Azure and GCP too
The bundle is identical across hyperscalers: Azure Database for PostgreSQL / Flexible Server and GCP Cloud SQL / AlloyDB sell the same managed failover, upgrades, replicas and monitoring. So the same unbundling — easy parts you self-run, hard parts you either engineer with Patroni or buy from a flat-rate specialist — works whether you’re leaving AWS, Azure, or Google.
Quick wins you can do this week
- Unbundle your own RDS. List which managed features you actually rely on. Most teams use backups + one replica + monitoring — all green-row easy.
- Stand up monitoring first (postgres_exporter + Grafana) so you can see your database before you move it.
- Price a flat-rate managed Postgres (OVH/Scaleway/Aiven) against your RDS line, including the Multi-AZ instance doubling and any cross-AZ/cross-region transfer (not internet egress; that lives in your serving traffic).
- Only commit to self-hosted Patroni if someone will own failover drills and the upgrade rhythm. Otherwise, buy that part.
Whether to self-host your database, move it to a flat-rate managed provider, or leave it where it is: that is exactly the kind of honest call a Cloud-Exit Assessment makes, with the real numbers and where the savings actually live. Or send me your cloud bill and I’ll show you which lines are the database (often not the problem) versus egress (usually the problem), free, within 24 hours. Read by me, never shared.
Sources
- Patroni (HA templates, DCS-backed automatic failover): patroni.readthedocs.io; pg_auto_failover: pg-auto-failover.readthedocs.io; repmgr: repmgr.org.
- PostgreSQL streaming + logical replication, pg_upgrade: the PostgreSQL documentation.
- PgBouncer / PgCat; postgres_exporter + Prometheus/Grafana; PgHero, pgwatch: the respective project docs.
- MySQL/MariaDB HA (Galera Cluster, Orchestrator, ProxySQL, MySQL InnoDB Cluster): respective docs.
- Managed Postgres, flat-rate / non-hyperscaler (OVHcloud Managed Databases, Scaleway Managed Database, Aiven, Crunchy Bridge, Timescale Cloud, Neon, Supabase): provider pricing pages.
- Related articles: the backups half of this story (RDS backups + EBS snapshots without AWS) and the full service map (open-source AWS managed-service alternatives).