SmallestBusiness Less lock-in, simpler tools, predictable cost

CloudWatch costs too much: the self-hosted stack that cuts it ~90%

TL;DR. CloudWatch is priced so that watching your servers can cost as much as running them. Log ingestion is about $0.50/GB, every custom metric is about $0.30/month, and ad-hoc log queries (Logs Insights) are billed per GB scanned. At a few terabytes of logs a month, that is thousands of dollars just to see what your system is doing. The open-source replacement — Prometheus + Grafana + Loki (or the leaner VictoriaMetrics / VictoriaLogs) on one flat-rate box — does the same job for roughly 90% less, and you stop being billed by the gigabyte for your own telemetry. This is usually the highest-percentage saving in a whole cloud-exit. Below: what CloudWatch actually costs at scale, the stack that replaces it, and the honest cases where you should keep some CloudWatch anyway.


Why CloudWatch surprises people

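Monitoring feels like it should be cheap; it is “just logs and numbers.” CloudWatch’s pricing model turns that intuition upside down, because it meters the three things that grow fastest in a busy system: log volume ingested (about $0.50/GB), the number of custom metrics (about $0.30 each per month), and the data scanned by ad-hoc Logs Insights queries (billed per GB).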

None of these are abusive on their own. Together, on a system at real scale, they produce a monitoring bill that routinely lands in the thousands per month — to observe infrastructure that may itself cost less than that.

What CloudWatch costs at scale

Approximate monthly cost of log ingestion alone, before metrics, queries, dashboards or alarms. Compared with a self-hosted stack on one flat-rate box. Prices approximate, June 2026 — verify before quoting.

| Logs ingested / month | CloudWatch ingest | Self-hosted (box + ops) |
| --- | --- | --- |
| 100 GB | ~$50 | rent of one box (~$50–90) |
| 500 GB | ~$250 | rent only |
| 1 TB | ~$500 | rent only |
| 5 TB | ~$2,500 | rent + ops time |
| 10 TB | ~$5,000 | rent + ops time |
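
The CloudWatch column is plain multiplication, so it is easy to rerun with your own volume. A minimal sketch, assuming the article's approximate flat rate of $0.50/GB and 1 TB = 1,000 GB; the function name is mine, and real pricing varies by region and log class.

```python
# Assumed flat rate from the article: about $0.50 per GB ingested.
INGEST_USD_PER_GB = 0.50


def cloudwatch_ingest_cost(gb_per_month: float,
                           usd_per_gb: float = INGEST_USD_PER_GB) -> float:
    """Approximate monthly CloudWatch log-ingestion cost in USD."""
    if gb_per_month < 0:
        raise ValueError("gb_per_month must be non-negative")
    return gb_per_month * usd_per_gb


# Volumes from the table, treating 1 TB as 1,000 GB to match its round numbers.
VOLUMES_GB = [100, 500, 1_000, 5_000, 10_000]

for gb in VOLUMES_GB:
    print(f"{gb:>6} GB/month -> ~${cloudwatch_ingest_cost(gb):,.0f}")
```

Swap in your own monthly volume and the rate from the current AWS price list before quoting the result to anyone.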

Now add the parts that usually dominate the real bill:

| Add-on | CloudWatch | Self-hosted |
| --- | --- | --- |
| 1,000 custom metrics | ~$300/mo | $0 (Prometheus scrapes for free) |
| 10,000 custom metrics (cardinality) | ~$3,000/mo | $0 |
| Ad-hoc log queries (2 TB scanned/mo) | ~$10/mo + re-scans | $0 (Loki/VictoriaLogs query for free) |
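
The "cardinality" row is where bills tend to run away, because CloudWatch counts every unique combination of metric name and dimension values as a separate custom metric. A rough sketch of that arithmetic, assuming the article's ~$0.30 per metric and the ~$0.005 per GB scanned implied by the query row; the dimension names and counts are hypothetical.

```python
from math import prod

METRIC_USD_PER_MONTH = 0.30  # article's approximate price per custom metric
SCAN_USD_PER_GB = 0.005      # implied by the table: ~$10 for 2 TB scanned


def metric_count(distinct_values_per_dimension: dict[str, int]) -> int:
    """Upper bound on distinct metrics for one metric name.

    Each unique combination of dimension values counts as its own metric,
    so the worst case is the product of the distinct values per dimension.
    """
    return prod(distinct_values_per_dimension.values())


def custom_metric_cost(n_metrics: int,
                       usd_per_metric: float = METRIC_USD_PER_MONTH) -> float:
    """Approximate monthly cost of n custom metrics in USD."""
    return n_metrics * usd_per_metric


def query_cost(gb_scanned: float, usd_per_gb: float = SCAN_USD_PER_GB) -> float:
    """Approximate cost of scanning gb_scanned with ad-hoc log queries."""
    return gb_scanned * usd_per_gb


# Hypothetical latency metric tagged by service, endpoint and status class.
dimensions = {"service": 10, "endpoint": 50, "status_class": 5}
n = metric_count(dimensions)
print(f"{n} metrics -> ~${custom_metric_cost(n):,.0f}/month")
print(f"2 TB scanned -> ~${query_cost(2_000):,.0f} per full pass")
```

One innocent-looking metric with three dimensions is already a quarter of the 10,000-metric row. The product is a ceiling, not a forecast: only combinations that actually report data count.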

A mid-size system easily reaches $1,000–4,000/month on CloudWatch. The same observability on one Hetzner/OVH box running Prometheus + Grafana + Loki costs the rent of the box — call it $50–90 plus the ops time to run it — a ~90% cut, often more as metric cardinality grows.

The honest footnote. That box is not free. Someone configures the stack, sets retention, and keeps it patched. Priced at engineer rates, a self-run observability stack is realistically $200–500/month all-in. It still beats four-figure CloudWatch bills decisively; the point is to compare against the honest number, not a fantasy “$0.”
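
To compare against the honest number rather than the rent alone, run the percentage both ways. A small sketch using only the ranges quoted in this article; the scenario labels and the function name are mine.

```python
def saving_pct(cloudwatch_usd: float, self_hosted_usd: float) -> float:
    """Percentage saved by moving from the CloudWatch bill to the self-hosted cost."""
    if cloudwatch_usd <= 0:
        raise ValueError("cloudwatch_usd must be positive")
    return 100.0 * (cloudwatch_usd - self_hosted_usd) / cloudwatch_usd


# (label, monthly CloudWatch bill, monthly self-hosted cost), all in USD.
SCENARIOS = [
    ("low bill, rent only", 1_000, 90),
    ("high bill, rent only", 4_000, 50),
    ("low bill, all-in with ops", 1_000, 500),
    ("high bill, all-in with ops", 4_000, 200),
]

for label, cloudwatch, self_hosted in SCENARIOS:
    print(f"{label:<28} {saving_pct(cloudwatch, self_hosted):5.1f}% saved")
```

The spread between scenarios is the reason to plug in your own bill: the saving is largest where the CloudWatch bill is high, and smaller when a bill near the low end is set against the top of the all-in range.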

The stack that replaces it

| Tool | Replaces | Notes |
| --- | --- | --- |
| Prometheus | CloudWatch custom metrics | Pull-based metrics, the de-facto standard. Scrapes exporters; no per-metric charge. |
| Grafana | CloudWatch dashboards | Dashboards + alerting UI over the data stores in this table. Far better visualisations than CloudWatch. |
| Loki | CloudWatch Logs | Log aggregation that indexes labels, not full text: cheap to store, fast to query. |
| Alloy / Vector / Fluent Bit | the CloudWatch agent | Ship logs and metrics from your hosts to Loki/Prometheus. |
| VictoriaMetrics / VictoriaLogs | Prometheus + Loki (leaner) | Far more resource-efficient at scale: fewer/cheaper boxes for the same volume. VictoriaMetrics is largely drop-in for Prometheus; VictoriaLogs has its own query language (LogsQL) rather than Loki's LogQL. |
| Netdata | per-second host metrics | Zero-config, per-second granularity; great for single-host and edge. |
| Uptime Kuma | CloudWatch Synthetics / status | Self-hosted uptime checks + a public status page. |
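
"Indexes labels, not full text" is easiest to see in the shape of a Loki push request: a small set of labels identifies the stream, and the log lines travel as opaque strings. A minimal sketch against Loki's HTTP push endpoint (`/loki/api/v1/push`), standard library only; the label names, sample lines and base URL are hypothetical, and in practice an agent such as Alloy does this for you.

```python
import json
import time
import urllib.request
from typing import Optional


def build_push_payload(labels: dict[str, str], lines: list[str],
                       ts_ns: Optional[int] = None) -> dict:
    """Build a Loki push body: one stream (a label set) plus its log lines.

    Only the labels are indexed; each line is stored as-is next to a
    nanosecond Unix timestamp encoded as a string.
    """
    if ts_ns is None:
        ts_ns = time.time_ns()
    return {
        "streams": [
            {"stream": labels, "values": [[str(ts_ns), line] for line in lines]}
        ]
    }


def push(base_url: str, payload: dict, timeout: float = 5.0) -> int:
    """POST the payload to Loki and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url}/loki/api/v1/push",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical stream: keep labels few and low-cardinality.
payload = build_push_payload(
    {"app": "checkout", "env": "prod"},
    ["order accepted id=1", "order rejected id=2 reason=card_declined"],
)
print(json.dumps(payload, indent=2))
# push("http://localhost:3100", payload)  # assumed address; needs a running Loki
```

Things like request IDs belong in the log line, not in the labels; putting them in labels recreates the cardinality problem you just left behind.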

A common, boring, effective layout: Alloy ships logs+metrics → VictoriaMetrics + VictoriaLogs store them → Grafana displays and alerts. One box for most teams; two for redundancy.

When to keep some CloudWatch (don’t rip it all out)

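I tell clients where CloudWatch genuinely earns its place: the thin slice of AWS-native metrics and alarms, which the migration steps below deliberately leave in place.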

The win is concentrated where the meter runs hardest: high log volume and high-cardinality custom metrics. That’s what to move first.

How the migration actually goes

  1. Stand up the stack on one box (Prometheus/VictoriaMetrics + Grafana + Loki/VictoriaLogs).
  2. Ship telemetry in parallel. Point Alloy/Vector/Fluent Bit at the new stack alongside CloudWatch and dual-ship for a week, so you can compare the two and trust the new one.
  3. Rebuild the dashboards and alerts in Grafana (often clearer than the originals).
  4. Cut over app logs and custom metrics; keep the thin CloudWatch slice for AWS-native alarms/metrics.
  5. Turn down CloudWatch retention on the moved log groups (stored logs are billed by the GB-month too) and watch the next invoice drop.
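
Step 5 is a one-call change per log group. A hedged sketch that takes a CloudWatch Logs client (for example `boto3.client("logs")`) and calls its `put_retention_policy` method; the log group names and the 7-day default are hypothetical, and the list of accepted values should be checked against the current AWS documentation.

```python
# Retention periods CloudWatch Logs accepts, in days (verify against AWS docs).
ALLOWED_RETENTION_DAYS = (
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
)


def shorten_retention(logs_client, log_groups, days: int = 7) -> list:
    """Set the same retention on every log group and return the names changed.

    logs_client is anything with a put_retention_policy(logGroupName=...,
    retentionInDays=...) method, such as a boto3 CloudWatch Logs client.
    """
    if days not in ALLOWED_RETENTION_DAYS:
        raise ValueError(f"{days} is not an accepted retention period")
    changed = []
    for name in log_groups:
        logs_client.put_retention_policy(logGroupName=name, retentionInDays=days)
        changed.append(name)
    return changed


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   shorten_retention(boto3.client("logs"), ["/app/api", "/app/worker"], days=7)
```

Do this only after the cutover in step 4. Shortening retention deletes older events, and CloudWatch was the only copy until the new stack took over.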

The reason this is low-risk: you run both side by side until the self-hosted view is the one you instinctively open during an incident. Then the CloudWatch bill stops being the price of knowing what your system is doing.
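
"Trust" during the side-by-side week can be made concrete by comparing hourly log-line counts from both systems. A rough sketch, assuming you have already exported the counts per hour from each side (for instance with a Logs Insights `stats count(*) by bin(1h)` query and a LogQL `count_over_time` query); the 1% tolerance and the bucket keys are arbitrary choices of mine.

```python
def parity_report(cloudwatch_counts: dict[str, int],
                  self_hosted_counts: dict[str, int],
                  tolerance: float = 0.01) -> list[str]:
    """Return the time buckets where the two pipelines disagree.

    A bucket is flagged when the counts differ by more than `tolerance`,
    measured as a fraction of the larger count. A bucket missing from one
    side counts as zero there.
    """
    mismatched = []
    for bucket in sorted(set(cloudwatch_counts) | set(self_hosted_counts)):
        cw = cloudwatch_counts.get(bucket, 0)
        sh = self_hosted_counts.get(bucket, 0)
        baseline = max(cw, sh)
        if baseline == 0:
            continue  # nothing logged on either side
        if abs(cw - sh) / baseline > tolerance:
            mismatched.append(bucket)
    return mismatched


# Hypothetical hourly counts from the dual-shipping week.
cloudwatch = {"2026-06-01T10": 1000, "2026-06-01T11": 1000, "2026-06-01T12": 500}
self_hosted = {"2026-06-01T10": 998, "2026-06-01T11": 900}

for bucket in parity_report(cloudwatch, self_hosted):
    print(f"check shipping for {bucket}")
```

An empty report for a few days running is a reasonable signal that the new pipeline is not dropping logs. It says nothing about dashboards or alerts, which still need a human looking at both.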


A four-figure CloudWatch bill is exactly the kind of line I pull apart in a Cloud-Exit Assessment — with the real numbers and a target stack sized to your team. Or send me a recent cloud bill and I’ll break out the monitoring spend and estimate your saving in 24 hours, free. Read by me, never shared.