Cloud cost optimization vs. leaving: where each actually wins

TL;DR. There’s a large, fast-growing industry — “cloud cost optimization,” or FinOps (the discipline of managing cloud spend) — whose whole job is to cut your AWS, Azure or Google bill without moving you off it. It works: tools like ProsperOps, Vantage, CloudHealth, Spot, Cast AI and consultancies like the Duckbill Group routinely shave 20–40% off a bill through commitment discounts and waste cleanup. But there’s a structural ceiling: every one of these levers discounts the meter — it doesn’t remove it. When your bill is dominated by egress and steady, always-on compute, no amount of optimization changes the underlying math, and that’s when leaving wins. The honest answer for most teams: optimize first (a lot of it is free money), then leave the workloads where optimization runs out of road.

Two opposite ways to cut a cloud bill

When your AWS bill gets scary, you’ll be pitched two opposite solutions:

Optimize and stay — keep everything on AWS, but pay less for it. This is the FinOps / cost-optimization industry.
Leave (“cloud repatriation”) — move the expensive parts onto flat-rate or owned infrastructure.

They’re usually framed as enemies. They’re not — they’re steps in order. This page is the honest map of what the first one can and can’t do, so you know when it’s enough and when it isn’t.

What “optimize and stay” actually is (a real, big industry)

This isn’t a cottage trade. The cloud cost optimization / FinOps market was roughly $13–15 billion in 2024–25 and is growing ~11–13% a year (Grand View Research; MarketsandMarkets). It exists because cloud waste is enormous — the FinOps Foundation’s State of FinOps 2026 survey (1,192 organizations, $83 billion in combined cloud spend) still ranks cutting waste as the #1 priority.

It pulls two distinct levers:

Rate optimization: pay a lower price for the same usage, mainly through Reserved Instances and Savings Plans (commit to one or three years of usage in exchange for a discounted rate). Specialist tools automate this: ProsperOps, Zesty, Spot by NetApp, Usage.ai, Archera.
Usage optimization: use less, through rightsizing (matching server size to actual need), turning off idle resources, autoscaling and scheduling, spot instances, and storage tiering. Tools such as Cast AI, Densify and nOps do this, as do the visibility dashboards and the consultancies.
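A rough way to see how the two levers interact: a bill is approximately usage multiplied by rate, so a rate discount and a usage cut multiply rather than add. The sketch below is illustrative only. The function name and the 25% / 20% figures are assumptions, not numbers from any vendor.

```python
def optimized_bill(baseline: float, rate_discount: float, usage_reduction: float) -> float:
    """Monthly bill after applying both levers.

    baseline        -- current monthly bill in dollars
    rate_discount   -- fraction taken off the unit price (rate optimization)
    usage_reduction -- fraction of usage eliminated (usage optimization)
    """
    for name, value in (("rate_discount", rate_discount),
                        ("usage_reduction", usage_reduction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return baseline * (1.0 - rate_discount) * (1.0 - usage_reduction)


# Hypothetical example: $100k/month, 25% rate discount, 20% less usage.
bill = optimized_bill(100_000, rate_discount=0.25, usage_reduction=0.20)
total_saving = 1.0 - bill / 100_000
print(f"new bill: ${bill:,.0f}, total saving: {total_saving:.0%}")
```

With these made-up inputs the combined saving is 40%, not the 45% you would get by adding the two percentages, because the discount applies only to the usage that remains.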

Who does what:

Type	What they sell	Examples
Visibility dashboards	Show where the money goes	Vantage, CloudHealth, Cloudability, CloudZero
Savings / commitment managers	Auto-buy and manage your discounts	ProsperOps, Zesty, Spot, nOps
Managed service providers (MSPs)	Run your cloud and optimize it for you	Rackspace, Bespin Global, NTT Data
Boutique consultancies	Expert audits and retainers	The Duckbill Group

They’re good at this. If you’ve never bought a Reserved Instance, they’ll find you real money — only about half of companies use AWS Reserved Instances at all, and fewer use them well (Flexera; NetApp). Do this first. It’s often the cheapest 20–30% you’ll ever save.

How they charge — and the incentive to watch

The pricing model tells you whose side a vendor is on. Four common ones:

Pricing model	Example	What it does to incentives
A cut of the savings (gainshare)	ProsperOps takes ~20–35% of the savings it produces	✅ Aligned — they only earn when you save. But it’s a permanent tax on the savings, capped at what discounts can do.
Flat monthly software fee	Most dashboards	➖ Neutral — you pay the same whether they save you a lot or a little.
A percentage of your cloud spend	Some MSPs	⚠️ Backwards — the vendor earns more when your bill is bigger. Read these contracts closely.
Fixed fee / retainer	The Duckbill Group (and my own assessment)	✅ Predictable, no conflict — you know the cost up front.

The one to watch is percentage-of-spend: a provider paid a slice of your bill has a quiet reason not to shrink it too far. Gainshare is honest, but remember it’s a recurring cut of the savings. Paying someone 30% of your discount year after year can, within a few years, cost more than fixing the problem once.
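To make the incentive arithmetic concrete, here is a minimal sketch. All inputs are hypothetical (the $20k/month savings, the $60k one-off fee and the 5% of spend rate are assumptions for illustration), and it assumes a one-time engagement would capture the same savings as the gainshare tool, which may not hold in practice.

```python
def gainshare_fees(monthly_savings: float, share: float, months: int) -> float:
    """Total paid to a gainshare vendor that keeps `share` of realized savings."""
    return monthly_savings * share * months


def percent_of_spend_fee(monthly_bill: float, fee_rate: float) -> float:
    """Monthly fee for a provider paid a percentage of the cloud bill."""
    return monthly_bill * fee_rate


# Hypothetical inputs: $20k/month of savings, 30% share, 3 years.
total_gainshare = gainshare_fees(20_000, share=0.30, months=36)
one_time_fix = 60_000  # assumed flat fee for a one-off engagement

print(f"gainshare over 3 years: ${total_gainshare:,.0f}")
print(f"one-time fixed fee:     ${one_time_fix:,.0f}")

# Percentage-of-spend: the provider's own fee falls when your bill falls.
fee_before = percent_of_spend_fee(100_000, fee_rate=0.05)
fee_after = percent_of_spend_fee(70_000, fee_rate=0.05)
print(f"provider fee before/after a 30% cut: ${fee_before:,.0f} / ${fee_after:,.0f}")
```

The second half shows why percentage-of-spend points the wrong way: every dollar the provider saves you reduces its own income.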

The ceiling: optimization discounts the meter — it can’t remove it

Here’s the part the optimization industry doesn’t put on its homepage, in its own governing body’s words. The FinOps Foundation describes commitment discounts (Reserved Instances, Savings Plans) as “coupons”: they lower the rate applied to your metered usage. They do not change the fact that you’re metered, and they don’t give you ownership of anything. (Savings Plans don’t reserve capacity at all; among EC2 Reserved Instances, only zonal ones do.)

So the whole optimize-and-stay model, by design, cannot:

Escape egress — the per-gigabyte fee to send your own data out to your users (see S3 egress cost). There is no Reserved Instance for egress. A coupon on metered traffic is still metered traffic.
Escape the provider’s margin — you’re still renting at AWS / Azure / Google list prices minus a discount. The discount is real; the markup underneath it is bigger.
Escape lock-in — the proprietary managed services and the egress “moat” that make leaving expensive are exactly as sticky after optimization as before.
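The ceiling can be put in numbers. The sketch below splits a bill into compute (eligible for commitment discounts), egress (assumed not discountable, per the first point above) and everything else, then computes the best case with 100% commitment coverage. The `Bill` type, the 40% discount and the dollar amounts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bill:
    compute: float  # monthly compute spend, eligible for commitment discounts
    egress: float   # monthly data-transfer-out spend, assumed not discountable
    other: float    # everything else, assumed unchanged in this sketch


def after_commitment_discount(bill: Bill, discount: float, coverage: float) -> float:
    """Total bill when `coverage` of compute gets `discount` off its rate."""
    discounted_compute = bill.compute * (1.0 - discount * coverage)
    return discounted_compute + bill.egress + bill.other


def max_saving_fraction(bill: Bill, discount: float) -> float:
    """Upper bound on total-bill savings, assuming 100% commitment coverage."""
    total = bill.compute + bill.egress + bill.other
    return (total - after_commitment_discount(bill, discount, coverage=1.0)) / total


# Hypothetical egress-heavy bill: $30k compute, $60k egress, $10k other.
egress_heavy = Bill(compute=30_000, egress=60_000, other=10_000)
print(f"best-case saving: {max_saving_fraction(egress_heavy, discount=0.40):.0%}")
```

In this made-up case even a 40% discount on every compute dollar trims the total by only 12%, because most of the bill sits on a line the coupon never touches.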

And the easy wins are running out. That same State of FinOps 2026 survey reports practitioners have “hit the big rocks of waste” — diminishing returns, with the savings that remain smaller and more expensive to chase. Optimization is a one-time step down, not a gentler slope. Your bill still grows with your success; it just grows from a slightly lower starting point.
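The "step down, not a gentler slope" point can be sketched with a simple compounding model. The 3% monthly growth and the one-time 30% cut are assumptions chosen for illustration, and real bills rarely grow this smoothly.

```python
def project_bill(start: float, monthly_growth: float, months: int) -> list[float]:
    """Bill for each month, compounding at a constant usage growth rate."""
    return [start * (1.0 + monthly_growth) ** m for m in range(months + 1)]


# Hypothetical: $100k/month bill, 3% monthly growth, one-time 30% optimization.
unoptimized = project_bill(100_000, monthly_growth=0.03, months=24)
optimized = project_bill(100_000 * (1 - 0.30), monthly_growth=0.03, months=24)

# The ratio between the two curves is the same in every month: the discount
# shifts the starting point but leaves the growth rate untouched.
ratios = [o / u for o, u in zip(optimized, unoptimized)]
print(f"month 0 ratio:  {ratios[0]:.2f}")
print(f"month 24 ratio: {ratios[-1]:.2f}")

# First month in which the optimized bill is back at or above the original $100k.
months_to_catch_up = next(m for m, b in enumerate(optimized) if b >= 100_000)
print(f"back at $100k after {months_to_catch_up} months")
```

Under these assumptions the optimized bill is always the same fraction of the unoptimized one, and it passes the original figure again in a little over a year.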

When optimizing and staying is genuinely the right call

This is not “AWS bad, leave always.” For plenty of workloads, staying and optimizing is correct — and a good advisor will tell you so:

Spiky or seasonal load — if you’re idle half the time and spike hard the rest, the cloud’s pay-for-what-you-use model (plus autoscaling and spot) really is cheaper than renting for your peak 24/7.
Heavy use of managed services — if your bill is mostly high-value managed services rather than raw traffic and compute, replacing them off-cloud can cost more in engineering time than you’d save.
You haven’t done the basics yet — if you’ve never bought a Savings Plan, cleaned up idle resources, or put a cache in front of your traffic, do that first. It often closes the gap and the question of leaving disappears. Even repatriation-friendly analysts say it plainly: “rightsizing, reservations and architecture optimization will often close the gap.” (CIO.com)
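For the spiky-load case in the list above, the comparison comes down to a break-even utilization: below it, paying only for the hours you use is cheaper; above it, renting for your peak around the clock wins. A minimal sketch, with hypothetical prices ($0.20/hour metered, $60/month flat per server) that are not quotes from any provider:

```python
HOURS_PER_MONTH = 730  # average hours in a month


def pay_per_use_cost(peak_servers: int, hourly_rate: float, utilization: float) -> float:
    """Monthly metered cost when only `utilization` of peak server-hours actually run."""
    return peak_servers * hourly_rate * HOURS_PER_MONTH * utilization


def flat_rate_cost(peak_servers: int, monthly_rate: float) -> float:
    """Monthly cost of renting enough flat-rate servers to cover the peak 24/7."""
    return peak_servers * monthly_rate


def break_even_utilization(hourly_rate: float, monthly_rate: float) -> float:
    """Utilization above which flat-rate becomes cheaper than pay-per-use."""
    return monthly_rate / (hourly_rate * HOURS_PER_MONTH)


# Hypothetical prices: $0.20/hour metered vs $60/month flat per server.
threshold = break_even_utilization(hourly_rate=0.20, monthly_rate=60.0)
print(f"break-even utilization: {threshold:.0%}")

# A workload that needs 10 servers at peak but averages 25% utilization.
print(f"metered: ${pay_per_use_cost(10, 0.20, 0.25):,.0f}")
print(f"flat:    ${flat_rate_cost(10, 60.0):,.0f}")
```

The model ignores autoscaling lag, spot pricing and commitment discounts, all of which would move the threshold; it is meant only to show the shape of the trade-off.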

If optimization gets your bill where it needs to be, you’re done — don’t move anything. That’s the honest answer, and it’s the one I give in an assessment whenever it’s true.

When leaving wins

Leaving beats optimizing in one specific case: a steady, always-on workload whose bill is dominated by egress and 24/7 compute. Steady workloads get no benefit from the cloud’s pay-as-you-go premium — you’re paying a rental markup for elasticity you don’t use. Optimization discounts that markup; it can’t delete it.

The clearest public example is 37signals (Basecamp / HEY), who left AWS and cut infrastructure spend from about $3.2 million to under $1 million a year — including replacing a ~$1.5M/year S3 bill with 18 petabytes of their own storage hardware, and saving ~$2M/year on compute after buying ~$700k of servers. (Honest caveat: these are the founder’s self-reported figures, comparing cloud bills to their on-prem operating cost; a full accounting must also count hardware refresh, added ops staff, and power/cooling. Treat it as a disclosed real-world case, not an audited universal number.)
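The arithmetic behind those figures, as a quick sketch. It uses only the self-reported numbers quoted above and deliberately leaves out the costs named in the caveat, so treat the payback result as a best case rather than a full accounting.

```python
# Figures as quoted above (self-reported by 37signals; approximate).
cloud_spend_before = 3_200_000     # dollars per year on cloud
spend_after_ceiling = 1_000_000    # "under $1 million", so treated as an upper bound
server_purchase = 700_000          # one-time hardware outlay
compute_saving_per_year = 2_000_000

# Lower bound on the reduction, since the post-exit spend is "under" $1M.
min_reduction = (cloud_spend_before - spend_after_ceiling) / cloud_spend_before

# Simple payback: months of compute savings needed to cover the servers.
# Ignores hardware refresh, added staff, power and cooling.
payback_months = server_purchase / compute_saving_per_year * 12

print(f"reduction of at least {min_reduction:.1%}")
print(f"hardware payback in about {payback_months:.1f} months")
```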

The pattern: when the meter — not the waste — is the problem, the only fix is to stop being metered.

The honest decision

Your situation	The right move
Never bought Savings Plans; idle resources everywhere; no CDN	Optimize first — free money on the table
Spiky / seasonal load, or a bill that’s mostly managed services	Optimize and stay — the cloud’s model fits you
Already optimized, but the bill is steady and dominated by egress + 24/7 compute	Leave that part — optimization has run out of road
Not sure which of these is actually you	Get the numbers — that’s what a teardown is for
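The table above can be read as a small decision procedure. The sketch below encodes it in the same order; the `Workload` fields and the 50% cut-off for "dominated by" are assumptions introduced here, not thresholds from the sources.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    basics_done: bool               # Savings Plans bought, idle cleaned up, CDN/cache in place
    spiky_or_seasonal: bool         # load varies a lot over time
    mostly_managed_services: bool   # bill is mainly managed services
    egress_and_steady_share: float  # fraction of bill from egress + 24/7 compute


def recommend(w: Workload, dominance_threshold: float = 0.5) -> str:
    """Map a workload profile to one of the moves in the decision table.

    `dominance_threshold` is an assumed cut-off for "dominated by".
    """
    if not w.basics_done:
        return "optimize first"
    if w.spiky_or_seasonal or w.mostly_managed_services:
        return "optimize and stay"
    if w.egress_and_steady_share >= dominance_threshold:
        return "leave that part"
    return "get the numbers"


# Hypothetical profile: already optimized, steady, 70% egress + 24/7 compute.
steady_core = Workload(basics_done=True, spiky_or_seasonal=False,
                       mostly_managed_services=False, egress_and_steady_share=0.7)
print(recommend(steady_core))
```

The ordering matters: the "basics" check comes first because, as argued above, the question of leaving often disappears once the basics are done.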

Optimize-and-stay and leaving aren’t rivals; they’re a sequence. Cut the waste, buy the discounts, put a cache up front — and then, if a steady, traffic-heavy core is still bleeding at metered rates, move that core onto flat-rate or owned infrastructure. Most teams need both, in that order.

Quick wins you can do this week

Check your commitment coverage. If you’re not using Reserved Instances / Savings Plans on steady compute, that’s the fastest discount there is. (Or let a gainshare tool do it — just know it takes a permanent cut.)
Find your idle and oversized resources. Turn off what’s unused; rightsize what’s overbuilt.
Put a cache in front of your traffic to cut egress before you do anything structural.
Then break out the two lines optimization can’t fix — egress and steady 24/7 compute — and price them on flat-rate infrastructure. If those two are most of your bill, optimization won’t be enough.
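The first and last checks in that list reduce to two ratios you can compute from a cost export. A minimal sketch, assuming you have already grouped your bill into a few named lines by hand; the line names and dollar amounts are hypothetical, not fields from any billing API.

```python
def commitment_coverage(covered_spend: float, eligible_spend: float) -> float:
    """Fraction of commitment-eligible compute spend already covered by RIs / Savings Plans."""
    if eligible_spend <= 0:
        return 0.0
    return covered_spend / eligible_spend


def metered_core_share(bill: dict[str, float],
                       core_lines: tuple[str, ...] = ("egress", "steady_compute")) -> float:
    """Share of the total bill that comes from the lines named in `core_lines`."""
    total = sum(bill.values())
    if total <= 0:
        return 0.0
    return sum(bill.get(line, 0.0) for line in core_lines) / total


# Hypothetical monthly bill, hand-grouped from a cost export.
bill = {
    "egress": 40_000,
    "steady_compute": 35_000,
    "bursty_compute": 10_000,
    "managed_services": 15_000,
}

# Assume $20k of the $45k in compute is currently covered by commitments.
print(f"commitment coverage:     {commitment_coverage(20_000, 45_000):.0%}")
print(f"egress + steady compute: {metered_core_share(bill):.0%}")
```

Low coverage suggests there is still discount to collect; a high egress-plus-steady-compute share suggests the remaining problem is the meter itself.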

Not sure whether to optimize and stay or move the bleeding parts off? Send me your cloud bill and, within 24 hours and at no charge, I’ll show you which lines a FinOps tool can fix and which ones only leaving will. Read by me, never shared. Or see the Cloud-Exit Assessment for the full decision with real numbers.

Sources

FinOps Foundation — Rate Optimization framework (commitment discounts as “coupons” that don’t provision capacity), and State of FinOps 2026 (1,192 orgs, $83B combined spend; waste reduction the #1 priority but with diminishing returns) — finops.org, data.finops.org.
AWS Savings Plans FAQ — “does not provide a capacity reservation”; a pricing-model discount in exchange for a usage commitment — aws.amazon.com/savingsplans/faq.
Market size — Cloud FinOps market ~$13.4B (2024) → ~$32.5B (2033) at ~11% CAGR (Grand View Research); ~$14.9B (2025) → ~$26.9B (2030) at 12.6% CAGR (MarketsandMarkets). Commercial estimates — directional, not exact.
Commitment under-use — roughly half of organizations don’t use AWS Reserved Instances; only ~19% report “making the most” of discount options (Flexera State of the Cloud; NetApp State of CloudOps).
ProsperOps pricing — the “Savings Share,” a percentage of realized savings, explicitly not of spend — prosperops.com/pricing.
Optimize-first quote and workload fit: “rightsizing, reservations and architecture optimization will often close the gap”; repatriation suits steady workloads, optimization suits elastic / managed-service-heavy ones (CIO.com, 2025).
37signals figures — annual infrastructure ~$3.2M → under $1M; a ~$1.5M/year S3 bill replaced by 18 PB of Pure Storage; ~$2M/year compute saving on ~$700k of Dell hardware — self-reported (DHH), via The Register (May 2025).