Cloud & AI — cost + abuse control

Your AI feature can run up an unlimited bill — or be turned against you.

LLM APIs and cloud resources are metered, uncapped, and fail open — to your wallet. One leaked key, one runaway agent loop, or one viral moment can 10× the bill overnight — and an AI feature can be jailbroken, or drained as someone else's free tool. I put hard caps and guardrails on your AI and cloud spend so it fails closed: success can't bankrupt you, and nobody can turn your AI against you. And I tell you, honestly, what to leave alone.

Send me your setup — free look → How I work →

No signup. No sales call. A real engineer — me, not a chatbot or a sales team — reads what you send and replies within a business day.

Latest from the blog

When is on-prem actually cheaper than the cloud? · August 2, 2026
The most-upvoted answer in every 'on-prem vs cloud' thread is also the sharpest: you don't move to the cloud to save money — so for a steady, predictable workload, owning (or renting flat) almost always wins. Here's the honest math, when the cloud still wins, and why the real objection isn't the hardware.
RAG on Postgres — you don't need a vector database · August 1, 2026
Every 'add an AI feature' tutorial tells you to spin up a vector database on step one — a new managed service, a new monthly bill, a new thing to run and secure. For a job the Postgres you already have does just as well. Here's the proof: a tiny open-source demo doing retrieval over my own 50 posts, entirely on Postgres with pgvector — plus the honest line on when you'd actually need a dedicated one.
A self-hosted, read-only archive vs. more Google storage · August 1, 2026
A small law firm kept getting the same nag from Google: your 100 GB is almost full — upgrade or free up space. Instead of paying more every year, we moved the cold half of their Drive and Gmail to a read-only archive on a disk in their own office, deleted the old stuff from Google, and stopped the creep. Here's the whole build — including the file-server migration detail that nearly forced a password reset on everyone, and the bcrypt trick that saved it.
How I'd audit your AI feature in an afternoon · July 31, 2026
The whole series was the playbook; this is how I run it against a real feature in a few hours. Seven questions asked most-dangerous-first — can I see the meter, is there a ceiling, is it on the request path, where's the key, who's calling, what can the model do, what happens when it trips — each with the red flag that means fail-open. The output is a one-page map, not a 40-page report.
Your AI agent has a goal, not a budget · July 31, 2026
Agents are being handed credentials, and the whole cost conversation is still about tokens. But an agent optimizing for an answer will scan the warehouse, re-read the bucket, and start the job again — every one of those is a meter your token cap can't see. The reasoning is billed in tokens; the consequences are billed in AWS. The brake doesn't go in the prompt. It goes in the credential.
A fail-closed AI gateway, live: stopping a runaway agent · July 30, 2026
Two weeks of posts described one thing: a gateway that meters every AI call, caps the spend, and cuts a runaway agent mid-run. Talk is cheap, so I built the minimal version in the open and put it in front of a deliberately broken agent. Watch it stop the exact 'looped-all-weekend' bill in three failed calls — while a legitimate long task next to it runs untouched. Live demo, open-source code, honest scope.

All posts →

What I do: cap it, guard it, own it

Cap it. Hard spend ceilings, per-user quotas, and a kill switch on your AI and cloud — enforced on the request path, where a runaway meter can actually be stopped, not on the invoice where you just read about it.
Guard it. Stop prompt injection, jailbreaks, and free-LLM abuse before your AI feature turns into a surprise bill or a legal liability.
Own it. Keys vaulted, endpoints authed, anomalies alerted — the security discipline your AI feature probably shipped without.
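
To make "enforced on the request path" concrete, here is a minimal Python sketch of a fail-closed spend gate. Everything in it is hypothetical: the names (SpendGate, authorize, record, BudgetExceeded) and the dollar figures are illustrative, and state lives in memory. A real gateway would need shared, durable counters and a provider-specific cost estimate.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a call is refused; the caller must not reach the provider."""


@dataclass
class SpendGate:
    global_cap_usd: float            # hard ceiling for the whole feature
    per_user_cap_usd: float          # quota for any single user
    killed: bool = False             # manual kill switch
    total_usd: float = 0.0           # spend recorded so far
    by_user: dict = field(default_factory=dict)

    def authorize(self, user_id: str, estimated_usd: float) -> None:
        """Call BEFORE the provider request; raises unless every limit holds."""
        if self.killed:
            raise BudgetExceeded("kill switch is on")
        # Fail closed: a missing, negative, or NaN estimate is refused.
        if not isinstance(estimated_usd, (int, float)) or not estimated_usd >= 0:
            raise BudgetExceeded("no usable cost estimate")
        if self.total_usd + estimated_usd > self.global_cap_usd:
            raise BudgetExceeded("global spend cap reached")
        if self.by_user.get(user_id, 0.0) + estimated_usd > self.per_user_cap_usd:
            raise BudgetExceeded(f"quota reached for user {user_id}")

    def record(self, user_id: str, actual_usd: float) -> None:
        """Call AFTER the provider responds, with the metered cost."""
        self.total_usd += actual_usd
        self.by_user[user_id] = self.by_user.get(user_id, 0.0) + actual_usd


gate = SpendGate(global_cap_usd=50.0, per_user_cap_usd=5.0)
gate.authorize("alice", 0.40)   # passes: well under both limits
gate.record("alice", 0.40)
```

The design choice that matters is the default: any doubt (no estimate, switch flipped, cap reached) refuses the call instead of letting it through.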

Same job, two surfaces. Cloud cost is where this started: it's the proven, productized side you'll see below. AI is where the meter is now most dangerous: it runs faster, and it can be abused as well as overspent. I do both, because they're one problem: a meter with no ceiling.

The cloud side is proven — and the numbers are public.

$0.09/GB · AWS egress price, unchanged since 2018, while bandwidth costs fell over 50%.

~$53k/yr · What 50 TB/month of egress costs on AWS, in transfer fees alone.

$70k → ~$0 · Plex's monthly egress bill after moving traffic off the hyperscaler.

~$2M/yr · What 37signals (Basecamp/HEY) reported saving by leaving the cloud.

The figures above are AWS, but this isn't an AWS problem: Azure (~$0.087/GB) and Google Cloud (~$0.12/GB) price egress in the same band — they move together because it's a moat, not a cost. Sources are cited in your assessment. The pattern is always the same: the savings grow as you grow, because a server you rent flat doesn't charge you per gigabyte.
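
For a sense of where a figure like ~$53k/yr comes from, here is a rough Python sketch. It assumes AWS's tiered internet egress rates as commonly published ($0.09/GB for the first 10 TB a month, $0.085/GB for the next 40 TB, $0.07/GB for the next 100 TB, $0.05/GB beyond that). The tier table, the 1 TB = 1,024 GB convention, and the function name are assumptions, not taken from this page, and the small free allowance is ignored. Check the current price list before relying on it.

```python
# Assumed tier table: (tier size in GB, USD per GB), applied in order.
TIERS = [
    (10 * 1024, 0.09),
    (40 * 1024, 0.085),
    (100 * 1024, 0.07),
    (float("inf"), 0.05),
]


def monthly_egress_usd(gb: float) -> float:
    """Price one month of egress by walking the tiers from cheapest volume up."""
    cost, remaining = 0.0, gb
    for size, rate in TIERS:
        used = min(remaining, size)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost


monthly = monthly_egress_usd(50 * 1024)  # 50 TB in one month
print(f"${monthly:,.0f}/month, ${monthly * 12:,.0f}/year")
```

Under those assumed rates, 50 TB a month comes to about $4,400 a month, or roughly $52.8k a year, in line with the ~$53k above.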

To be clear, "leaving" doesn't mean leaving datacenters. You move to the same tier-3 datacenters the big providers use — rented at a flat monthly rate (OVH, Hetzner) or hardware you own. What you leave behind is the per-gigabyte meter and the lock-in, not the reliability.

Not another dashboard. The person who turns the meter off.

Every cost tool on the market sells you visibility — a dashboard that tells you, in a lovely chart, that you're bleeding. None of them stop the bleeding: they don't shut down the idle box, enforce the tags, carry out the migration, or cap a runaway process. Visibility isn't remediation — the bill only drops when someone with the time and the authority executes the change. That's what I am: not a report, an outcome. I find the leak, do the fix, and keep it boring month after month.

And it's the same job on a second, hotter surface now — AI spend. An LLM API is a meter with no ceiling, plus a failure mode cloud never had: abuse. A leaked key or a runaway agent loop can 10× the bill overnight, and the only control that catches it in time lives on the request path, not the invoice. Cloud or AI, it's one problem — a meter with no ceiling — and I put the ceiling back. See the playbook I'm building in the open →

What you actually buy: a bill that stays lean

Most "cut your cloud costs" help is a one-time report. But the meter never stops, and a bill you fix once drifts right back up: traffic grows, someone ships a feature that re-introduces egress, a reserved commitment lapses, a new managed service quietly turns on. So the honest product isn't a document — it's an ongoing job.

Managed Cloud-Exit is that job, done for you. I move the parts of your stack that bleed off the hyperscaler, run them, and keep optimizing the bill every month — and each month you get a plain report of exactly what I saved you against your starting bill.

$2,000/mo base + 20% of the savings I can prove

You always keep the other 80% of what I save you, and the base is a fraction of the platform engineer you'd otherwise hire. I earn more only when your bill gets smaller — never when you use more. See the full plan, the math, and the on-ramp →
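
The arithmetic behind that pricing, as a small Python sketch. The $2,000 base and the 20% share come from the plan above; the function name and the example savings figures are hypothetical.

```python
BASE_USD = 2_000   # monthly base, from the plan above
SHARE = 0.20       # share of proven savings, from the plan above


def monthly_outcome(proven_savings_usd: float) -> tuple[float, float]:
    """Return (fee, what the client keeps after the fee) for one month."""
    fee = BASE_USD + SHARE * max(proven_savings_usd, 0.0)
    return fee, proven_savings_usd - fee


for savings in (2_000, 2_500, 10_000):   # illustrative monthly savings
    fee, net = monthly_outcome(savings)
    print(f"saved ${savings:,}: fee ${fee:,.0f}, net ${net:,.0f}")
```

With these numbers, the plan pays for itself once proven savings pass $2,500 a month (the point where 80% of the savings equals the base). At $10,000 a month saved, the fee is $4,000 and $6,000 stays with you.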

You don't start here, and you don't pay to find out if it's worth it — start with the free look below.

On the AI side, the work is newer and scoped to your setup rather than a fixed plan: the gateway, the hard caps and per-user quotas, the abuse guardrails, and the keys locked down. Same starting point — the free look — and I tell you exactly where you're fail-open before anyone commits to anything. (I'm building this playbook in the open, one lesson at a time: the series →.)

How you get there

1 · Free bill teardown (24 hours). Send a recent invoice; I send back a one-page estimate of your flat-rate cost and likely annual savings. No obligation.
2 · Cloud-Exit Assessment ($5,000, 7 working days). A line-by-line teardown, a target design with real monthly numbers, a 5-year total cost that includes running it, and an honest "do NOT move this." If you go on to the managed plan, this $5,000 is credited back against your first months.
3 · Done-for-you migration (fixed price, scoped from the assessment). I re-platform the application, move the database safely, and stand up the new setup — then hand you working, documented, fully-owned infrastructure.
4 · Managed Cloud-Exit (ongoing). I run it and keep the bill lean, for the base plus a share of proven savings. Cancel anytime; everything is standard open-source software you own, with no lock-in to me.

I'll tell you when not to move.

Most "leave the cloud" pitches sell you a migration no matter what. I don't. If your bill is spiky, mostly idle, or dominated by managed services rather than traffic, moving it can cost more once you count the engineering time. The assessment exists to find that out before you spend a cent on a migration — and I won't put you on a managed plan that doesn't save you more than it costs.

Honest math beats a bigger invoice. "Zero ongoing cost" is a myth — someone always runs the servers. That's the job I'm pricing. The savings share is measured against your starting bill, dated and sourced in your monthly report, so you can always see you're ahead. Often you are. Sometimes you wouldn't be, and I'll say so.

What's the catch?

Two fair fears, answered straight — because the honest answer is what makes the savings believable.

"Will it be less reliable?" Not with the right design. The alternative isn't a server in a closet — it's a tier-3 datacenter box (OVH/Hetzner) on the same kind of redundant power and network AWS runs on. 99.9% is very achievable, and on DDoS you're often stronger: OVH and Hetzner bundle multi-terabit scrubbing free that AWS charges $3,000/month for. AWS multi-AZ is genuinely excellent at the last "nine" — so I quote you the honest uptime number for your design, not a slogan. Reliability, honestly →
"Isn't running it a hassle?" Not for you — that's the whole point of the managed plan. I run it: monitoring, patching, backups, on-call, and the monthly hunt for new savings. Under the hood the win is killing the meter, not running everything by hand — flat-rate managed services (managed Postgres, zero-egress storage) keep the convenience without the per-gigabyte bill. And if the savings ever don't survive the cost of running it, I tell you to stay.

Who this is for

A good fit

You shipped an AI or LLM feature fast — and you know its spend isn't capped and its abuse surface isn't guarded
You serve real traffic: media, downloads, streaming, AI inference, busy SaaS
Your AWS/GCP/Azure bill is over ~$8k/month and climbing with usage
Steady, predictable load — not unpredictable spikes
You'd rather not hire a full-time platform engineer to babysit infrastructure
You need to sit outside US jurisdiction: the US CLOUD Act can reach data held by AWS/Azure/GCP even in their EU regions, while an EU provider isn't subject to it (AWS already offers plain EU data residency; this is the narrower legal point)

Not a fit (yet)

Pre-product startups with unknown, spiky growth
Mostly-idle workloads that genuinely benefit from scale-to-zero (paying nothing while idle)
Bills dominated by managed services, not traffic — little to arbitrage
Teams that want the savings but won't let anyone run or maintain the new setup

Free: send me your setup, get a straight answer in 24 hours

The fastest way to know if this is worth your time. Forward a recent cloud invoice or Cost Explorer export — or just describe your AI setup (which models, where the keys live, what's capped) — and I send back a one-page read: where the meter can run away, where you're fail-open or abusable, and the highest-leverage fix. On the cloud side that includes your likely flat-rate cost with the egress line broken out. No obligation, no sales call.

Every message comes straight to me — I read and reply to each one myself, usually within a day, and what readers send shapes what I build next. It's just me for now, so that's genuinely true; it won't be forever.

Prefer email? Send it straight to ami@smallestbusiness.com. Read by me, never shared.

Why you work directly with one engineer

The person who reads your bill is the same person who designs the migration, carries it out, and runs the servers afterwards — month after month. No account managers, no handoffs, no junior doing the work a brochure promised a senior would. I'm not a reseller and not an affiliate — I make money when your infrastructure gets cheaper and stays boring, not when you buy more of something. The base keeps the lights on; the savings share means my upside only grows when yours does.

And I don't just advise — I build. Most people who cut cloud bills are cost analysts: they hand you a report, and you still need an engineer to do the move. I've spent more than 20 years building the software itself — databases, web backends, and the security-sensitive parts like logins, key management, and rate limiting — so I can plan the migration and carry it out, including your database, the part most teams are afraid to touch. That same base — caps, auth, key handling, fail-closed systems — is exactly what putting a ceiling and guardrails on an AI feature takes, which is why the AI work is an extension of what I already do, not a pivot away from it.

Relying on one engineer raises a fair question: what happens if I'm unavailable? I answer it head-on: everything I build is standard open-source software you (or anyone) can run, fully documented, and yours, with no lock-in to me, and the managed plan is cancel-anytime. Here's how I handle that, why one person is an advantage, and where I'm honest about the big dependencies (including Cloudflare) →