Wanted to share an anonymized case study from a recent audit because the breakdown surprised me.
Mid-sized account, multi-region, stated primary region eu-west-1. Scanned all regions, came back with 93 findings totaling $1,299 to $1,350 in monthly waste. The interesting part: 76 of those findings were in us-east-2, not their primary region.
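The sweep itself is the boring part. Here's a minimal boto3 sketch of the enumeration half; `scan_region` is a hypothetical stub standing in for the actual rules:

```python
import boto3

def enabled_regions():
    # EC2's DescribeRegions returns the regions enabled for the account;
    # AllRegions=False skips opted-out regions.
    ec2 = boto3.client("ec2", region_name="us-east-1")
    return [r["RegionName"] for r in ec2.describe_regions(AllRegions=False)["Regions"]]

def scan_region(region):
    # Hypothetical stub: each detection rule below would run here against
    # this region's APIs and return its findings.
    return []

findings = [f for region in enabled_regions() for f in scan_region(region)]
```

The point is just that every rule runs against every enabled region, not only the one the account owner thinks of as "theirs".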
The biggest single finding was an Aurora cluster running on standard storage that should have been on I/O-Optimized.
For anyone unfamiliar: Aurora I/O-Optimized charges ~30% more on the instance-hour rate but drops I/O charges to zero. AWS's own rule of thumb puts breakeven at roughly 25% of the cluster's bill going to I/O. Most production clusters with even moderate write workloads cross that threshold but stay on the default standard config because nobody knows to switch. This one cluster would save $520/mo from flipping the setting.
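To make the breakeven concrete, here's the back-of-the-envelope math in code, using the numbers above (~30% uplift, I/O to zero). It ignores the storage-rate difference between the two configs, so treat it as a rough screen rather than a quote:

```python
def io_optimized_delta(instance_cost, io_cost):
    """Monthly savings from switching to I/O-Optimized (negative = it costs more)."""
    standard = instance_cost + io_cost
    optimized = instance_cost * 1.30  # ~30% instance-rate uplift, zero I/O charges
    return standard - optimized

# Illustrative numbers: $1,000/mo on instances plus $820/mo on I/O
# -> $520/mo saved, the same order as the cluster in this audit.
# Breakeven sits at io_cost = 0.30 * instance_cost, i.e. I/O at ~23%
# of the bill, which lines up with AWS's ~25% rule of thumb.
print(io_optimized_delta(1000, 820))  # 520.0
```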
The rest of the breakdown:

- 3x idle RDS instances (avg CPU < 5%, `DatabaseConnections` < 5, no I/O for 7+ days), $146/mo each, $438/mo total. Detection sketch after the list.
- 3x stopped EC2 instances still attached to EBS volumes for 30+ days, $23/mo each. EBS bills regardless of EC2 state.
- 2x EC2 instances (`GitLab`, `Jenkins` by tag) eligible for Graviton, ~30% savings on the same instance class.
- 24x Lambda functions on `x86` with sustained traffic, eligible for `arm64` (~20% cheaper, runtime supports it). See the second sketch below.
- 12x CloudWatch log groups with retention set to "Never expire" or 90+ days where 7-30 days would suffice.
- A handful of idle NAT Gateways (no traffic for 7+ days), unattached Elastic IPs, and a Route 53 health check on a dead endpoint.
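Since the idle-RDS rule is the one people usually ask about, here's roughly how I'd express it with boto3 against CloudWatch. Thresholds match the list above; the helper names are mine, and production code would need pagination and error handling:

```python
import boto3
from datetime import datetime, timedelta, timezone

def idle_rds_instances(region, days=7):
    rds = boto3.client("rds", region_name=region)
    cw = boto3.client("cloudwatch", region_name=region)
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    idle = []
    for db in rds.describe_db_instances()["DBInstances"]:
        name = db["DBInstanceIdentifier"]

        def peak_daily_avg(metric):
            # Highest daily average over the window, so an instance is
            # only flagged if every single day was quiet.
            points = cw.get_metric_statistics(
                Namespace="AWS/RDS",
                MetricName=metric,
                Dimensions=[{"Name": "DBInstanceIdentifier", "Value": name}],
                StartTime=start,
                EndTime=end,
                Period=86400,
                Statistics=["Average"],
            )["Datapoints"]
            return max((p["Average"] for p in points), default=0.0)

        if (peak_daily_avg("CPUUtilization") < 5
                and peak_daily_avg("DatabaseConnections") < 5
                and peak_daily_avg("ReadIOPS") < 1
                and peak_daily_avg("WriteIOPS") < 1):
            idle.append(name)
    return idle
```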
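And the Lambda `arm64` check, which is mostly a filter over ListFunctions. The runtime allow-list is an illustrative subset, not exhaustive, and the "sustained traffic" half of the rule (CloudWatch `Invocations`) is left out:

```python
import boto3

# Runtimes known to run on Graviton; illustrative subset, not exhaustive.
ARM64_CAPABLE = {"python3.12", "python3.11", "nodejs20.x", "nodejs18.x", "java21"}

def arm64_candidates(region):
    lam = boto3.client("lambda", region_name=region)
    candidates = []
    for page in lam.get_paginator("list_functions").paginate():
        for fn in page["Functions"]:
            archs = fn.get("Architectures", ["x86_64"])  # absent implies x86_64
            if "arm64" not in archs and fn.get("Runtime") in ARM64_CAPABLE:
                candidates.append(fn["FunctionName"])
    return candidates
```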
What this confirmed for me: AWS bills are full of resources nobody owns. The fix is rarely architectural; most of the time it's "switch this setting" or "delete this thing". The hard part isn't the fix, it's finding it.
Happy to break down the detection logic for any specific rule in the comments if anyone's curious.