FinOps tools like Vantage/CloudHealth show the storage waste, but engineers still have to fix it manually. How are you handling this?
Hey everyone,
We’ve been told to cut our AWS bill by around 20% this quarter, so we started looking at the usual stuff.
We set up Vantage, also looked at CloudHealth, and they’re pretty good at showing the obvious waste: idle EC2, unattached Elastic IPs, old snapshots, oversized instances, etc.
That part is fine.
The annoying part is EBS.
The tools are flagging terabytes of overprovisioned storage across live stateful workloads. They’re not wrong either. A lot of these volumes are clearly bigger than they need to be.
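To put rough numbers on it, here's the kind of back-of-envelope math I've been doing. It's only a sketch: the 30% headroom, the example volumes, and the $0.08/GB-month price (what I'm assuming for gp3; check your own region and volume type) are all my assumptions, not anything the tools report.

```python
# Assumed price in USD per GB-month (roughly gp3 list price; varies by region/type).
ASSUMED_PRICE_PER_GB_MONTH = 0.08


def right_size_gb(used_gb: int, headroom_pct: int = 30) -> int:
    """Smallest whole-GB size that leaves headroom_pct free space above current usage."""
    # Integer ceiling division avoids float rounding surprises.
    return max(1, -(-used_gb * (100 + headroom_pct) // 100))


def monthly_savings(provisioned_gb: int, used_gb: int,
                    headroom_pct: int = 30,
                    price: float = ASSUMED_PRICE_PER_GB_MONTH) -> float:
    """Estimated monthly saving if the volume were shrunk to its right-sized target."""
    target = right_size_gb(used_gb, headroom_pct)
    reclaimable = max(0, provisioned_gb - target)  # never negative: no saving if already tight
    return reclaimable * price


# Hypothetical volumes: (name, provisioned GB, used GB)
volumes = [("vol-a", 2000, 400), ("vol-b", 1000, 900), ("vol-c", 500, 50)]

for name, provisioned, used in volumes:
    print(name, right_size_gb(used), round(monthly_savings(provisioned, used), 2))
```

The per-volume number is usually small, which is part of why nobody wants to take on the risk for any single volume even though the total adds up.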
But once you ask engineering to actually shrink them, the whole thing gets stuck.
And I get why. EBS lets you grow a volume in place but not shrink it, so the usual process is still basically:
- create a smaller volume
- format/partition it
- snapshot the old volume as a backup, then rsync the data across
- plan a maintenance window
- stop services
- swap mounts
- test everything
- hope nothing breaks
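For anyone who wants to see how many moving parts that is, here's a rough sketch of the steps above as a dry-run runbook generator. It only builds the command strings, it doesn't run anything. The `ShrinkPlan` fields, the ext4 filesystem, the device names, and the service name are all hypothetical placeholders I made up for illustration, so treat it as a starting point, not a tested procedure.

```python
from dataclasses import dataclass


@dataclass
class ShrinkPlan:
    instance_id: str
    az: str
    new_size_gb: int
    attach_device: str  # device name passed to the EC2 API, e.g. /dev/sdf
    os_device: str      # what the OS actually sees (often /dev/nvme*n1 on Nitro instances)
    mount_point: str
    service: str        # systemd unit that writes to the volume


def build_runbook(p: ShrinkPlan) -> list[str]:
    """Return the manual shrink steps as an ordered list of shell commands (dry run)."""
    mount_point = p.mount_point.rstrip("/")
    staging = mount_point + ".new"
    return [
        # 1. create and attach a smaller volume
        f"aws ec2 create-volume --availability-zone {p.az} --size {p.new_size_gb} --volume-type gp3",
        f"aws ec2 attach-volume --volume-id <NEW_VOLUME_ID> --instance-id {p.instance_id} --device {p.attach_device}",
        # 2. format it (assumes ext4) and mount it at a staging path
        f"mkfs.ext4 {p.os_device}",
        f"mkdir -p {staging} && mount {p.os_device} {staging}",
        # 3. first copy while the service is still running
        f"rsync -aHAX {mount_point}/ {staging}/",
        # 4. maintenance window: stop writes, then a final catch-up copy
        f"systemctl stop {p.service}",
        f"rsync -aHAX --delete {mount_point}/ {staging}/",
        # 5. swap mounts (and remember to update /etc/fstab by hand)
        f"umount {staging} && umount {mount_point}",
        f"mount {p.os_device} {mount_point}",
        # 6. bring the service back and test
        f"systemctl start {p.service}",
    ]


plan = ShrinkPlan("i-0123456789abcdef0", "us-east-1a", 520,
                  "/dev/sdf", "/dev/nvme1n1", "/data", "postgresql")
for step in build_runbook(plan):
    print(step)
```

Even in this simplified form it's ten commands per volume, with a downtime window in the middle and no automatic rollback, which is pretty much why the tickets stall.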
So now we have a nice dashboard telling us exactly how much money we’re wasting, but no one really wants to own the risk of fixing it manually.
Is everyone else just accepting this as part of the AWS tax, or have you found a better way to bridge the gap between FinOps visibility and actual remediation?
I’ve seen tools like Datafy trying to handle the block storage side more directly, but I’m still skeptical of anything that touches live storage automatically.
Curious what people here are using in practice.