u/SalamanderFew1357

▲ 8 r/FinOps

We run a mix of AWS and GCP across a few teams and every month there’s some surprise spike from instances or clusters that got scaled up and never came back down.

Right now we rely on basic alerts like CPU thresholds, but that’s too late. By the time something triggers, the cost is already there.Trying to figure out how to catch this earlier, not just after the fact, but at the point where something is being overprovisioned or scaled incorrectly.

we looked at a few tools, but they feel heavy for what we need and don’t really solve the underlying issue.

What’s actually working for you to catch overprovisioning early without constant manual tracking?

reddit.com
u/SalamanderFew1357 — 8 days ago

we made a scaling decision last quarter that looked fine on paper. ran it through the aws cost calculator, felt reasonable. bill came back 40% higher than we projected mostly from data transfer costs between services we didn't model right.

By the time the invoice showed up we already had two other services depending on that setup. Unwinding it would have taken longer than just paying the difference.

Is this just how cloud works or is there a way to get closer to the real number before you deploy anything?

reddit.com
u/SalamanderFew1357 — 20 days ago