u/Deliaenchanting

How are you balancing resilience vs cost in k8s on aws without the bill getting out of control?

Running a Kubernetes setup on AWS because someone decided cloud native also means bills higher than our dev salaries. The constant tradeoff: make it resilient enough to survive failures, or keep costs low enough that finance doesn't start asking questions.

Spot instances save a lot but disappear right when you need them. Multi-AZ works until you see the bill, and suddenly everyone is fine with a bit less redundancy. Autoscaling sounds good until it's either overprovisioned or you're dealing with OOMKills at 3am. I tried reserved instances, got locked in, and regretted it when traffic shifted. Savings plans feel like guessing the future. Managed services help with ops, but you pay for it, and running everything yourself isn't exactly free once you factor in time.
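
To make the spot tradeoff concrete, here is a rough sketch of the arithmetic involved. It is only an illustration: the node count, hourly price, spot discount, and the function names are all made-up placeholders, not real AWS pricing or any tool's API.

```python
# Rough monthly cost for a node pool split between on-demand and spot.
# All numbers below are hypothetical placeholders, not real AWS prices.

HOURS_PER_MONTH = 730


def blended_monthly_cost(nodes, spot_fraction, on_demand_hourly, spot_discount):
    """Monthly cost of `nodes` nodes when `spot_fraction` of them run on spot.

    spot_discount is the fraction saved versus on-demand (0.7 means 70% cheaper).
    """
    spot_nodes = nodes * spot_fraction
    on_demand_nodes = nodes - spot_nodes
    spot_hourly = on_demand_hourly * (1 - spot_discount)
    hourly_total = on_demand_nodes * on_demand_hourly + spot_nodes * spot_hourly
    return HOURS_PER_MONTH * hourly_total


def capacity_after_spot_loss(nodes, spot_fraction, reclaimed_fraction):
    """Nodes still running if `reclaimed_fraction` of the spot nodes vanish at once."""
    spot_nodes = nodes * spot_fraction
    return nodes - spot_nodes * reclaimed_fraction


if __name__ == "__main__":
    # Compare a few spot ratios for an assumed 20-node pool.
    for frac in (0.0, 0.5, 0.8):
        cost = blended_monthly_cost(
            nodes=20, spot_fraction=frac, on_demand_hourly=0.10, spot_discount=0.7
        )
        left = capacity_after_spot_loss(
            nodes=20, spot_fraction=frac, reclaimed_fraction=1.0
        )
        print(f"spot={frac:.0%}  cost=${cost:,.0f}/mo  worst-case nodes left={left:.0f}")
```

Plugging in real prices and a realistic reclaim rate shows how much of the bill each extra slice of spot removes, and how much capacity is at risk in exchange.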

Feels like every decision just shifts the problem somewhere else, to either cost or reliability.

My question: how are you balancing this in practice? Are there any patterns or setups that keep things stable without costs getting out of control, or is it just constant tuning and tradeoffs?

reddit.com
u/Deliaenchanting — 8 days ago