r/FinOps

▲ 2 r/FinOps+3 crossposts

Are cloud architects being asked to do too much now?

I’ve been speaking with cloud and enterprise architecture teams, and one common theme keeps coming up: architects are no longer just designing systems.

They are expected to handle WAF-aligned designs, architecture documents, PRDs, Infrastructure-as-Code, cost estimates, cloud comparisons, security reviews, and stakeholder explanations — often across multiple clouds.

For Azure teams especially, the workload seems to sit across landing zones, governance, identity, networking, security, cost control, and documentation.

Curious how others are handling this.

Are architects in your organisation still focused mainly on design, or are they now expected to produce the full delivery package as well?

Full disclosure: we are building AI agents to help cloud architects produce WAF-aligned designs, architecture documents, PRDs, IaC, and costing plans. Not posting this as a sales pitch; genuinely interested in how teams are handling this workload today.

reddit.com
u/Accomplished_Job_76 — 1 day ago
▲ 3 r/FinOps

Quick question about your AI costs

How is your team currently tracking LLM API spend?

We're cobbling together spreadsheets and the OpenAI dashboard, but it feels broken. Curious what others do.

reddit.com
u/MaverikSh — 6 hours ago
▲ 11 r/FinOps

Anyone else going to FinOps X for the first time this year? Any tips?

New to the FinOps community and just want to learn and network. What’s the event like?

reddit.com
u/Life-cyclist — 1 day ago
▲ 1 r/FinOps

AWS utility to scan for idle resources

I built a tool that scans AWS accounts (the user provides the session) and analyses resources for periods of idleness, with the goal of scheduling automatic spin up/down.

Currently it supports EC2/ECS/RDS/NAT gateways. I got some really interesting results.

If you fancy having a look I would love to get some feedback!

github.com
u/Uptime-Scheduler — 1 day ago
▲ 0 r/FinOps

Biggest issues in FinOps

Hi everyone,

I’m building a FinOps platform and I’d love to hear from professionals in the field what their biggest issues with current platforms are. I’m currently working with some FinOps professionals but would love to hear from the wider community.

What would make your job easier?
Also how should I go about finding beta testers?
Which providers do you currently use? What do you like about them? What are they missing?
What info do you need but don’t get?

Thanks everyone!

reddit.com
u/Jimjamj438 — 1 day ago
▲ 7 r/FinOps

Our AWS bill is spiraling because developers are leaving unattached volumes and idle instances running. I’m looking for compliance automation that can scan our infrastructure daily, flag non-compliant resources, and even shut them down if they aren't tagged correctly.

We need to bring our cloud costs under control without manually auditing every single account every week. Any tools that are easy to set up across multiple regions?
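For the "flag non-compliant resources" part, here is a minimal sketch of a tag check, not a recommendation of any particular tool. The required tag names (`Owner`, `CostCenter`) and the `id` / `region` / `tags` record shape are assumptions made up for illustration; `tags_to_dict` converts the `[{"Key": ..., "Value": ...}]` tag lists that the EC2 API returns. Collecting the resources per region and deciding whether to stop anything are deliberately left out.

```python
# Hypothetical tagging policy: these tag names are an assumption, not a standard.
REQUIRED_TAGS = ("Owner", "CostCenter")


def tags_to_dict(tag_list):
    """Convert an EC2-style tag list [{"Key": k, "Value": v}, ...] to a dict."""
    return {t["Key"]: t["Value"] for t in (tag_list or [])}


def missing_tags(tags, required=REQUIRED_TAGS):
    """Return the required tag names that are absent or blank."""
    return sorted(k for k in required if not str(tags.get(k, "")).strip())


def flag_noncompliant(resources, required=REQUIRED_TAGS):
    """resources: iterable of {"id": str, "region": str, "tags": dict}.

    Returns one finding per resource that is missing at least one required tag.
    """
    findings = []
    for res in resources:
        gaps = missing_tags(res.get("tags", {}), required)
        if gaps:
            findings.append(
                {"id": res["id"], "region": res["region"], "missing": gaps}
            )
    return findings
```

Running this daily across regions is then a matter of feeding it each region's inventory and sending the findings somewhere people will see them. Reporting first and only enforcing later gives owners a chance to fix tags before anything is shut down.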

reddit.com
u/Dangerous_Block_2494 — 8 days ago
▲ 28 r/FinOps

Not kidding. I ran a script that lists every EC2 instance with its average CPU over the last 30 days. Found 23 instances under 5%. The oldest: a t2.micro running for 14 months, 0.2% CPU. It was a forgotten VPN jumpbox.

Then I checked unattached EBS volumes. 87 of them. Some from terminated instances that were deleted 2 years ago.

Then RDS snapshots older than 60 days. 400+.

None of this showed up in our monthly cost review because everyone was looking at the "big numbers": EC2 total, RDS total. No one drilled into the tail waste.

Wrote a 50-line Python script using boto3 to tag everything obsolete and send a Slack webhook. Took 2 hours. Automated it weekly.
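This is not the script from the post, just a rough sketch of the three checks it describes, written as pure functions so they can be tried without AWS credentials. It assumes you have already fetched the data with boto3: CloudWatch `get_metric_statistics` datapoints (each with an `Average`), EC2 `describe_volumes` entries (an unattached volume has `State == "available"`), and RDS `describe_db_snapshots` entries. The 5% and 60-day defaults come from the numbers above; the function names are made up.

```python
from datetime import datetime, timedelta, timezone


def mean_cpu(datapoints):
    """Average of CloudWatch-style datapoints; None if there is no data."""
    values = [p["Average"] for p in datapoints]
    return sum(values) / len(values) if values else None


def idle_instances(cpu_by_instance, threshold_pct=5.0):
    """Map instance id -> mean CPU for instances below the threshold.

    Instances with no datapoints are skipped rather than guessed at.
    """
    idle = {}
    for instance_id, datapoints in cpu_by_instance.items():
        avg = mean_cpu(datapoints)
        if avg is not None and avg < threshold_pct:
            idle[instance_id] = avg
    return idle


def unattached_volumes(volumes):
    """EBS volumes in the 'available' state are not attached to anything."""
    return [v["VolumeId"] for v in volumes if v["State"] == "available"]


def stale_snapshots(snapshots, now, max_age_days=60):
    """RDS snapshots created before now - max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        s["DBSnapshotIdentifier"]
        for s in snapshots
        if s["SnapshotCreateTime"] < cutoff
    ]
```

Pass a timezone-aware `now`, for example `datetime.now(timezone.utc)`, since the timestamps boto3 returns are timezone-aware. Tagging the hits and posting to Slack would sit on top of these lists.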

Now we save ~$16k/month. Literally just turning off and deleting stuff no one needed.

The lesson: before you buy Savings Plans or commit to anything, hunt the low-hanging zombie resources. They're everywhere.

reddit.com
u/CompetitiveStage5901 — 14 days ago
▲ 3 r/FinOps

What values for FinopsException tag?

https://docs.aws.amazon.com/guidance/latest/cloud-intelligence-dashboards/cora-dashboard.html

https://preview.redd.it/efjfsfbi6h0h1.png?width=2481&format=png&auto=webp&s=ffbc44fd49bfa303fb856c5edc691da54ac3d1d2

Looking at the AWS CUDOS reporting tool, and they seem to promote a universally accepted tag name called FinopsException. Very handy as it's baked into CUDOS/CORA and you can set it to remove recommendations on assets that just can't be resized, deleted, and so on.

But I can't find any values they recommend. Does anyone use this tag to manage FinOps exceptions and have some good examples? If not, I can ask the authors.

reddit.com
u/classjoker — 3 days ago
▲ 7 r/FinOps

Engineering exports a giant CSV, finance asks why AWS is up 14%, engineering scrolls horizontally for 20 mins, and nobody walks away with an answer. Familiar?

Tried a Sankey instead. Provider -> Account -> Resource Type -> Team. band width = dollars. You see where money flows in 3 seconds.

What works:

  • eye finds the fat band immediately. tables make every row look equal even when one row is 90% of the bill.
  • month-over-month becomes "which bands got fatter?" and non-engineers can do that.
  • drill-in is a click, not a filter combo.

What doesn't:

  • bad tagging kills it. 60% untagged = giant grey blob and the CFO notices. Kinda useful tho, forces the tagging convo.
  • doesn't show change over time. Still need a line chart next to it.
  • harder to export for someone who wants to hand-edit in Excel.

Anyone built one in-house? What library? (We ended up on D3 after a few higher-level libs couldn't handle cycles or sub-band labels.) And does your finance team actually use it or just ask for the CSV anyway?
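For anyone wanting to try the Provider -> Account -> Resource Type -> Team flow, the data prep can be sketched roughly like this. It is not the D3 build described above: it only turns cost rows into aggregated source/target/value links that a Sankey library could consume. The row keys (`provider`, `account`, `resource_type`, `team`, `cost`) and the `(untagged)` bucket name are assumptions for illustration. Nodes are `(level, name)` pairs so that a team and an account with the same name don't merge into one node.

```python
from collections import defaultdict

# Assumed column names, in flow order from left to right.
LEVELS = ("provider", "account", "resource_type", "team")


def sankey_links(rows, levels=LEVELS, untagged="(untagged)"):
    """Aggregate cost rows into Sankey links between consecutive levels.

    Each node is a (level, name) tuple; missing or empty values fall into
    the `untagged` bucket so bad tagging shows up as its own band.
    """
    totals = defaultdict(float)
    for row in rows:
        path = [(level, row.get(level) or untagged) for level in levels]
        for source, target in zip(path, path[1:]):
            totals[(source, target)] += row["cost"]
    return [
        {"source": source, "target": target, "value": value}
        for (source, target), value in sorted(totals.items())
    ]
```

Band width is just `value`, so the untagged blob falls out of the data automatically instead of needing special handling in the chart.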

reddit.com
u/Shoddy_5385 — 6 days ago
▲ 5 r/FinOps

I'm a procurement professional with experience across multiple categories, and over the past few years I've been expanding into SaaS and IT services.

Most IT Procurement Manager roles I'm seeing require cloud experience but honestly, I'm unsure what level of expertise and contribution is actually expected.

Traditionally, procurement adds value through supplier identification, negotiation, and spend analysis. But with cloud, those levers feel limited:

  • The possibility of negotiating T&Cs (outside commercials) is limited unless the buyer organization has significant leverage, such as high spend, buying from a smaller supplier, or being in government or a regulated industry, and even then larger suppliers won’t budge (according to survey results described in “Cloud Computing Law”, 2nd edition, Oxford University Press)
  • Spend optimisation and cost control often sit with FinOps teams

So where does procurement genuinely add value in cloud purchasing?

How have you seen procurement professionals make a meaningful contribution to cloud in your organisations?

reddit.com
u/Walking_Blue — 8 days ago
▲ 13 r/FinOps+1 crossposts

Are FinOps Foundation certifications still relevant today? Asking for our team of cloud engineers, who are trying to optimize our costs and resources.

reddit.com
u/ImpressiveIdea6123 — 13 days ago
▲ 0 r/FinOps

You ever notice how all of these cloud spend horror stories typically occur over a weekend? It’s because billing data lags behind usage (24-72 hrs depending on your cloud provider). It’s also because people are actually paying attention first thing Monday morning, and whatever state things were in on Friday (when attentiveness is down) has now hit the dashboard (that assumes you’re looking at the right dashboard and not just waiting for the monthly bill). If your daily spend is $10k, a 72-hour billing delay (standard for AWS/Azure Rating Latency) results in $30,000 of unrecoverable spend before an alert even fires.
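The arithmetic behind that $30,000 figure is daily spend times the lag expressed in days. A tiny sketch, where the function name and the optional `baseline_daily` parameter are my own additions: with a baseline, only the spend above normal counts as exposure, which may be the fairer number to quote to a CFO.

```python
def lag_exposure(daily_spend, lag_hours, baseline_daily=0.0):
    """Spend that accrues before lagging billing data can trigger an alert.

    daily_spend:    spend rate during the incident, in dollars per day
    lag_hours:      billing data delay in hours
    baseline_daily: normal spend to subtract (assumed 0 to match the post)
    """
    return (daily_spend - baseline_daily) * lag_hours / 24


# The example from the post: $10k/day with a 72-hour delay.
post_example = lag_exposure(10_000, 72)
```

With a hypothetical normal run rate of $4k/day, the same incident gives `lag_exposure(10_000, 72, baseline_daily=4_000)`, which is the $6k/day excess over three days.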

I was getting asked by our CFO about the bill and retroactively looking at reports (Cloudability and native Azure/AWS), but the approach of playing investigator was annoying. Coming from an infrastructure background, I expected to be alerted when things happened, not to find out only after the fact (didn’t monitoring software solve this like 10 years ago?!?!). I built my own solution for our use case… But I’m wondering why no one else is bothered by this.

reddit.com
u/Artistic_Lock_6483 — 11 days ago
▲ 8 r/FinOps

We run a mix of AWS and GCP across a few teams and every month there’s some surprise spike from instances or clusters that got scaled up and never came back down.

Right now we rely on basic alerts like CPU thresholds, but that’s too late. By the time something triggers, the cost is already there. Trying to figure out how to catch this earlier, not just after the fact, but at the point where something is being overprovisioned or scaled incorrectly.

We looked at a few tools, but they feel heavy for what we need and don’t really solve the underlying issue.

What’s actually working for you to catch overprovisioning early without constant manual tracking?

reddit.com
u/SalamanderFew1357 — 8 days ago
▲ 7 r/FinOps

We have a single NAT gateway shared across 20 dev namespaces in EKS. Also a single EKS control plane (obviously). The NAT gateway costs $0.045/GB processed plus the hourly fee. The control plane is $0.10/hr.

Right now we just split it equally across all teams. But one team does 80% of the data transfer through NAT. Another team runs only two pods and barely touches it. The equal split feels unfair but tracking actual usage per pod or per namespace through VPC Flow Logs and tagging is a nightmare.

I tried using VPC Flow Logs + Athena to attribute NAT traffic by source private IP, then map IP to namespace. Works but the queries are slow and expensive. Also doesn't handle the control plane cost at all.

What's everyone else doing? Do you just accept shared costs as overhead? Or do you have a clean way to charge back per team for things that aren't naturally tagged?
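One middle ground between an equal split and full per-pod tracking is to split the usage-driven part (NAT data processing) in proportion to measured GB, and split the fixed part (NAT hourly fee plus control plane) evenly. A minimal sketch of that arithmetic follows; it assumes you already have a GB-per-team figure from somewhere, such as the Flow Logs + Athena mapping described above, and the function and parameter names are made up for illustration.

```python
def allocate_shared_costs(gb_by_team, variable_cost, fixed_cost):
    """Split shared costs across teams.

    gb_by_team:    {"team": GB processed through the NAT gateway}
    variable_cost: usage-driven cost (e.g. GB processed * per-GB rate)
    fixed_cost:    cost that does not depend on usage (hourly fees)

    The variable part is split by share of traffic, the fixed part evenly.
    If no traffic was measured, the variable part is split evenly too.
    """
    teams = list(gb_by_team)
    if not teams:
        return {}
    total_gb = sum(gb_by_team.values())
    even_share = 1 / len(teams)
    allocation = {}
    for team in teams:
        usage_share = gb_by_team[team] / total_gb if total_gb else even_share
        allocation[team] = variable_cost * usage_share + fixed_cost * even_share
    return allocation
```

With the rates from the post, `variable_cost` would be total GB times $0.045, and `fixed_cost` would include the control plane at $0.10/hr times the hours in the period. Whether the fixed part should be even or also usage-weighted is a policy choice, not something the numbers decide.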

reddit.com
u/CompetitiveStage5901 — 14 days ago
▲ 1 r/FinOps

Hey folks,

Been following a lot of discussions here around cost visibility, tagging chaos, and surprise AWS bills — and honestly, we’re seeing the same patterns across most orgs.

We’re an AWS APN Partner working with startups and mid-size teams, and one thing we’ve consistently noticed:

Most teams are overspending ~25–35% on AWS without realizing it due to idle resources, wrong sizing, or poor architecture decisions.

Stripe Systems

At the same time, security misconfigurations are quietly sitting in the background (open ports, IAM issues, unused access keys, etc.) — which is a bigger risk than cost itself.

So we’ve started offering something simple:

👉 Free AWS Cost Optimization + Security Audit Report (no remediation push)

What we check:

  • Idle / underutilized resources (EC2, RDS, EBS, etc.)
  • Rightsizing opportunities + Savings Plans / RI gaps
  • Data transfer & NAT cost leaks
  • Tagging & cost allocation hygiene
  • IAM risks, exposed services, security posture
  • Billing anomalies & future risk areas

From what we’ve seen in real projects, even basic FinOps practices like rightsizing + governance can lead to 30–70% savings without touching code.

ZeonEdge

Why we’re doing this free:

Mostly to understand real-world challenges + build long-term relationships (no lock-in, no obligation).

Also — for eligible startups, there are AWS credits support programs (up to $100K) depending on stage and use case.

reddit.com
u/Robinson2502 — 13 days ago