u/Remarkable_Money9857

▲ 9 r/CFO+1 crossposts

Keeping pace with new AI developments has been difficult.

I've asked my teams to put together a quick roundup of what interests us and share it across our org: new developments, interesting analysis, tools worth looking at, and what matters to us.

Nothing super formal, just a way to avoid missing progress and keep a pulse on the industry.

Here is what we rounded up this week. I thought it could be useful to share here and see what others post:

1. Production readiness platform for enterprise AI agents
https://x.com/sir_aymansaleh/status/2051720862869729372

This hits the core problem with AI deployments today. For us specifically, we aren't able to rigorously generate and test all production scenarios before launching a new AI agent into customer support.

The concept of backtesting an AI agent on your production data before launch seems obvious in hindsight now that I've seen it.
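
For anyone wondering what backtesting could look like in practice, here is a rough sketch in Python. It is not how the linked platform works (I don't know its internals). It assumes you have historical tickets labeled with how your human team actually resolved them, and every name in it (`Ticket`, `backtest`, `toy_agent`) is made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Ticket:
    # One historical support ticket and the outcome the human team chose.
    # Both fields are hypothetical; real data would carry far more context.
    text: str
    resolved_as: str


def backtest(agent: Callable[[str], str], history: list[Ticket]) -> dict:
    """Replay past tickets through the agent and compare with what actually happened."""
    misses = [t for t in history if agent(t.text) != t.resolved_as]
    total = len(history)
    matched = total - len(misses)
    return {
        "total": total,
        "matched": matched,
        "match_rate": matched / total if total else 0.0,
        "misses": misses,  # the cases worth reviewing before launch
    }


def toy_agent(text: str) -> str:
    # Stand-in for the real agent: a trivial keyword rule, purely for illustration.
    return "refund" if "refund" in text.lower() else "escalate"


history = [
    Ticket("I want a refund for my order", "refund"),
    Ticket("My account is locked", "escalate"),
    Ticket("Charged twice, please refund", "refund"),
    Ticket("Cancel my subscription", "cancel"),
]

report = backtest(toy_agent, history)
print(f"{report['matched']}/{report['total']} matched")
for t in report["misses"]:
    print("missed:", t.text)
```

The useful output is less the match rate than the list of misses, since those are the scenarios you would otherwise discover in front of a customer.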

2. a16z’s AI adoption data
https://www.a16z.news/p/ai-adoption-by-the-numbers

The interesting part is that adoption clusters around workflows where the value is easier to see. More importantly, not all AI capabilities have matured to the point of being good enough to adopt (yet).

Legal, healthcare admin, and coding are the breakout categories. Government looks to be the next major focus for capability improvement.

3. People switching from Claude to Codex
https://community.openai.com/t/introducing-the-new-codex-for-almost-everything/1379125

I’ve been seeing a lot more people say they’ve moved coding workflows from Claude over to Codex and that the new OpenAI model outperforms Opus 4.7.

Whether or not that holds, the point is that your workflows shouldn't be tied to any one AI coding tool, and your teams should test new coding models as they are released.

4. Reliability race
https://x.com/jdroege/status/2052049364579659849?s=46

This is a really good framing and retelling of what Scale AI is seeing in deployments in regulated industries.

A lot of tools can look good in a demo but don't hold up in a hospital, bank, insurer, or government workflow.

u/Remarkable_Money9857 — 7 days ago