u/Sad_Source_6225

▲ 7 r/AIDeveloperNews+1 crossposts

i built an open-source cli for reducing token waste in claude code / codex workflows

ai coding agents (claude code, codex, cursor) burn tokens on things that don't help you ship. i started digging through local claude code + codex logs after burning way more tokens than i expected and realized a huge amount of the waste was context-related: generated artifacts, oversized instruction files, repeated tool output, broad repo exploration, stale session state, etc.

so i built prismodev, a local cli that reads repo files + local claude code/codex logs and surfaces token/context waste. no api keys, no login, nothing leaves your machine.

npx getprismo doctor scans your repo and local session logs, flags missing .claudeignore / .cursorignore, finds oversized CLAUDE.md / AGENTS.md files, detects generated artifacts/logs/build output getting pulled into context, estimates avoidable spend, generates compact .prismo context packs, and shows a before/after score. my repo's score went from 79 → 91 in one run.
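to make the kind of checks described above concrete, here is a minimal python sketch. it is not prismodev's actual implementation: the thresholds, the list of generated directories, and the 4-characters-per-token estimate are all assumptions picked for illustration.

```python
import os

# Hypothetical settings for illustration; not prismodev's real rules.
INSTRUCTION_FILES = {"CLAUDE.md", "AGENTS.md"}
GENERATED_DIRS = {"node_modules", "dist", "build", "coverage", "logs"}
IGNORE_FILES = (".claudeignore", ".cursorignore")
MAX_INSTRUCTION_TOKENS = 2000


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (a rule of thumb, not a tokenizer)."""
    return len(text) // 4


def scan_repo(root: str) -> list[str]:
    """Return human-readable findings for the repo rooted at `root`."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Report generated directories, then skip them (and .git) while walking.
        for name in sorted(dirnames):
            if name in GENERATED_DIRS:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                findings.append(f"generated dir: {rel}")
        dirnames[:] = [d for d in dirnames if d not in GENERATED_DIRS and d != ".git"]

        # Flag instruction files whose estimated size exceeds the threshold.
        for name in sorted(filenames):
            if name in INSTRUCTION_FILES:
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as fh:
                    tokens = estimate_tokens(fh.read())
                if tokens > MAX_INSTRUCTION_TOKENS:
                    rel = os.path.relpath(path, root)
                    findings.append(f"oversized instruction file: {rel} (~{tokens} tokens)")

    # Flag ignore files that are missing from the repo root.
    for ignore in IGNORE_FILES:
        if not os.path.exists(os.path.join(root, ignore)):
            findings.append(f"missing ignore file: {ignore}")
    return findings
```

calling `scan_repo(".")` from a repo root returns a list of strings, one per finding. a real tokenizer would give more accurate counts than the character heuristic.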

npx getprismo watch adds live context-pressure monitoring during sessions and catches repeated file reads, generated artifact leaks, oversized tool output, and possible command/tool loops before they spiral. watch --auto continuously updates a live guardrails file with the current issue and exact instructions for the agent to follow as context pressure changes.
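as a rough illustration of the repeated-read and loop checks described above, here is a small python sketch. the event shape (a tool name plus one argument) and the limits are hypothetical; the real session logs and prismodev's own heuristics may look quite different.

```python
from collections import Counter, deque

# Hypothetical event shape: (tool_name, argument), e.g. ("read", "src/app.ts").
Event = tuple[str, str]


def repeated_reads(events: list[Event], limit: int = 3) -> dict[str, int]:
    """Return files that were read at least `limit` times in the session."""
    counts = Counter(arg for tool, arg in events if tool == "read")
    return {path: n for path, n in counts.items() if n >= limit}


def looks_like_loop(events: list[Event], window: int = 6, max_unique: int = 2) -> bool:
    """Flag a possible loop when the last `window` events contain few distinct calls."""
    recent = deque(events, maxlen=window)  # keeps only the newest `window` events
    return len(recent) == window and len(set(recent)) <= max_unique
```

a session that alternates between reading the same file and rerunning the same command would trip both checks, which is the point where a guardrail or rescue prompt could kick in.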

npx getprismo watch --rescue generates a paste-ready recovery prompt when a session starts going sideways and pushes the agent back toward the smallest useful context/workflow.

npx getprismo firewall auth-bug creates a scoped context policy before a task starts so the agent stays inside a smaller context boundary instead of wandering through the whole repo.
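one way to picture a scoped context policy is as a pair of allow/deny glob lists that every path gets checked against. the python sketch below assumes that shape; the policy format and the `auth-bug` patterns are made up for illustration and are not taken from prismodev.

```python
from fnmatch import fnmatchcase

# Hypothetical policy for an "auth-bug" task; the real policy file may differ.
AUTH_BUG_POLICY = {
    "allow": ["src/auth/*", "tests/auth/*", "package.json"],
    "deny": ["*.log", "dist/*", "node_modules/*"],
}


def in_scope(path: str, policy: dict[str, list[str]]) -> bool:
    """Deny patterns win; otherwise the path must match an allow pattern."""
    if any(fnmatchcase(path, pattern) for pattern in policy["deny"]):
        return False
    return any(fnmatchcase(path, pattern) for pattern in policy["allow"])
```

note that `*` in `fnmatchcase` also matches `/`, so `src/auth/*` covers nested files under `src/auth/` too.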

npx getprismo cc timeline generates a postmortem timeline showing what leaked into context, which files/commands repeated, and where tool-output spikes happened during expensive claude code sessions.
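a tool-output spike can be defined in many ways. the python sketch below uses one simple assumption: an output counts as a spike when it is several times larger than the session median. the factor of 5 is arbitrary, and this is not how prismodev necessarily does it.

```python
from statistics import median


def find_spikes(sizes: list[int], factor: float = 5.0) -> list[int]:
    """Return indexes of tool outputs larger than `factor` times the session median.

    `sizes` holds one estimated token count per tool output, in session order.
    """
    if not sizes:
        return []
    baseline = median(sizes)
    return [i for i, size in enumerate(sizes) if size > factor * baseline]
```

for example, `find_spikes([200, 300, 250, 9000, 280])` returns `[3]`, since the median is 280 and only the fourth output exceeds 5 × 280.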

everything runs locally. reads logs from ~/.codex/sessions/ and ~/.claude/projects/.
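if you want to see what is sitting in those two directories before running anything, a short python sketch like this lists the log files by size. the directory names come from the post; the `.jsonl` suffix is an assumption, and the sketch deliberately does not parse the log contents.

```python
import os
from pathlib import Path

# Directories named in the post; the ".jsonl" suffix below is an assumption.
LOG_ROOTS = [
    Path(os.path.expanduser("~/.codex/sessions")),
    Path(os.path.expanduser("~/.claude/projects")),
]


def list_session_logs(roots: list[Path], suffix: str = ".jsonl") -> list[tuple[Path, int]]:
    """Return (path, size_in_bytes) for each log file found, largest first."""
    logs = []
    for root in roots:
        if not root.is_dir():
            continue  # the tool may not be installed or may not have run yet
        for path in root.rglob(f"*{suffix}"):
            if path.is_file():
                logs.append((path, path.stat().st_size))
    return sorted(logs, key=lambda item: item[1], reverse=True)
```

`list_session_logs(LOG_ROOTS)[:10]` gives the ten largest session logs, which is usually a reasonable place to start looking for expensive sessions.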

github: github.com/shanirsh/prismodev

would genuinely love feedback on false positives, missing waste patterns, or workflows that create the most context bloat.

u/Sad_Source_6225 — 1 day ago
▲ 3 r/developer+2 crossposts

i built an open-source cli for reducing token waste in claude code / codex workflows

ai coding sessions get bloated fast, and it’s hard to see what actually caused the cost growth. i started digging through local claude code + codex logs after burning way more tokens than i expected and realized a huge amount of the waste was context-related: generated artifacts, oversized instruction files, repeated tool output, broad repo exploration, stale session state, etc.

so i built prismodev, a local cli that reads repo files + local claude code/codex logs and surfaces token/context waste.

npx getprismo doctor scans your repo and local session logs, flags missing .claudeignore / .cursorignore, finds oversized CLAUDE.md / AGENTS.md files, detects generated artifacts/logs/build output getting pulled into context, estimates avoidable spend, and generates compact .prismo context packs for your agent.

npx getprismo watch adds live context-pressure monitoring during sessions and catches repeated file reads, generated artifact leaks, oversized tool output, and possible command/tool loops before they spiral.

there’s also npx getprismo watch --rescue, which generates a recovery prompt when a session starts going sideways and pushes the agent back toward the smallest useful context/workflow.

npx getprismo cc timeline generates a postmortem timeline showing what leaked into context, which files/commands repeated, and where tool-output spikes happened during expensive claude code sessions.

everything runs locally. no api keys, no login, no uploads.

github: github.com/shanirsh/prismodev

would genuinely love feedback on false positives, missing waste patterns, or workflows that create the most context bloat.

u/Sad_Source_6225 — 1 day ago

i built a cli that shows why your claude code / codex sessions get expensive

i was spending way more than i expected on claude code and codex and couldn’t figure out why until i dug into the local session logs. turns out half the context in every session was garbage: build artifacts, log directories, generated files, oversized instruction files, repeated tool output, etc. in one repo i had a CLAUDE.md silently loading thousands of tokens into basically every prompt.

so i built a local cli to surface all of it.

npx getprismo doctor scans your repo + local claude code/codex logs, shows what made sessions expensive, flags token/context waste, estimates avoidable spend, and generates smaller focused context packs so your agent doesn’t have to drag your entire repo into every request.

there’s also npx getprismo watch for live monitoring of context spikes, recursive loops, generated artifact leaks, and oversized tool output, plus npx getprismo cc timeline, which shows a postmortem timeline of what actually made a session expensive.

github: github.com/shanirsh/prismodev

would genuinely love feedback on false positives, things it should catch, or workflows that create the most token waste.

u/Sad_Source_6225 — 2 days ago

▲ 2 r/FinOps

Building an AI cost control layer — looking for FinOps feedback

I’m building Prismo (https://getprismo.dev/), an open-source AI cost control layer for teams using OpenAI, Anthropic, Gemini, and other model providers. The router/proxy is open source here: https://github.com/shanirsh/prismorouter

The thing I’m trying to figure out is whether teams mainly need another dashboard after the bill lands, or whether the more useful layer is before that: request-level attribution, spend by feature/user/route/model, budget alerts before usage gets out of hand, and routing between models/providers based on cost and reliability.
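To make the "before the bill lands" idea concrete, here is a minimal Python sketch of request-level attribution with an early budget warning. It is only an illustration, not Prismo's implementation: the `Request` fields, the model names, and the prices in `PRICES` are hypothetical, and the 80% warning level is an arbitrary choice.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical USD prices per million (input, output) tokens; real prices differ.
PRICES = {"small-model": (0.50, 1.50), "large-model": (5.00, 15.00)}


@dataclass
class Request:
    feature: str
    user: str
    model: str
    input_tokens: int
    output_tokens: int


def cost(req: Request) -> float:
    """Cost of one request in USD under the hypothetical price table."""
    in_price, out_price = PRICES[req.model]
    return (req.input_tokens * in_price + req.output_tokens * out_price) / 1_000_000


def spend_by(requests: list[Request], key: str) -> dict[str, float]:
    """Aggregate spend by any Request field, e.g. "feature", "user", or "model"."""
    totals: dict[str, float] = defaultdict(float)
    for req in requests:
        totals[getattr(req, key)] += cost(req)
    return dict(totals)


def near_budget(totals: dict[str, float], budget: float, warn_at: float = 0.8) -> list[str]:
    """Return keys whose spend has reached `warn_at` of the budget (an early alert)."""
    return sorted(k for k, v in totals.items() if v >= warn_at * budget)
```

Because each request carries its own feature, user, and model, the same records can be rolled up into daily aggregates later, while the reverse is not possible once only aggregates are stored.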

I also shipped a free local CLI called PrismoDev as the developer wedge for Codex and Claude Code workflows: https://github.com/shanirsh/prismodev

You can run:

```bash
npx getprismo scan --usage
npx getprismo cc
```

It scans repo/context waste, reads local Claude Code/Codex logs when available, shows Claude Code cost drivers, estimates avoidable spend, and generates smaller context packs for AI coding agents.

I’m trying to understand how FinOps teams think about this. Is the bigger pain vendor/tool reporting, or request-level attribution? Do you actually need per-request cost data, or are daily project/user aggregates enough? Who owns AI spend today: finance, engineering, product, or platform? And would routing/budget enforcement matter, or is reporting enough?

Would genuinely appreciate feedback, criticism, or pointers to how your team is handling AI spend.

u/Sad_Source_6225 — 5 days ago