
I got tired of Claude Code silently dying at rate limits during mid task, so I built something
I got tired of Claude Code silently hitting rate limits, so I decided to build something to address the issue.
Imagine you’re 40 minutes into a refactor. Claude is running tools and making progress, then suddenly, everything stops. The session has reached its rate limit without any warning—no alert saying you’re at 95%, just a complete halt. The usage bars are visible in the UI, but the model itself remains unaware of them.
I discovered that Anthropic has a usage API, and Claude Code already possesses hooks to make it work. This led me to create agent-baton, which reads the usage API and installs hooks to make Claude aware of its limits.
Here are the three hooks you can initiate with one command (baton init):
- SessionStart: Fetches usage data and injects it so Claude knows from the first message how much has been used.
- UserPromptSubmit: Performs a time-to-live (TTL) aware check that avoids overwhelming the API. It uses smart caching—checking every 15 minutes when usage is low and once a minute when it's nearing the limit.
- PreToolUse: This is the crucial one; it checks usage mid-task to prevent the scenario where you “started at 93% and ran out of capacity mid-execution,” catching the problem within 1-2 tool calls.
When the warning threshold is reached, it prompts an interactive question using Claude Code's built-in AskUserQuestion tool:
"Claude 5-hour usage is at 91% — you're in the warning zone."
Options include:
- Continue this task
- Write a handoff document
- Switch to lightweight mode
It also handles full agent handoffs by writing a structured markdown handoff and passing work to Cursor, Codex, or Gemini.
You can install it with the following command:
npm install -g u/codeprakhar25/agent-baton && baton init
For more details, visit the GitHub repository.