u/alexdunlop_

TLDR

How tf do people use this as a daily driver without smashing caps? I love this tool but I feel like I’m throwing money at the wall.

I have come from using 2 Claude Code subscriptions (1 personal & 1 with work) and a Cursor subscription.

I love Pi and the idea behind it: being able to completely control the harness. After the recent regressions in Claude Code I was looking for an alternative (I didn’t want to fall into the same trap of letting someone else control my harness).

I started using Pi and loved it at first. I have a Z.ai coding plan, but I’m constantly hitting the 5-hour cap.

Then I decided to try the Codex Pro plan and hit the 5-hour cap after one hour of intense coding.

I had reasoning effort set to medium, then tried low. It helped a bit, but not by much.

Other things I’ve tried are Semble & Caveman mode for less token usage.

However, I’m starting to wonder: have I not optimised my setup enough, or is this normal?

Is this only viable with a local model or a high-end coding plan?

How do you guys use this as a main driver and what advice do you have?

I’ve been trying to use the packages (however, the page keeps timing out for me lol, so I can’t use them).

I’ve been playing with my system prompt and trying to keep it short & concise to reduce tokens. I removed all MCPs.

It’s started to make me question if I’m missing some kind of caching and optimisations most harnesses have built in.
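To put rough numbers on the caching question, here is a minimal sketch of why it matters in an agent loop, where the whole conversation gets resent on every turn. Everything in it is an assumption for illustration: the function name `billed_input_tokens`, the token counts, and the 90% cached-token discount are hypothetical, and real providers differ (cache writes can cost extra and caches expire).

```python
def billed_input_tokens(system_tokens, tokens_per_turn, turns, cached_discount=0.0):
    """Estimate billed input over a session, in uncached-token equivalents.

    Assumes every turn resends the full context. The part the provider has
    already seen (the prefix) is billed at (1 - cached_discount); only the
    newly added tokens are billed at full price.
    """
    total = 0.0
    prefix = 0                 # tokens already seen on a previous turn
    context = system_tokens    # context grows as the session goes on
    for _ in range(turns):
        context += tokens_per_turn
        fresh = context - prefix
        total += prefix * (1.0 - cached_discount) + fresh
        prefix = context
    return total


if __name__ == "__main__":
    # Hypothetical session: 2k system prompt, 3k new tokens per turn, 50 turns.
    no_cache = billed_input_tokens(2_000, 3_000, 50)
    with_cache = billed_input_tokens(2_000, 3_000, 50, cached_discount=0.9)
    print(f"no cache:   {no_cache:,.0f} token-equivalents")
    print(f"with cache: {with_cache:,.0f} token-equivalents")
```

The point of the sketch is that billed input grows roughly with the square of the number of turns when nothing is cached, so a long session without working cache hits can burn through a cap far faster than the per-turn token count suggests.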

I have spent a lot of time searching Reddit/blogs/articles/wikis. I found most people's answers were "I found x good or this good", or "here's my personal breakdown".

Personally I prefer seeing usage at scale. From my reading of Reddit, a lot of us use OpenRouter in some shape or form (unless you are a chad running locally, then ignore this post big boss).

However I finally found a way I like to compare and try different models, with or without OpenRouter.

https://preview.redd.it/wm7bal3ixuzg1.png?width=905&format=png&auto=webp&s=31c105538d99dba9a28a620c4a7fa81b43cdba4f

OpenRouter has a great page to see how everyone's using a given model on a given day:
https://openrouter.ai/apps/hermes-agent

Then personally I take all the top-performing models and run them through the comparison:
https://openrouter.ai/compare

https://preview.redd.it/7gy2xsrsxuzg1.png?width=1255&format=png&auto=webp&s=d29ac1437ba03336ac02dac89ecbca241f53da05

What I personally look out for is:

- Input price.
- Output price.
- Caching (if true, it helps with cost).
- Agentic performance.
- Intelligence performance.

I personally don't care for Coding analysis, as that's not how I use Hermes (I use Claude Code & Pi.dev on other machines).

So you can see that for my use case (and trying to keep costs down), DeepSeek is the best.

https://preview.redd.it/gjls5v32yuzg1.png?width=421&format=png&auto=webp&s=281b652012d98bce860361014db4a9e88ef12e42

What I like about the model comparison is you can weigh things up based on your needs rather than someone on reddit saying: "just use this, next question".
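If you want to make that weighing-up explicit, a rough sketch is a weighted score over the things I listed. To be clear, this is not anything OpenRouter provides: the `Model` fields, the weights, and the `model-a`/`model-b` numbers are all made-up placeholders, so plug in whatever the comparison page shows for the models you care about.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    input_price: float    # USD per 1M input tokens (placeholder value)
    output_price: float   # USD per 1M output tokens (placeholder value)
    caching: bool         # whether prompt caching is supported
    agentic: float        # 0-100 score (placeholder value)
    intelligence: float   # 0-100 score (placeholder value)


def blended_price(m):
    # Simple average of input and output price; adjust to your own token mix.
    return (m.input_price + m.output_price) / 2


def score(m, weights, max_price):
    # Cheapness is 1.0 for a free model and 0.0 for the priciest in the set.
    cheapness = 1.0 if max_price <= 0 else 1.0 - min(blended_price(m) / max_price, 1.0)
    return (weights["cost"] * cheapness
            + weights["caching"] * (1.0 if m.caching else 0.0)
            + weights["agentic"] * m.agentic / 100
            + weights["intelligence"] * m.intelligence / 100)


def rank(models, weights):
    # Highest score first, using the priciest model as the cost baseline.
    max_price = max(blended_price(m) for m in models)
    return sorted(models, key=lambda m: score(m, weights, max_price), reverse=True)


if __name__ == "__main__":
    models = [
        Model("model-a", 0.5, 1.5, True, 60, 60),   # hypothetical cheap model
        Model("model-b", 3.0, 15.0, True, 70, 75),  # hypothetical pricey model
    ]
    cost_first = {"cost": 0.5, "caching": 0.1, "agentic": 0.2, "intelligence": 0.2}
    for m in rank(models, cost_first):
        print(m.name)
```

Shifting the weights towards agentic and intelligence performance flips the ranking in this made-up example, which is the whole point: the "best" model depends on what you weight.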

Hopefully someone finds this helpful. I personally would have loved to come across these sooner.

How do you use Pi without running out of usage