u/stosssik

▲ 1 r/vercel

What are your biggest pains running AI SDK apps in production?

I'm trying to understand what teams building with AI SDKs struggle with the most once their app is in production.

So far I've heard a few things come up. Some people don't know which model to pick for each task and don't have a week to benchmark everything. Others mentioned costs creeping up, but they struggle to switch to cheaper models without breaking quality on edge cases.

I'd love to hear what's on your list. If you have 30 seconds, please drop your top 1 or 2 pains in the comments with a bit of context.

u/stosssik — 24 hours ago

Anthropic is limiting OpenClaw again. And honestly, it's just sad.

Every time something interesting emerges in the Claude ecosystem, Anthropic finds a way to throttle it. In April they cut off OpenClaw overnight. Now they're "bringing it back" with capped Agent SDK credits that expire monthly with no rollover, billed at API rates the moment you cross the line. You get to use the engine, but you're not allowed to redline it.

Here's what actually changes on June 15, 2026:

- All programmatic usage (Agent SDK, claude -p, OpenClaw, Zed, custom scripts) moves to a dedicated credit pool, separated from your subscription.
- Pro: $20/month in credits. Max 5x: $100. Max 20x: $200.
- Credits reset monthly. No rollover.
- Once exhausted, programmatic usage stops, or falls back to standard API rates ($3/M input, $15/M output on Sonnet) if you opt into extra usage.

In plain terms: if you were doing anything through OpenClaw, your subscription just lost most of its value.
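To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It only uses the credit amounts and Sonnet rates quoted in this post; the request size (50k input tokens, 2k output tokens) is a made-up example, and it ignores anything like prompt caching that could change the real bill.

```python
# Sonnet API rates quoted in the post, in USD per million tokens.
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00

# Monthly credit pools quoted in the post, in USD.
CREDITS = {"Pro": 20, "Max 5x": 100, "Max 20x": 200}


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the quoted Sonnet rates."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000


def requests_per_month(credit_usd: float, input_tokens: int, output_tokens: int) -> int:
    """How many identical requests fit in one monthly credit pool."""
    return int(credit_usd // request_cost(input_tokens, output_tokens))


if __name__ == "__main__":
    # Hypothetical agent turn: 50k input tokens, 2k output tokens.
    for plan, credit in CREDITS.items():
        print(plan, requests_per_month(credit, 50_000, 2_000))
```

With that hypothetical request size, each call costs about $0.18, so the Pro pool covers roughly 111 of them per month.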

u/stosssik — 1 day ago
▲ 62 r/Agent_AI+5 crossposts

Run Claude Code on your local Ollama models

You love coding with Claude Code but the bill is rough? You can now use it with your local or cloud Ollama models!

Here is the hack: Go to manifest.build and create a Claude Code agent. Manifest gives you a base URL and an API key. Ask your Claude Code to add them to its settings.json file. From now on, every request your Claude Code sends goes through Manifest.

Then, from the Manifest dashboard, connect your Ollama (Cloud or local) and pick which models you want your requests to be routed to.
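If you'd rather make the settings.json change by hand, here is a minimal sketch of what it could look like. It assumes Claude Code reads an `env` block from its settings.json and honors the `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` variables; double-check that against the Claude Code docs. The URL and key below are placeholders, not real Manifest values, and `with_router_env` is just a helper name made up for this example.

```python
import json


def with_router_env(settings: dict, base_url: str, api_key: str) -> dict:
    """Return a copy of settings with router credentials merged into "env"."""
    merged = dict(settings)
    env = dict(merged.get("env", {}))
    # Assumed variable names; verify them in the Claude Code documentation.
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_AUTH_TOKEN"] = api_key
    merged["env"] = env
    return merged


if __name__ == "__main__":
    existing = {"env": {"FOO": "bar"}}  # whatever is already in your settings
    updated = with_router_env(
        existing,
        "https://router.example.invalid",  # placeholder base URL
        "PLACEHOLDER_KEY",  # placeholder API key
    )
    print(json.dumps(updated, indent=2))
```

The helper copies the existing settings instead of overwriting them, so anything you already configured stays in place.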

You keep the agent loop, the skills, and the harness of your Claude Code agent, for free or for the price of your subscription!

What you get from this:

  • Stop hitting Claude Code usage limits mid-build
  • Add fallbacks to a frontier model only when something actually needs it
  • Full observability on what runs where
  • Combine it with other subscriptions you're already paying to cut your costs

Manifest is an open source LLM router that gives you full control over how your agent's requests get routed. The goal is to send each request to the right model, reducing your inference costs. It's mostly used for AI SDK apps, personal AI agents, and coding agents.

It is free and open source. If you try it, please leave us feedback on our GitHub. Repo: github.com/mnfst/manifest

u/stosssik — 6 days ago
▲ 26 r/AIAgentsInAction+6 crossposts

What's your actual use case with your agent, and which model do you pair it with?

I'm running a benchmark to figure out which models give the best price-to-quality ratio for different tasks. I will publish it once finished. While I crunch the numbers, I'd love to hear from your side:

  1. Your use case
  2. The model you use for it
  3. Why that pairing works for you
u/stosssik — 6 days ago


Minimax just dropped MaxHermes

Hey hello Hermes community. Have you already tried MaxHermes?

It is a zero-setup cloud version of Hermes Agent, just as they did with MaxClaw before.

What do you think about this kind of solution? Hermes is already quite easy to set up, and they release more stuff every day to make it even easier. I'm curious to hear your thoughts on it, and also your Hermes use cases.

u/stosssik — 12 days ago
▲ 0 r/Slack

Why do companies work on Slack? This tool is unbelievably bad. I've never seen a tool that's such a complete mess.

What the hell is wrong with it? Why don't they hire a UX designer or some developers, even juniors? It couldn't possibly be worse. Why don't they do that? What is it that people actually like about it? You can't even log in. I'm logged in with an email, I try to accept an invite to a space, and it tells me that email doesn't exist. But I'm literally logged in with it on the app.

They say one thing and the opposite. Nothing makes sense, I lose access to channels without understanding why. I genuinely wish this company would die and that there'd be something else, just something that works. That nobody would be forced to use this piece of shit.

u/stosssik — 14 days ago
▲ 70 r/better_claw+2 crossposts

Hey everyone, yesterday I asked which models you use with your agents. About 16 hours later, I got 219 model mentions and 207 upvotes across 109 people who answered.

I classified everything. Each model got 1 point per mention, plus the number of upvotes the comment received.

Most mentioned and upvoted models

  1. Qwen 3.6 — 77 points (27 mentions, 50 upvotes)
  2. Minimax 2.7 — 75 points (21 mentions, 54 upvotes)
  3. Deepseek V4 Flash — 39 points (9 mentions, 30 upvotes)
  4. Kimi K2.6 — 37 points (12 mentions, 25 upvotes)
  5. GLM 5.1 — 31 points (12 mentions, 19 upvotes)
  6. Gemma 4 26b — 27 points (3 mentions, 24 upvotes)
  7. Deepseek V4 Pro — 24 points (11 mentions, 13 upvotes)
  8. GPT 5.5 — 22 points (10 mentions, 12 upvotes)
  9. Qwen 3.5 — 12 points (5 mentions, 7 upvotes)
  10. GPT 5.4 mini — 9 points (3 mentions, 6 upvotes)
  11. Qwen (other versions) — 9 points (5 mentions, 4 upvotes)
  12. Gemini 3.1 Flash — 8 points (3 mentions, 5 upvotes)
  13. GPT-OSS 120b — 7 points (2 mentions, 5 upvotes)
  14. Gemma 4 31b — 6 points (3 mentions, 3 upvotes)
  15. Claude Sonnet 4.6 — 6 points (1 mention, 5 upvotes)
  16. Gemma 4 (unspecified version) — 6 points (2 mentions, 4 upvotes)
  17. GPT 5.4 / Codex 5.4 — 6 points (3 mentions, 3 upvotes)
  18. Gemini 2.5 Flash — 5 points (1 mention, 4 upvotes)
  19. Gemini 3.1 Pro — 5 points (2 mentions, 3 upvotes)
  20. Claude Opus 4.7 — 4 points (2 mentions, 2 upvotes)
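For anyone who wants to redo the math, the scoring rule is simple enough to sketch in a few lines. The rows are copied from the top of the list above; the function and variable names are just for this example.

```python
def score(mentions: int, upvotes: int) -> int:
    """1 point per mention, plus the upvotes on the comments mentioning it."""
    return mentions + upvotes


# (model, mentions, upvotes), copied from the ranking above.
ROWS = [
    ("Deepseek V4 Flash", 9, 30),
    ("Qwen 3.6", 27, 50),
    ("Minimax 2.7", 21, 54),
]

# Highest score first.
RANKING = sorted(
    ((name, score(mentions, upvotes)) for name, mentions, upvotes in ROWS),
    key=lambda row: row[1],
    reverse=True,
)

if __name__ == "__main__":
    for name, points in RANKING:
        print(f"{name}: {points} points")
```

Note how Minimax 2.7 has fewer mentions than Qwen 3.6 but more upvotes, which is why the two end up only 2 points apart.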

Worth noting: Claude was also mentioned 16 times without specifying a version, and GPT, 5 times. I didn't include those in the model ranking since I couldn't attribute them to a specific one, but they're counted in the provider ranking below.

Same data, grouped by provider

  1. Alibaba — 98 points, 37 mentions
  2. DeepSeek — 81 points, 27 mentions
  3. OpenAI — 78 points, 25 mentions
  4. MiniMax — 75 points, 21 mentions
  5. Anthropic — 72 points, 21 mentions
  6. Google — 68 points, 20 mentions
  7. Moonshot AI — 42 points, 14 mentions
  8. Z.ai — 40 points, 16 mentions
  9. xAI — 2 points, 1 mention
  10. Venice AI — 2 points, 1 mention
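The provider totals are the same scoring rule summed per provider. Here is a small sketch using only the Qwen and MiniMax rows from the model ranking, since those two providers can be rebuilt entirely from the rows shown. Other providers also include unversioned mentions, and possibly models outside the top 20, that aren't listed individually, so they can't be reproduced this way.

```python
from collections import defaultdict

# (provider, model, mentions, upvotes), copied from the model ranking.
ROWS = [
    ("Alibaba", "Qwen 3.6", 27, 50),
    ("Alibaba", "Qwen 3.5", 5, 7),
    ("Alibaba", "Qwen (other versions)", 5, 4),
    ("MiniMax", "Minimax 2.7", 21, 54),
]


def by_provider(rows):
    """Sum mentions and points (mentions + upvotes) per provider."""
    totals = defaultdict(lambda: {"points": 0, "mentions": 0})
    for provider, _model, mentions, upvotes in rows:
        totals[provider]["mentions"] += mentions
        totals[provider]["points"] += mentions + upvotes
    return dict(totals)


if __name__ == "__main__":
    for provider, total in by_provider(ROWS).items():
        print(provider, total["points"], "points,", total["mentions"], "mentions")
```

This gives 98 points and 37 mentions for Alibaba, and 75 points and 21 mentions for MiniMax, matching the table.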

On routing

I also looked at how many of you described a routing setup, meaning sending different requests to different models. Out of 109 people who answered, 36 (33%) explicitly described one. So roughly 1 in 3 of you felt the need to send different requests to different models.

Take this with a grain of salt, though: the 67% who mentioned a single model didn't necessarily say they don't route, they just didn't bring it up.

That's it. Posting this after about 16 hours of data, but answers are still coming in, so happy to post an update in a few days if there's interest.

So tell me, does anything in there surprise you?

u/stosssik — 15 days ago
▲ 119 r/ManifestforAI+3 crossposts

More and more of us are looking for a solid replacement for Anthropic. What are you using now?
The top 8 I'm seeing today talking with OpenClaw users:

  1. GPT-5.5
  2. MiniMax M2.7
  3. GLM 5.1
  4. Qwen3.6 Plus
  5. Gemini 3.1
  6. Kimi K2.6
  7. Nemotron 3 Ultra
  8. GPT-5.4-mini

What's working for you and what did you try that didn't?

u/stosssik — 16 days ago

▲ 2 r/ManifestforAI+1 crossposts

Yesterday Sam Altman posted that you can sign in to OpenClaw with your ChatGPT account and use your subscription there.

So you can run openclaw onboard, choose openai-codex and sign in with your ChatGPT account through OAuth. OpenClaw then uses your subscription to access Codex. Your Plus at $20/mo or Pro at $100/mo covers everything at a flat rate.

This goes in the opposite direction of what Anthropic has been doing. They've made it harder and harder to use Claude through OpenClaw over the past few months, between ToS updates and OAuth restrictions (their updated ToS says OAuth tokens are "intended exclusively for Claude Code and Claude.ai").

Looking at how well Codex has been received lately, I think most personal agent users are going to make the switch without looking back.

Where do you stand on this? Have you already moved to Codex? Are you thinking about it? If you switched, how does it compare to Claude so far?

u/stosssik — 17 days ago

DeepSeek V4 Pro and Flash are now available in Manifest Router.

You can route your agent's coding and reasoning requests to them while keeping cheaper models for simple tasks.

DeepSeek V4 scores well on coding benchmarks and costs a fraction of Opus 4.6-max or GPT-5.4.

You can also use it as a fallback. If your primary model goes down or hits rate limits, Manifest falls back to DeepSeek automatically. Setup takes a minute. Go to Manifest, connect DeepSeek as a provider with your API key, and assign it to the tiers you want.
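To illustrate the fallback idea in general terms, here is a minimal sketch of a provider chain that moves on to the next provider when one fails. This is not Manifest's actual code: `ProviderError`, `complete_with_fallback`, and both stub providers are hypothetical names invented for the example.

```python
from typing import Callable, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error standing in for an outage or a rate limit."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, completion)."""
    failures = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            failures.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(failures))


def primary_stub(prompt: str) -> str:
    # Simulates a primary model that is currently rate limited.
    raise ProviderError("429 rate limited")


def deepseek_stub(prompt: str) -> str:
    # Simulates a healthy fallback provider.
    return f"[deepseek] {prompt}"


if __name__ == "__main__":
    chain = [("primary", primary_stub), ("deepseek", deepseek_stub)]
    print(complete_with_fallback("Refactor this function", chain))
```

The order of the list is the whole policy here: the first provider that answers wins, and an error is only raised once every provider in the chain has failed.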

For those who don't know Manifest, it is a free and open-source LLM router that gives you full control over how your agent's requests get routed, reducing your inference costs by up to 70%. Try it here: https://github.com/mnfst/manifest

Enjoy! 🦚❤️🐋

u/stosssik — 19 days ago

If you're running models locally, you already know your setup handles simple tasks fine. Chat, summaries, classification, quick answers. No reason to send those to Opus and pay for it.

We just shipped llama.cpp and LM Studio as providers in Manifest. You connect your local server, assign it to the tiers you want, and Manifest sends the right requests there. For heavier tasks like reasoning or complex tool calling, you can route them to whatever cloud provider you prefer.
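As a rough illustration of the tier idea, here is a sketch that sends simple task types to a local server and everything else to a cloud provider. How Manifest actually classifies requests isn't described in this post, so the task labels, tier names, and provider names below are all assumptions made for the example.

```python
# Task types the post lists as fine for local models (labels are made up).
SIMPLE_TASKS = {"chat", "summary", "classification", "quick_answer"}

# Hypothetical tier-to-provider assignment, like the one set in a dashboard.
TIER_TO_PROVIDER = {
    "simple": "local-llama-cpp",
    "complex": "cloud-provider",
}


def pick_tier(task_type: str) -> str:
    """Simple tasks stay local; reasoning, tool calling, etc. go to the cloud."""
    return "simple" if task_type in SIMPLE_TASKS else "complex"


def route(task_type: str) -> str:
    """Return the provider assigned to the tier for this task type."""
    return TIER_TO_PROVIDER[pick_tier(task_type)]


if __name__ == "__main__":
    for task in ("summary", "reasoning", "tool_calling"):
        print(task, "->", route(task))
```

Unknown task types default to the cloud tier in this sketch, on the assumption that over-serving a request is safer than under-serving it.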

A lot of agent owners have been asking us to support these so they can handle simple tasks, coding with models like qwen3-coder, or recurring jobs locally, and keep cloud models as fallbacks or for the rest.

So we shipped it!

If you haven't heard of Manifest yet, it's a free and open-source LLM router that gives you full control over how your agent's requests get routed. We're on a mission to drastically cut inference costs.

Try it here: https://github.com/mnfst/manifest. And if you do, give us your honest feedback. We want to focus on what users need so your feedback means a lot to us.

u/stosssik — 19 days ago

If you're running LM Studio, you already know your local model handles simple tasks fine. Chat, summaries, classification, quick answers. No reason to send those to Opus and pay for it.

We just shipped LM Studio as a provider in Manifest. You connect your local server, assign it to the tiers you want, and Manifest sends the right requests there. For heavier tasks like reasoning or complex tool calling, you can route them to whatever cloud provider you prefer.

A lot of OpenClaw users have been asking us to support LM Studio so they can handle simple tasks, coding with models like qwen3-coder-next, or recurring jobs locally, and keep cloud models as fallbacks or for the rest.

So we shipped it!

For those of you who spent the last few weeks in a cave, 😜 Manifest is a free and open-source LLM router that gives you full control over how your agent's requests get routed.

Our mission is to drastically cut your inference costs!

Try it here: https://github.com/mnfst/manifest. And if you do, give us your honest feedback. We want to focus on what users need, so your feedback means a lot to us.

u/stosssik — 19 days ago

Hey! We added LM Studio support on the local version of Manifest. You can now route between multiple local models served by LM Studio, alongside Ollama.

Some of you have been asking for more local providers, so this is a step in that direction.

If you run into setup issues or have feedback, drop by our Discord: https://discord.gg/FepAked3W7

Enjoy!

u/stosssik — 20 days ago