[Help] Optimizing OpenClaw for a CPU-only VM (8 Cores/16GB RAM) - Ollama works, but OpenClaw times out.
Hi everyone! 🦞
I’m currently setting up OpenClaw on a VM (Ubuntu) and I’m hitting a bit of a wall with response times and timeouts. I’m hoping to get some recommendations on the best LLM or configuration for my specific hardware.
My Setup:
- Environment: Virtual Machine (VM) accessed via Tailscale.
- CPU: 8 cores.
- RAM: 16GB.
- GPU: None (Pure CPU inference).
- Model Provider: Ollama (local).
- Primary Channel: Telegram.
The Issue: When I run a 7B parameter model (like Qwen 2.5 or Mistral) directly through the Ollama CLI (ollama run), it actually performs quite well—it’s fast enough for my needs. However, as soon as I bridge it through OpenClaw, everything slows down or stops.
I often get stuck in "conjuring" or "moseying" states in the TUI, and the Telegram bot usually times out before receiving the first token. I've tried dropping down to 1.5B models, but I'm still seeing "unknown model" errors or long delays that I don't get in standalone Ollama.
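For context, here's roughly how I've been sanity-checking the Ollama side before blaming OpenClaw (assuming the default API port 11434; the model tags are just examples from my setup):

```shell
# 1) Confirm the daemon is up and which model tags it actually knows about.
#    A tag mismatch (e.g. "qwen2.5:1.5b" configured vs "qwen2.5:1.5b-instruct"
#    pulled) seems like a plausible cause of the "unknown model" errors.
models=$(curl -s --max-time 5 http://localhost:11434/api/tags || true)
if [ -n "$models" ]; then
  echo "registered models: $models"
  reachable=yes
else
  echo "Ollama API not reachable on :11434"
  reachable=no
fi

# 2) With the daemon up, raw CPU throughput numbers come from:
#      ollama run qwen2.5:7b --verbose
#    which prints prompt-eval and generation tokens/sec after each reply.
```

Standalone, those numbers look fine for me; it's only through OpenClaw that things stall.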
What I'm looking for:
- Model Recommendations: Which model (3B, 7B, or others) is the "sweet spot" for 8 CPU cores through OpenClaw?
- Config Tweaks: Are there specific requestTimeout or contextWindow settings you'd recommend for CPU-only setups to prevent OpenClaw from giving up on the model?
- IronClaw vs. OpenClaw: Given my hardware, should I be looking at the IronClaw version for better performance?
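In case it helps anyone suggest concrete values: below is a guess at what I imagine the relevant tweak would look like. The key names (requestTimeout, contextWindow) are just the ones from my question above, and the model tag is an example from my machine; I don't actually know OpenClaw's config schema, so please treat this as illustrative rather than a working snippet:

```json
{
  "model": "ollama/qwen2.5:7b",
  "requestTimeout": 600000,
  "contextWindow": 4096
}
```

My thinking is that CPU inference can take minutes to produce the first token on a long prompt, so a generous timeout plus a small context window might be the combination that keeps OpenClaw from bailing early.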
Note: I am strictly looking for a local-only solution. I don’t want to use Gemini, Groq, or other cloud APIs because the rate limits on free tiers are a dealbreaker for me, and I’m not looking to pay for a subscription right now.
Any advice on how to make OpenClaw "patient" enough for CPU inference or which lightweight models handle agents/tools better would be greatly appreciated!
Thanks in advance!