u/DeutscheGent

So I’ll be the first to admit that I have a bit of an unusual setup. I’m running a DGX Spark (GB10) with llama.cpp, vLLM, or Ollama (depends on the day and how badly I get stuck debugging vLLM).

At first I ran OpenClaw on the DGX, but after waking up a couple of times to borked systems, I decided to move both it and Hermes over to my Unraid box, which is a slightly older but decent Xeon E-2186G. To keep process isolation in place, each VM has 6 cores and 16GB of RAM. I’ve tested both pinning the VMs to a set of cores and letting them have free rein. I’ve also tested in Docker.
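For anyone who wants to reproduce the pinned variant: on Unraid, pinning a KVM guest comes down to a `<cputune>` block in the libvirt domain XML. A sketch for a 6-vCPU guest (the host `cpuset` numbers here are illustrative; match them to your actual E-2186G core/thread topology):

```xml
<!-- Illustrative pinning for a 6-vCPU guest; host CPU numbers are examples -->
<vcpu placement='static'>6</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <vcpupin vcpu='2' cpuset='4'/>
  <vcpupin vcpu='3' cpuset='5'/>
  <vcpupin vcpu='4' cpuset='6'/>
  <vcpupin vcpu='5' cpuset='7'/>
</cputune>
```

Unraid exposes the same thing through the VM settings GUI (the CPU pinning checkboxes), so you don't have to hand-edit the XML.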

Hermes runs great… I can see it immediately send the query over to the DGX and watch the GPU start to get hammered, which is what I would expect. Hermes plays nice whether it is in a dedicated VM or in Docker on Unraid; both perform nearly identically.

When I do the exact same thing in OpenClaw, I see an immediate CPU spike on the VM for anywhere between 20-40 seconds before it ever sends the query to the DGX. I’m running a stripped-down install to keep it lean, I’ve played with a number of models, and the heavy CPU utilization is the same. I’ve had both Hermes and OpenClaw evaluate the installation, etc., and both say things look great and everything is healthy. I’ve also tried running in Docker, but at least on Unraid it is a bit of a mess and not functional, since the template maintainer isn’t keeping the template current. Hence the VM approach.
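One way to get a hard number on that pre-dispatch delay, instead of eyeballing CPU graphs, is to put a throwaway logging proxy between OpenClaw and the DGX and timestamp the moment the first byte actually leaves the VM. This is a stdlib-only sketch of my own (the `start_proxy` name and the addresses are mine, not anything from OpenClaw's tooling); point OpenClaw's API base URL at the proxy's listen address:

```python
import socket
import threading
import time

def start_proxy(listen_addr, upstream_addr):
    """Forward TCP traffic both ways, printing a timestamp when the
    first byte in each direction is forwarded. Returns the bound address."""
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(listen_addr)
    srv.listen(5)

    def pump(src, dst, note):
        first = True
        try:
            while True:
                data = src.recv(65536)
                if not data:
                    break
                if first:
                    # This line is the measurement: when the request really left
                    print(f"{time.strftime('%H:%M:%S')} first byte {note}")
                    first = False
                dst.sendall(data)
        finally:
            try:
                dst.shutdown(socket.SHUT_WR)
            except OSError:
                pass

    def serve():
        while True:
            client, _ = srv.accept()
            up = socket.create_connection(upstream_addr)
            threading.Thread(target=pump, args=(client, up, "client -> DGX"),
                             daemon=True).start()
            threading.Thread(target=pump, args=(up, client, "DGX -> client"),
                             daemon=True).start()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()

if __name__ == "__main__":
    # Illustrative addresses: listen locally, forward to the DGX endpoint
    start_proxy(("0.0.0.0", 18000), ("192.168.1.50", 8000))
```

The gap between when you hit enter in OpenClaw and the "first byte client -> DGX" line is the time it spends churning CPU before dispatch, which would separate "slow to send" from "slow network/DGX".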

So the question to this group… is this expected behavior for OpenClaw relative to Hermes? I’m relatively new to this and don’t have much of a baseline. I first installed back on 4-21, directly on the DGX, before recently moving it. Is this something happening only in newer builds?

Any insights anyone can provide would be helpful.

reddit.com
u/DeutscheGent — 12 days ago