u/I-will-allow-it

r/LocalLLM

Strix Halo plus R9700 eGPU, Fedora 44. Best of both worlds.

I recently connected an R9700 to my Strix Halo box. On Fedora 44 it was very easy. The iGPU renders the desktop, which keeps the R9700's VRAM free for models. I run llama.cpp in a toolbox container on the iGPU and use `HIP_VISIBLE_DEVICES` to target the right GPU. The R9700 feels lightning fast. Speed does fluctuate, but with Qwen3.6 35B at Q4_K_M I'm getting PP 2100 and TG 87 t/s.
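For anyone who wants to try it, here's roughly how the GPU pinning works. Device indices, model filenames, and ports below are just examples, check `rocminfo` for your own ordering:

```shell
# See which ROCm device index is the iGPU vs. the R9700
rocminfo | grep "Marketing Name"

# Pin one llama-server instance to the dGPU (index 1 here is an example)
HIP_VISIBLE_DEVICES=1 llama-server -m qwen-35b-q4_k_m.gguf -ngl 99 --port 8080

# And a second instance to the iGPU for the bigger, slower model
HIP_VISIBLE_DEVICES=0 llama-server -m planner-27b-q4_k_m.gguf -ngl 99 --port 8081
```

Because each process only sees the device you expose to it, the two servers never fight over VRAM.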

One possible use: run a big, slow 27B on the iGPU to create plans and perform reviews, and have the fast R9700 execute those plans. You could assign different agents to separate GPUs and run them concurrently without any slowdown. And if you need someone to talk to, you can still load a chat model on the NPU to keep you company while your agents work.
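Here's a minimal sketch of that planner/executor split, assuming both llama-server instances expose their usual OpenAI-compatible API. The ports and roles are made up for illustration, adjust to your setup:

```python
import json
import urllib.request

# Hypothetical endpoints: slow planner model on the iGPU, fast executor on the R9700
ENDPOINTS = {
    "planner": "http://localhost:8081/v1/chat/completions",   # 27B on iGPU
    "executor": "http://localhost:8080/v1/chat/completions",  # 35B-class on dGPU
}

def build_request(role: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the given agent role."""
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }).encode()
    return urllib.request.Request(
        ENDPOINTS[role],
        data=body,
        headers={"Content-Type": "application/json"},
    )

def run_agent(role: str, prompt: str) -> str:
    """Send the prompt to the GPU serving that role and return the reply text."""
    with urllib.request.urlopen(build_request(role, prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

With something like this, `run_agent("planner", ...)` and `run_agent("executor", ...)` hit different GPUs, so a plan/review loop and an execution loop can run side by side.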

As far as I know there isn't much software that takes advantage of this setup yet, but I'll start with Open-Notebook and see what else I can find. Send me any ideas you have for software or workflows.
