u/KempynckXPS13

Could you review my LocalLLM project plan?

I put together this plan for what I think would cover my local-LLM needs. Basically, I want to achieve this goal:

Build an always-on, desk-resident machine that:

  • Runs a 30B-class dense LLM (Qwen3.6 27B MoE) locally, fully offline, for agentic tasks, and runs it smoothly: decently high throughput (>20 t/s) and low TTFT (<5 min at 50K context)
  • Is accessible from a Windows laptop over SSH and via a REST API from anywhere (on the local network at home, or while travelling) through Tailscale
  • Doubles as a file server: stores documents and makes them available both to the agent and to Windows File Explorer as a mapped network drive
  • Stays around €2,000-3,000 in total cost
  • Lets me hand off an agentic task through a Pi/OpenCode agent harness and pings me on Slack when the task is completed
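The REST-API-plus-Slack-ping flow above could be sketched roughly like this. Everything here is a placeholder for illustration: the Tailscale hostname, port, model name, and webhook URL are mine, and it assumes an OpenAI-compatible server (e.g. llama.cpp's `llama-server` or Ollama) on the box plus a Slack incoming webhook:

```python
import json
import urllib.request

# Placeholder endpoints -- adjust to your own tailnet and Slack workspace.
LLM_URL = "http://desk-box.tailnet-name.ts.net:8080/v1/chat/completions"
SLACK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"


def build_chat_request(prompt, model="qwen", max_tokens=512):
    """Payload for an OpenAI-compatible /v1/chat/completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def build_slack_ping(task, status):
    """Payload for a Slack incoming webhook."""
    return {"text": f"Task '{task}' finished with status: {status}"}


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example flow (commented out so nothing is hit accidentally):
#   answer = post_json(LLM_URL, build_chat_request("Summarise the plan"))
#   post_json(SLACK_URL, build_slack_ping("summarise plan", "done"))
```

Since the API is plain HTTP inside the tailnet, the same two calls work identically from the laptop at home or while travelling.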

The main concerns I have with this setup:

  • How mature is ROCm for GPU compute in LLM workloads? AMD's focus has always been on gaming rather than on the LLM community.
  • This machine was released in early 2025, which is quite a while ago. Is anyone aware of new releases planned for the near future that might be worth waiting for?

Machine: https://www.gmktec.com/products/amd-ryzen%E2%84%A2-ai-max-395-evo-x2-ai-mini-pc?variant=6f7af17b-b907-4a9d-9c7e-afecfb41ed98

What are your thoughts on this set-up?

[Diagram: visualization of my LocalLLM project plan]

u/KempynckXPS13 — 6 days ago