u/giuliastro — reddlx

Hello,

I am a new user and would like to configure this workstation as well as possible for local inference.

I am trying LM Studio with Qwen3.6 27B (Q4), but it doesn't go beyond 11 tokens/s.

I tried both Vulkan and ROCm, lowering the VRAM allocation in the BIOS or setting it to the maximum, disabling reasoning, and adjusting the context window and other parameters, but I can't get past 11-12 tokens per second, which is unusable for coding.

I know I should use Linux, but my question is: are these speeds normal for this machine? I really expected it to be faster for local inference.

Thank you in advance.