u/CryptographerTop4354 — reddlx

RX 6700 XT 12GB and 32GB RAM here. I get around 28 tps in LM Studio with Qwen 3.6 35B-A3B and 3-4 tps with Qwen 3.5 27B... hopefully DFlash and TurboQuant get added to llama.cpp soon.

Also, if you guys know any other ways to run models faster on Windows, please tell me. Kobold is trash, and the likelovewant ROCm libraries have only been useful for ComfyUI and other setups, not for LM Studio etc., at least not on Windows.
And I can't afford to switch my whole setup to Ubuntu just for some AI stuff yet.