u/Askmasr_mod

specs :

core i5 14400F

32gb ram d4 3200mhz

rtx 4060

current speeds

30tps in output

500 tps in prefill

command i currently use

.\llama-server.exe `

>> -m "H:\model\unsloth\Qwen3.6-35B-A3B-GGUF\Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf" `

>> --host 0.0.0.0 --port 8080 `

>> --alias "claude-sonnet-4-5" `

>> -ngl 999 `

>> --n-cpu-moe 36 `

>> -c 65535 `

>> -b 4096 `

>> -ub 2048 `

>> -t 6 `

>> -tb 10 `

>> --cont-batching `

>> --mlock `

>> -ctk turbo4 -ctv turbo3 `

>> -fa on `

>> --jinja `

>> --warmup `

>> --perf `

current usage

https://preview.redd.it/pnrdj1otqszg1.png?width=1920&format=png&auto=webp&s=3e7c25d96c1286f12ca328bb0da7b967316d312e

reddit.com
u/Askmasr_mod — 6 days ago