A working model name and compose config would be much appreciated. Also, what numbers are you getting out of it? I ran lukealonso/GLM-5.1-NVFP4 a few days ago, but it only did 1k PP and 33 tps gen. I tried newer Docker images, but they would just start using 100% of the GPUs on their own after startup and wouldn't quit. I saw people posting tps in the ~100 range.
Hardware: RTX 6000 Pros
Thank you in advance!
u/val_in_tech — 6 days ago