u/-elmuz-

▲ 19

vLLM on Arc B70

Does anyone have that card? I am interested given its price and available memory. I am aware that its speed wouldn't be comparable to the Nvidia competitor (the cheapest 32GB card should be the RTX PRO 4500, at roughly 3 times the price).

If anyone has it, can you share some benchmarks? Which quantization dtypes are supported by that card? What's the experience in general in terms of features? Is everything so experimental that chances are high that things won't work?

reddit.com
u/-elmuz- — 5 days ago
▲ 13

Hey, in order to double my VRAM capacity I am considering two options: buying a single new GPU with twice the VRAM, or buying another GPU identical to my current one and leveraging tensor parallelism (TP) or pipeline parallelism (PP).

Let's focus on TP/PP. I am wondering how much PCIe speed penalizes overall speed. Can anyone provide a rule of thumb, or point me to a trusted benchmark that shows, for example, the throughput in different configurations? E.g.:

  • Single GPU (I guess PCIe generation/speed does not matter much here)
  • PP on 2 GPUs (I guess PCIe generation/speed does not matter much here either)
  • TP PCIe 5.0 16x/16x
  • TP PCIe 5.0 8x/8x (I guess this should be equivalent to PCIe 4.0 16x/16x)
  • TP PCIe 4.0 8x/8x
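
For reference, the 8x/16x equivalence guessed at above can be checked with a quick back-of-the-envelope calculation. This is only a sketch of theoretical one-direction link bandwidth, assuming the per-lane transfer rates of PCIe 3.0/4.0/5.0 (8/16/32 GT/s) and 128b/130b encoding; the function and variable names are made up for illustration. It says nothing about actual inference throughput, which is exactly what I am asking about.

```python
# Theoretical one-direction PCIe bandwidth per slot.
# Sketch only: ignores protocol overhead beyond line encoding, and is not a benchmark.

# Per-lane raw transfer rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# PCIe 3.0 and later use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Return theoretical bandwidth in GB/s for one direction of the link."""
    bits_per_s = GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes  # Gbit/s
    return bits_per_s / 8  # convert Gbit/s to GB/s


# The per-GPU link widths from the list above.
configs = [
    ("PCIe 5.0 16x", 5, 16),
    ("PCIe 5.0 8x", 5, 8),
    ("PCIe 4.0 16x", 4, 16),
    ("PCIe 4.0 8x", 4, 8),
]

for name, gen, lanes in configs:
    print(f"{name}: {pcie_gb_per_s(gen, lanes):.1f} GB/s per direction")
```

On paper, PCIe 5.0 8x and PCIe 4.0 16x do come out the same, and each step down the list halves the link bandwidth. What I can't tell from this is how much of that halving shows up in TP throughput.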

Any feedback/real experience would be appreciated. I could share my specific alternatives, but I am more interested in general numbers.

u/-elmuz- — 13 days ago