u/ICanSeeYou7867

I work for a company where cloud services of any kind are very hard to approve. We also are not allowed to run Chinese models.

I have a GPU server with 4x H100 GPUs that I'm running as a kubernetes node. I gleefully began converting some of my other models to nvfp4 to save VRAM and make way for allocating 2x H100 for this 128GB dense model... until I read the license...
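For context, the 2x H100 allocation is simple back-of-envelope math. A rough sketch (my assumptions: 80 GB of VRAM per H100, and ~15% extra for KV cache, activations, and runtime buffers):

```python
def fits(model_gb: float, num_gpus: int, gpu_gb: float = 80.0,
         overhead: float = 0.15) -> bool:
    """Return True if the model weights plus an assumed runtime
    overhead fit in the combined GPU memory."""
    return model_gb * (1 + overhead) <= num_gpus * gpu_gb

print(fits(128, 2))  # 128 GB * 1.15 = 147.2 GB vs 160 GB -> True
print(fits(128, 1))  # 147.2 GB vs 80 GB -> False
```

So the 128GB model squeezes onto two cards with a little headroom left for context, which is why freeing those two GPUs was worth the quantization pass on everything else.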

So it seems this is a publicity stunt. This model can only be run by businesses that make <$20M per month in revenue. A very simplified breakdown:

- Individuals... unified RAM systems are great, and those ~100B-parameter MoE models shine there. But a 128GB dense model is going to be slow...

- Small companies probably don't have a large IT group, and cloud offerings look very attractive. The heat, power requirements, etc. probably mean there won't be a ton of these companies running this model.

- Large companies - can't run it.

So, unfortunately, I don't see a lot of people running this model.

EDIT: For those of you saying a big company should pay, and that it's fair, I don't disagree with you. But these models turn over monthly. I would think that most companies would opt for cloud pay-as-you-go pricing at that point rather than go through the process of budgeting, getting approval, and issuing purchase orders just to run a model locally on an annual or monthly bill.

Let me know if you are a big company that would be going through this process to use it locally instead of the cloud.
