u/Current-Interest-369

Price model could have been like early days dial-up-modem - e.g. PRU per /10 minute runtime.

In relation to the latest shock where everyone is realizing/seeing whatever price in range the future might hold, I have as well been wondering about the price model.

One of the main of problems with the pricing model in AI is the lack of transparency for the enduser. There is always the discussion about usage allowance by Anthropic and OpenAI, because it shifts over months/weeks/days - not including the random resets / double allowance / etc.

The industry has come up with the token consumption as the primary, and while “a token” is a hard truth it is still quite unpredictable for the end user if the next phase of work is going to be 250.000 tokens or 1.000.000 tokens.
Each new model has a different token-behavior (tokenization is not uniform).

That got me thinking about the early days of internet connections. The dial up internet was charged by the minute and the speed was somewhat predictable. If a website was slow - you avoided it and navigated towards something which faster. The price was predictable because you paid in per something which was transparent to the enduser.

Transparency in pricing means adoption becomes better, and the reason MS choose the Premium Request Units(PRU), must initially have been due to promoting adoption.

The thought is they could have kept the PRU, but set a time limit on each PRU consumed - with possible discount according to current usage loads on the network.

If a user set it to auto and left - the PRUs would run out after xx hours.

This would nudge people to optimize, but in a way - which most people could comprehend.

It would as well let GH/MS have a very specific estimation of how much capacity theur current subscription base actually required to support.

Just my 2 cents - The new 100% token based pricing model, will at least force most people away from the Github Copilot Agent solution.

reddit.com