u/FilterJoe

How do you deal with gemma 4 31b quality variance by provider?

CONTEXT:

Fairly new to Hermes, though I fiddled some with llama.cpp and attempting to make to make python do stuff with my local model 1-2 years ago on my Apple Mac Mini M2 Pro 16gb. I'm running Hermes on VMware Debian VM on my Mac.

I plan to get a more capable Mac later this year but my mini obviously isn't enough to run Qwen 27b or Gemma 4 31b locally so I'm using openrouter to start, and will only be experimenting with models capable of running with ample context on a 128GB RAM Apple Studio (with M5, when it comes out).

openrouter currently has gemma 4 31b for free, provided by Google AI studio, though it is severely rate limited. I put in a token I got from playing around with AI studio a year ago and it was working terrifically. Then my token abruptly quit working.

Didn't realize that AI Studio no longer allows free use of the token unless you are a professional developer. I'm not. I guess they algorithmically detected non-developer behavior. So I deleted the token, bought a small number of credits, and set Gemma 4 31b paid as a backup to the free.

Now routing me to DeepInfra provider for Gemma 4 31b calls.

THE ISSUE:

Gemma 4 31b quality on openrouter takes a big hit when not served by AI Studio. Hermes now making lots of mistakes, sometimes telling me it did something when it didn't, and using more calls to get the job done. If this had been my first experience with Hermes, I would have very quickly given up using Gemma 4 31b. And . . . each request I make of Hermes seems to eat up 2-3 cents. It was so much better before when served by Google AI Studio.

openrouter claims 8k quant for DeepInfra on Gemma 4 31b. Really? Or maybe it's correct, but it's not configured quite right at DeepInfra.

Have any of you run into similar issues? If so:

Do you try other openrouter providers? If so, how do you specify that in Hermes configuration?

Has anyone found specific Gemma 4 31b provider that is equal to Google AI Studio?

Do you have any other suggestions?

reddit.com
u/FilterJoe — 5 days ago