u/Tastetrykker

Hi! Just checking, am I the only one who has serious issues with Gemma 4 locally?

I've played around with Gemma 4 using Unsloth quants on llama.cpp, and it's seriously broken. I'm using the latest changes from llama.cpp, along with the reccomended temperature, top-p and top-k.

Giving it an article and asking it to list all typos along with the corrected version gives total nonsense. Here is a random news article I tested it with: https://www.bbc.com/news/articles/ce843ge47z4o

I've tried the 26B MoE, I've tried the 31B, and I've tried UD-Q8_K_XL, Q8_0, and UD-Q4_K_XL. They all have the same issue.

As a control, I tested the same thing in Google AI Studio, and there the models work great, finding actual typos instead of the nonsense I get locally.

Gemma 4 is seriously broken when using Unsloth and llama.cpp