u/AnticitizenPrime

🔥 Hot ▲ 70 r/Knoxville

20+ KPD motorcycles rolling through 4th and Gill and Old North and ignoring stop signs.

I was out walking my dog on a beautiful day when I heard a rumble on Luttrell Street, and 20+ KPD officers came screaming through on motorcycles in some sort of formation.

Not a problem on its own, except I watched them blow through every stop sign without slowing down or even looking for oncoming traffic.

They came back the other way about ten minutes later doing the same thing. Just blew through all the stop signs without any sort of checking or warning or whatever.

I dunno if this is just some joyful spring ride or whatever, but I have a keen memory of being ticketed by a KPD officer for doing a 'rolling stop' at a stop sign 20 years ago (wherein I didn't come to a complete stop at a sign in a certain neighborhood which may have been Sequoyah Hills).

I don't know why these policemen were doing what they were doing, but I live, park and drive in this neighborhood, and there's a non-zero chance that I could have crashed into one of these chucklefucks. Can they at least obey the laws they enforce?

reddit.com
u/AnticitizenPrime — 1 day ago
🔥 Hot ▲ 263 r/LocalLLaMA

Gemma 4 is efficient with thinking tokens, but it will also happily reason for 10+ minutes if you prompt it to do so.

Tested both the 26B and 31B in AI Studio.

The task I asked of them was to crack a cypher. The top closed source models can crack this cypher at max thinking parameters, and Kimi 2.5 Thinking and DeepSeek 3.2 are the only open source models to crack the cypher without tool use. (Of course, with the closed models you can't rule out 'secret' tool use on the backend.)

When I first asked these models to crack the cypher, they thought for a short amount of time and then both hallucinated false 'translations' of the cypher.

I added this to my prompt:

>Spare no effort to solve this, the stakes are high. Increase your thinking length to maximum in order to solve it. Double check and verify your results to rule out hallucination of an incorrect response.

I did not expect dramatic results (we all laugh at prompting a model to 'make no mistakes' after all). But I was surprised at the result.

The 26B MoE model reasoned for ten minutes before erroring out (I am supposing AI Studio cuts off responses after ten minutes).

The 31B dense model reasoned for just under ten minutes (594 seconds in fact) before throwing in the towel and admitting it couldn't crack it. But most importantly, it did not hallucinate a false answer, which is a 'win' IMO. Part of its reply:

>The message likely follows a directive or a set of coordinates, but without the key to resolve the "BB" and "QQ" anomalies, any further translation would be a hallucination.

I honestly didn't expect these (relatively) small models to actually crack the cypher without tool use (well, I hoped, a little). It was mostly a test to see how they'd perform.

I'm surprised to report that:

  • they can and will do very long form reasoning like Qwen, but only if asked, which is how I prefer things (Qwen tends to overthink by default, and you have to prompt it in the opposite direction). Some models (GPT, Gemini, Claude) allow you to set thinking levels/budgets/effort/whatever via parameters, but with Gemma it seems you can simply ask.

  • it's maybe possible to reduce hallucination via prompting - more testing required here.
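Since Gemma apparently has no thinking-budget parameter, the only lever is the prompt itself. As a rough sketch (the helper name and structure are mine, not from any SDK or Gemma documentation), appending the effort-boosting suffix programmatically could look like:

```python
# Sketch only: for models that expose no thinking-budget parameter,
# the prompt itself is the lever. This helper appends the same
# effort-boosting suffix quoted above to any task prompt.
# with_max_effort() is a hypothetical name, not part of any real API.

EFFORT_SUFFIX = (
    "\n\nSpare no effort to solve this, the stakes are high. "
    "Increase your thinking length to maximum in order to solve it. "
    "Double check and verify your results to rule out hallucination "
    "of an incorrect response."
)

def with_max_effort(prompt: str) -> str:
    """Append the reasoning-boost instruction to a prompt."""
    return prompt + EFFORT_SUFFIX

if __name__ == "__main__":
    # The boosted prompt is then sent to the model as usual.
    print(with_max_effort("Crack this cypher: ..."))
```

The point is just that the "budget" lives in plain text rather than an API parameter, so the same suffix works unchanged across any frontend or local runner.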

I'll be testing the smaller models locally once the dust clears and the inevitable new release bugs are ironed out.

I'd love to know what sort of prompt these models are given on official benchmarks. Right now Gemma 4 is a little behind Qwen 3.5 in benchmarks (comparing similarly sized models), but could it catch up to or surpass Qwen when prompted to reason longer (like Qwen does)? If so, then that's a big win.

reddit.com
u/AnticitizenPrime — 1 day ago