r/unsloth


Gemma 4 E4B (4-bit) executes Bash code and tool calls locally on 6GB RAM.

Hey guys, just wanted to share another cool use case of Gemma 4 E4B (4-bit GGUF) to showcase how powerful it is.

It completed a full repo audit by executing Bash code and tool calls locally, running on just 6GB RAM. It inspected files, walked git history, cross-checked metrics, and surfaced evidence-backed candidates.

Try it via Unsloth Studio for self-healing tool calling: https://github.com/unslothai/unsloth

Gemma 4 guide: https://unsloth.ai/docs/models/gemma-4

Let us know if you run into any issues with the model, by the way. I know some of you had tokenizer issues, which were fixed in llama.cpp, so we're reuploading. Some of you also saw gibberish output, but we're not yet sure where that's coming from.

u/yoracale — 8 hours ago
Unsloth Studio Gemma-4 update - faster precompiled binaries

We just updated Unsloth Studio!

  1. Pre-compiled llama.cpp binaries, including the two Gemma-4 fixes below
  2. Pre-compiled binaries for Windows, Linux, Mac, and WSL devices - CPU and GPU
  3. Gemma-4 31B and 2B re-converted - the rest are in progress
  4. More robust tool calling
  5. Speculative decoding added for non-vision models (sadly Gemma-4 and Qwen3.5 are vision models)

To update:

macOS, Linux, WSL:

curl -fsSL https://unsloth.ai/install.sh | sh

Windows:

irm https://unsloth.ai/install.ps1 | iex

To launch:

unsloth studio -H 0.0.0.0 -p 8888

u/danielhanchen — 6 hours ago

Llama.cpp fails to update when updating Unsloth Studio

Hi there!

Yesterday I downloaded Unsloth Studio for the first time on my Windows PC to try out Gemma 4!
I ran into trouble running the model at first, as apparently I was a bit early to the party and llama.cpp hadn't implemented support for it yet.
An hour later, they released a new version with support for Gemma 4, so I updated Unsloth Studio, only to find that it didn't update llama.cpp.

Turns out I had to manually remove llama.cpp from my unsloth folder so it would build a new one from scratch the next time I updated.

After that everything worked fine, but I heard today that llama.cpp has shipped some improvements to its Gemma 4 support, so I wanted to update again. Once again, running 'update unsloth studio' in PowerShell does not update llama.cpp.

I am getting this error though:

> Failed to resolve a published llama.cpp release via ggml-org/llama.cpp
> [llama-prebuilt] fatal helper error: HTTP Error 422: Unprocessable Entity
> Resolved llama.cpp release tag: b8660
> installing prebuilt llama.cpp bundle (preferred path)...
> Existing llama.cpp install detected -- validating staged prebuilt update before replacement
> Skipping prebuilt install because prebuilt tag resolution failed -- falling back to source build
> OpenSSL dev found at C:\Program Files\OpenSSL-Win64

It seems my Unsloth install is unable to download llama.cpp builds? How is that happening?

If somebody could help me out with this I'd really appreciate it.

Thanks!!

u/FoxTrotte — 5 hours ago

Unsloth Studio Radeon 5700 XT

I have an older 5700 XT card with 8 GB of VRAM, based on AMD's RDNA 1 architecture.

I read in a separate post on this subreddit that you're working with AMD for Unsloth Studio support.

I have a feeling you won't be targeting my GPU, but do you have any idea whether I'll be able to get it working?

I'm a student who wants to experiment with Local AI training.

Thank you!

u/Sure-Two-5672 — 4 hours ago

personal tool calls in unsloth studio

Hi, is there already a way (or will it be possible in the future) to create and upload personal tools? For example, adding a weather API tool?
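Not an official answer, but many local stacks (llama.cpp's server included) accept OpenAI-style function-calling schemas, so if Unsloth Studio supports custom tools it may look something like this. Everything below is an illustrative sketch, not documented Unsloth Studio API; the field names follow the common function-calling convention and the weather lookup is a toy stand-in:

```python
# Hypothetical sketch: an OpenAI-style function-calling tool definition
# for a weather tool. Whether Unsloth Studio accepts custom tools in
# this format is exactly the open question in this post.

def get_weather(city: str, units: str = "celsius") -> dict:
    """Toy local implementation; a real tool would hit a weather API."""
    fake_db = {"Berlin": 6, "Lisbon": 15}  # stand-in data
    return {"city": city, "temperature": fake_db.get(city, 10), "units": units}

# Declarative schema the model sees, describing when/how to call the tool.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

# When the model emits a tool call, the runtime dispatches it by name:
result = {"get_weather": get_weather}["get_weather"](city="Berlin")
print(result)  # {'city': 'Berlin', 'temperature': 6, 'units': 'celsius'}
```

The dispatch-by-name step at the end is the part a host application would own: parse the model's tool-call message, look up the matching Python function, and feed the returned dict back into the conversation.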

thanks

u/Fun_Librarian_7699 — 9 hours ago

Reasoning-focused models and tools worth trying when you need verifiable accuracy, not just fluent output

I've spent the last few months fine-tuning smaller models for a financial compliance project where getting things wrong has actual regulatory consequences. The standard approach of throwing GPT 5 or Sonnet 4.6 at a complex multi-step problem and hoping the output is correct just doesn't cut it when you're dealing with audit trails and chain of custody for reasoning.

I wanted to share a few tools and approaches I've been evaluating for tasks where factual correctness and step-by-step verification matter more than response speed or conversational polish. This is specifically for people working on research, legal, finance, or engineering problems where you need to trace why the model arrived at an answer, not just get a plausible-sounding one.

Before diving in, here's how I'd map these five approaches on the two axes that actually matter for high-stakes work (how deep the verification goes, and how much engineering effort you need to get there):

  Engineering
  Effort  ▲
          │
    High  │   ④ Custom RAG
          │      + citation verify
          │
    Med   │   ① Qwen 3.5             ② MiroMind
          │      + Unsloth               (DAG verification
          │      (fine-tune)              built in)
          │
    Low   │   ⑤ GLM 4.6              ③ Kimi K2
          │      (multilingual)          (ext. thinking)
          │
          └──────────────────────────────────────▶
              Shallow                  Deep
                      Verification Depth

Here's what I've been testing:

  1. Fine-tuned Qwen 3.5 (via Unsloth) — For domain-specific reasoning, nothing beats having a model trained on your own data. I've been using Unsloth to fine-tune Qwen 3.5 27B for regulatory document analysis and the results are solid, especially for structured extraction tasks. The 2x speedup and lower VRAM requirements make iteration much faster. If your accuracy problem is domain specificity, this is the move.
  2. MiroMind (MiroThinker) — This one is interesting and quite different from the usual suspects. It's a 235B parameter model built around what they call DAG reasoning: instead of a linear chain of thought, the system branches into parallel reasoning paths, verifies each step, and can roll back to a verified state if something breaks. The whole architecture is verification-centric rather than fluency-optimized. I've been testing it on multi-step financial forecasting queries and the reasoning traces are genuinely useful for audit purposes. The free tier gives you 100 credits per day; Pro is $19/month. Worth noting their benchmarks come from their own published materials, so take the specific numbers with appropriate skepticism until independent evaluations catch up.
  3. Kimi K2 with extended thinking — Decent for long-context research synthesis. The context window is generous and the reasoning mode produces better-structured outputs than the base model. It falls short on tasks requiring genuine multi-step verification, though.
  4. Custom RAG pipeline with citation verification — For anyone doing deep research: build a retrieval pipeline that forces the model to cite sources, then programmatically verify that those citations exist and say what the model claims they say. More engineering effort, but the accuracy improvement is dramatic.
  5. GLM 4.6 for multilingual reasoning — If you're working across languages (especially CJK), GLM 4.6 handles cross-lingual reasoning tasks better than most alternatives I've tested.
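The citation check in approach 4 is mechanical enough to sketch. This is a toy illustration of the idea only (the citation syntax, corpus, and verbatim-substring matching rule are my own stand-ins, not any particular RAG framework): after the model answers with citations, confirm each cited source was actually retrieved and actually contains the claimed quote before trusting the answer.

```python
import re

# Toy corpus standing in for a real retrieval index.
SOURCES = {
    "10-K-2023": "Net revenue increased 12% year over year to $4.1B.",
    "audit-memo-7": "Chain of custody was maintained for all reasoning logs.",
}

def verify_citations(answer: str) -> list[tuple[str, bool]]:
    """Check every [source-id: "quote"] citation in a model answer.

    A citation passes only if the source id exists in the retrieved
    corpus AND the quoted text appears verbatim in that document.
    """
    results = []
    for source_id, quote in re.findall(r'\[([\w-]+):\s*"([^"]+)"\]', answer):
        doc = SOURCES.get(source_id)
        results.append((source_id, doc is not None and quote in doc))
    return results

answer = ('Revenue grew 12% [10-K-2023: "Net revenue increased 12%"] '
          'but margins fell 40% [10-K-2022: "margins fell 40%"].')
print(verify_citations(answer))
# [('10-K-2023', True), ('10-K-2022', False)]
```

The first citation checks out; the second names a source that was never retrieved, which is exactly the kind of confabulated reference this layer is meant to catch. In a production pipeline the verbatim match would usually be relaxed to fuzzy or embedding-based matching, at the cost of weaker guarantees.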

The broader point: for high-stakes work, the question isn't "which model is smartest" but "which system lets me verify the reasoning chain and catch errors before they become expensive." Fine-tuning with Unsloth gives you domain control, dedicated reasoning systems give you verification infrastructure, and custom pipelines give you citation accountability.

Curious what setups others here are running for tasks where accuracy is non-negotiable, especially from anyone combining fine-tuned local models with external verification layers.

u/Dramatic_Spirit_8436 — 19 hours ago
[Question] How to use a local model as a Provider for Recipes in Unsloth Studio?

I have a local model running in the Chat tab of Unsloth Studio. I want to use this same model as a Provider for Recipes to process CSV/JSON data.

Since there is no dedicated "Local Unsloth" option in the Provider settings, what is the correct way to manually configure a connection to the local model?

The model works perfectly in the Chat UI, but I need to expose it as a selectable Provider for automated steps. Any help with the manual setup?
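Not sure how Recipes wires providers internally, but if Unsloth Studio's local server exposes an OpenAI-compatible endpoint (llama.cpp's bundled server conventionally serves one under /v1), a generic "OpenAI-compatible" provider entry pointed at localhost is the usual workaround. A minimal sketch of what such a request looks like; the port (8888 is the UI port from the launch command, the model API may listen elsewhere), the "gemma-4" model id, and the dummy key are all assumptions to check against Studio's logs:

```python
import json
import urllib.request

# Assumed local OpenAI-compatible endpoint; verify the actual host/port
# in Unsloth Studio's startup logs before using it as a Provider URL.
BASE_URL = "http://127.0.0.1:8888/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a /chat/completions request.

    Sending it with urllib.request.urlopen(req) only works with the
    local server actually running, so this sketch stops at the payload.
    """
    payload = {
        "model": model,  # placeholder id; query GET {BASE_URL}/models for real ids
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer local",  # many local servers accept any key
        },
    )

req = build_chat_request("gemma-4", "Summarize this CSV row: a,b,c")
print(req.full_url)  # http://127.0.0.1:8888/v1/chat/completions
```

If that endpoint responds, the same base URL and key are what you'd paste into a custom/OpenAI-compatible Provider entry so Recipes can select the local model for automated CSV/JSON steps.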

https://preview.redd.it/q906qcbhzzsg1.png?width=1294&format=png&auto=webp&s=f9eb0c50adf041fc8fae074ae263bebd4136c11e

u/Inflation_Artistic — 12 hours ago