u/Electrical-Ebb4002

▲ 13 r/unsloth+2 crossposts

[Question] Fine-tuning Gemma 4 Vision in Unsloth Studio for Medical Image Classification

Hi everyone,

I'm planning to fine-tune Gemma 4 (specifically for medical image classification/species identification) using Unsloth Studio.

My current dataset is a simple table: one column with the image and one column with the species name (label). However, I’ve noticed that Unsloth Studio’s UI doesn't seem to have a dedicated field to define the "input text prompt" (e.g., "What species is in this image?") when loading a custom dataset.

My Questions:

  1. How should I reformat my image + label dataset so Unsloth Studio recognizes it correctly for multimodal training?
  2. Do I need to convert my data into a ChatML-style messages format before uploading?
  3. Does the "instruction" need to be a hardcoded column in my CSV/Parquet file for every single row?

Setup:

  • Model: Gemma 4 (E2B or E4B)
  • Task: Medical Image Classification (Microscopic images)
  • Environment: Unsloth Studio (Local/RunPod)

Any advice on the specific dataset schema required for the Studio would be greatly appreciated!

reddit.com
u/Electrical-Ebb4002 — 18 hours ago