[Question] Fine-tuning Gemma 4 Vision in Unsloth Studio for Medical Image Classification
Hi everyone,
I'm planning to fine-tune Gemma 4 (specifically for medical image classification/species identification) using Unsloth Studio.
My current dataset is a simple table: one column with the image and one column with the species name (label). However, I’ve noticed that Unsloth Studio’s UI doesn't seem to have a dedicated field to define the "input text prompt" (e.g., "What species is in this image?") when loading a custom dataset.
My Questions:
- How should I reformat my image + label dataset so Unsloth Studio recognizes it correctly for multimodal training?
- Do I need to convert my data into a ChatML-style
messagesformat before uploading? - Does the "instruction" need to be a hardcoded column in my CSV/Parquet file for every single row?
Setup:
- Model: Gemma 4 (E2B or E4B)
- Task: Medical Image Classification (Microscopic images)
- Environment: Unsloth Studio (Local/RunPod)
Any advice on the specific dataset schema required for the Studio would be greatly appreciated!
u/Electrical-Ebb4002 — 18 hours ago