u/Electrical-Ebb4002

Hi everyone,

I'm planning to fine-tune Gemma 4 (specifically for medical image classification/species identification) using Unsloth Studio.

My current dataset is a simple table: one column with the image and one column with the species name (label). However, I’ve noticed that Unsloth Studio’s UI doesn't seem to have a dedicated field to define the "input text prompt" (e.g., "What species is in this image?") when loading a custom dataset.

My Questions:

How should I reformat my image + label dataset so Unsloth Studio recognizes it correctly for multimodal training?
Do I need to convert my data into a ChatML-style messages format before uploading?
Does the "instruction" need to be a hardcoded column in my CSV/Parquet file for every single row?

Setup:

Model: Gemma 4 (E2B or E4B)
Task: Medical Image Classification (Microscopic images)
Environment: Unsloth Studio (Local/RunPod)

Any advice on the specific dataset schema required for the Studio would be greatly appreciated!

[Question] Fine-tuning Gemma 4 Vision in Unsloth Studio for Medical Image Classification