u/Altruistic_File_6117

A quick overview of Fine-Tuning approaches in Large Language Models

https://preview.redd.it/dfkqex222k0h1.png?width=972&format=png&auto=webp&s=70cce871347cf2d01df04078387849ca621245ea

Hey everyone 👋
I’ve been trying to organize the different types of fine-tuning used in modern LLMs, and I made a simple “map” to help visualize how they relate to each other.

Fine-tuning in general is the process of adapting a pre-trained model to a specific task or domain, but it has branched in several directions:

  • Full Fine-Tuning: updating all model weights (powerful but expensive)
  • Instruction Fine-Tuning: training on instruction-response datasets to improve general usability
  • PEFT (Parameter-Efficient Fine-Tuning): updating only small parts of the model
    • LoRA (Low-Rank Adaptation): injecting trainable low-rank matrices alongside the frozen weights
    • Adapters: small bottleneck layers inserted inside each transformer block
    • Prefix Tuning: learning continuous prefix vectors prepended to the attention layers
    • Prompt Tuning: optimizing soft prompt embeddings at the input instead of model weights
  • RLHF (Reinforcement Learning from Human Feedback): aligning outputs with human preferences
  • Domain-Specific Fine-Tuning: adapting to medical, legal, or financial text
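To make the PEFT idea concrete, here's a minimal NumPy sketch of the LoRA branch above (names like `lora_forward` and the dimensions are my own illustration, not from any specific library): the pre-trained weight `W` stays frozen, and only two small low-rank matrices `A` and `B` are trained, so the effective weight is `W + (alpha / r) * B @ A`.

```python
import numpy as np

# LoRA-style low-rank update to a frozen linear layer (illustrative sketch).
# Instead of updating W (d_out x d_in), we train A (r x d_in) and
# B (d_out x r) with rank r << min(d_out, d_in).

d_in, d_out, r, alpha = 64, 64, 4, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init

def lora_forward(x):
    """x: (batch, d_in) -> (batch, d_out), frozen path + low-rank update."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((2, d_in))
y = lora_forward(x)

# Because B starts at zero, the adapted layer initially matches the
# frozen layer exactly -- training only ever moves it away from there.
assert np.allclose(y, x @ W.T)

# Only A and B are trainable: 512 params vs 4096 in the full weight.
print("trainable:", A.size + B.size, "of", W.size)
```

This is why LoRA sits on the "efficient" end of the map: here the trainable parameter count drops to 12.5% of the full layer, and at higher `d_in`/`d_out` with the same rank the ratio shrinks further.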

I tried to visualize how these methods branch from standard fine-tuning and where each one falls on the efficiency-vs-performance trade-off.

Would love feedback if I missed anything or if you’d structure it differently.

u/Altruistic_File_6117 — 3 days ago