Series
Fine-tuning

Adapting models: LoRA, RLHF, DPO, and when to bother.

Apr 15, 2026 · 5 min · finetuning, llm, deeplearning
Efficient Training with Unsloth

Same GPU, same model, same LoRA config — and the run finishes in a third of the time using most of t

Apr 10, 2026 · 5 min · finetuning, llm, deeplearning
Data Preparation for Fine-Tuning

Nobody demos the data cleaning.

Apr 3, 2026 · 5 min · finetuning, llm, deeplearning, graph
Knowledge and Chain-of-Thought Distillation

You proved the task is solvable.

Mar 30, 2026 · 5 min · finetuning, llm, deeplearning
GRPO, PPO, and KTO with TRL

DPO answered the common case.

Mar 20, 2026 · 5 min · finetuning, llm, deeplearning
DPO vs RLHF

For a couple of years, teaching a model to prefer good answers over bad ones meant running three mod

Mar 12, 2026 · 4 min · finetuning, llm, deeplearning
LoRA vs QLoRA vs DoRA vs Full Fine-Tuning

Four methods, one question: when you sit down to fine-tune, which do you reach for?

Mar 7, 2026 · 5 min · finetuning, llm, deeplearning
QLoRA: Fine-Tuning on One GPU

Try to full-fine-tune an 8B model on a single 24 GB consumer card and you won't get to the first tra

Mar 3, 2026 · 5 min · finetuning, llm, deeplearning
LoRA, Explained

A 7-billion-parameter model has 7 billion knobs.

Feb 23, 2026 · 5 min · finetuning, llm, deeplearning
The Ladder: Prompt, RAG, Fine-tune, Distill

Most fine-tuning projects should have stayed a prompt.