Series
Gen AI Foundations

Transformers, attention, prompting, and core concepts.

Jun 10, 2026 · 6 min · llm · deeplearning
Cost and Latency Engineering

The cheapest, fastest token is the one you never generate.

May 30, 2026 · 5 min · llm · deeplearning
LLM-as-Judge

Using one language model to grade another feels like asking the fox to audit the henhouse and trust

May 26, 2026 · 5 min · llm · deeplearning · eval
Custom Evals Are the Moat

Your model scores 89 on MMLU.

May 18, 2026 · 5 min · llm · deeplearning
Multimodal by Default

"A picture is worth a thousand words" is wrong by about an order of magnitude.

May 12, 2026 · 5 min · llm · deeplearning
Why LLMs Hallucinate

Hallucination is not the model malfunctioning.

May 7, 2026 · 5 min · llm · deeplearning
Structured Output and Constrained Decoding

There are two ways to get JSON out of a language model.

May 3, 2026 · 5 min · llm · deeplearning
Context Windows and Lost-in-the-Middle

You bought a model with a giant context window.

Apr 27, 2026 · 5 min · llm · deeplearning
The Transformer, Intuitively

Most explanations of the transformer open with a wall of matrices.

Apr 21, 2026 · 5 min · llm · deeplearning
How LLMs Actually Generate Text

A language model does not write a sentence.