Series
RAG

Retrieval-Augmented Generation — from naive pipelines to production systems.

Sep 2, 2025 · 6 min · rag, llm, production
RAG in Production

A support agent escalated a ticket because the assistant kept telling customers about a discount tha

Aug 28, 2025 · 5 min · rag, llm
Long Context vs RAG

Every time a model ships with a bigger context window, the same headline returns: *RAG is dead.

Aug 20, 2025 · 6 min · rag, llm, eval
Evaluating RAG: RAGAS and Beyond

A team swaps in a new reranker and declares the RAG system "better.

Aug 13, 2025 · 6 min · rag, llm
Adaptive RAG: Matching Pipeline to Query

Eight posts in, we've built up an arsenal: hybrid search, reranking, HyDE, multi-query, the agentic

Aug 10, 2025 · 6 min · rag, llm, eval, search, graph
GraphRAG: Retrieval Over Knowledge Graphs

Ask a normal RAG system "what are the major themes across these 400 board meeting transcripts?

Aug 6, 2025 · 6 min · rag, llm, eval, search
Agentic RAG: When Retrieval Starts Thinking

Every pipeline in this series so far has been a conveyor belt.

Jul 30, 2025 · 5 min · rag, llm
Query Transformation: HyDE and Multi-Query

"why is it slow" That's a real query a real user typed into a real RAG system.…

Jul 24, 2025 · 5 min · rag, llm
Reranking: The Cheap Accuracy Win

Most accuracy improvements in RAG cost you something painful — a new index, a bigger model, a re-arc

Jul 19, 2025 · 6 min · rag, llm, search
Hybrid Search: BM25 Meets Dense Vectors

Dense vector search, the thing this whole series has been building on, has a stupid failure mode: it

Jul 11, 2025 · 6 min · rag, llm
Embeddings and Vector Stores, Demystified

An embedding is not a summary of meaning.

Jul 8, 2025 · 6 min · rag, llm
Chunking Strategies That Actually Matter

Pick a chunk size of 500 tokens and you've made a decision worth more than your choice of embedding

Jul 1, 2025 · 6 min · rag, llm
Naive RAG and Why It Disappoints

The first RAG demo always works.