Retrieval-Augmented Generation — from naive pipelines to production systems.
A support agent escalated a ticket because the assistant kept telling customers about a discount tha
Every time a model ships with a bigger context window, the same headline returns: *RAG is dead.
A team swaps in a new reranker and declares the RAG system "better.
Eight posts in, we've built up an arsenal: hybrid search, reranking, HyDE, multi-query, the agentic
Ask a normal RAG system "what are the major themes across these 400 board meeting transcripts?
Every pipeline in this series so far has been a conveyor belt.
"why is it slow". That's a real query a real user typed into a real RAG system.…
Most accuracy improvements in RAG cost you something painful — a new index, a bigger model, a re-arc
Dense vector search, the thing this whole series has been building on, has a stupid failure mode: it
An embedding is not a summary of meaning.
Pick a chunk size of 500 tokens and you've made a decision worth more than your choice of embedding
The first RAG demo always works.