Module Overview
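This module covers the core building blocks of retrieval-augmented generation (RAG). It starts from a vanilla pipeline, then examines the two decisions that most affect quality: embedding model selection and chunking. It closes with retrieval methods beyond single-vector dense search: lexical and learned sparse retrieval (BM25, SPLADE), hybrid retrieval that combines lexical and dense signals, and multi-vector late interaction (ColBERT).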
Learning Objectives
- Describe the components of a vanilla RAG pipeline.
- Select an embedding model appropriate to a corpus and query type.
- Choose chunking strategies and explain their impact on context quality.
- Compare BM25, SPLADE, and ColBERT-style retrieval.
Topics Covered
Foundations of RAG
- Vanilla RAG
- Embedding models for retrieval
- Chunking strategies
- BM25 (lexical sparse retrieval)
- SPLADE (learned sparse retrieval)
- Multi-vector ColBERT (late-interaction retrieval)
Key Concepts & Terminology
Retrieve-then-generate, fixed vs recursive vs semantic chunking, chunk overlap, lexical vs dense vs hybrid retrieval, MaxSim late interaction.
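As a concrete illustration of MaxSim late interaction, the sketch below scores a document by taking, for each query token vector, its best-matching document token vector and summing those maxima. It is a minimal pure-Python sketch, not ColBERT's actual implementation: the function names (dot, maxsim_score) and the 2-dimensional toy embeddings are assumptions made up for this example, and real systems use learned, normalized token embeddings and batched tensor operations.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late-interaction score: for every query token vector, keep only its
    highest similarity against any document token vector, then sum."""
    if not doc_vecs:
        return 0.0  # assumption: an empty document scores zero
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy token embeddings (hypothetical values, one vector per token).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # matches both query tokens
doc_b = [[1.0, 0.0], [0.5, 0.0]]              # matches only the first

docs = {"doc_a": doc_a, "doc_b": doc_b}
ranked = sorted(docs, key=lambda name: maxsim_score(query, docs[name]), reverse=True)
print(ranked)  # ['doc_a', 'doc_b']
```

Here doc_a scores 2.0 and doc_b scores 1.0, because doc_b has no token vector that matches the second query token. A single-vector dense retriever would instead compare one pooled vector per query and per document, which is the contrast the term "late interaction" refers to.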
Tools & Frameworks Referenced
BM25, SPLADE, ColBERT-style multi-vector retrieval, vector stores.
Prerequisites
Module 14 (embeddings) and Module 15 (LangChain for orchestration).