Module 16: RAG Basics, Chunking and Retrieval

Module Overview

This module covers the core building blocks of retrieval-augmented generation (RAG). It starts from a vanilla pipeline, then examines the decisions that most affect quality: embedding model selection and chunking. It then introduces hybrid retrieval, which combines lexical and dense signals, alongside multi-vector late interaction.

Learning Objectives

Describe the components of a vanilla RAG pipeline.
Select an embedding model appropriate to a corpus and query type.
Choose chunking strategies and explain their impact on context quality.
Compare BM25, SPLADE, and ColBERT-style retrieval.

Topics Covered

Foundations of RAG

Vanilla RAG
Embedding models for retrieval
Chunking strategies
BM25 (lexical sparse retrieval)
SPLADE (learned sparse retrieval)
Multi-vector ColBERT (late-interaction retrieval)
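
To make the lexical end of this list concrete, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is illustrative only, not the module's reference implementation: the function name `bm25_scores`, the toy corpus, whitespace tokenization, and the defaults `k1=1.5` and `b=0.75` are all assumptions, and the IDF term uses one common smoothed variant that stays non-negative.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (hypothetical helper)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed, non-negative IDF variant (an assumption of this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation (k1) and document-length normalization (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus, tokenized by whitespace for illustration.
docs = [
    "the cat sat on the mat".split(),
    "dogs chase cats in the park".split(),
    "retrieval augmented generation grounds answers in documents".split(),
]
scores = bm25_scores("retrieval documents".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Documents that share no term with the query score exactly zero. That vocabulary mismatch is the gap learned sparse methods such as SPLADE and dense retrieval aim to close.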

Key Concepts & Terminology

Retrieve-then-generate, fixed vs recursive vs semantic chunking, chunk overlap, lexical vs dense vs hybrid retrieval, MaxSim late interaction.
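
As a rough illustration of fixed-size chunking with chunk overlap, the sketch below slides a window over a token list. The helper name `chunk_fixed` and its `size` and `overlap` parameters are hypothetical, and a real pipeline would typically count model tokens rather than whitespace-separated words.

```python
def chunk_fixed(tokens, size, overlap):
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once the window has reached the end, to avoid a redundant tail chunk.
        if start + size >= len(tokens):
            break
    return chunks

words = "a b c d e f g h i j".split()
chunks = chunk_fixed(words, size=4, overlap=1)
```

With `size=4` and `overlap=1`, the last token of each chunk reappears as the first token of the next, so a sentence that straddles a boundary is less likely to be cut off from its context. Recursive and semantic chunking replace the fixed window with boundaries chosen from document structure or embedding similarity.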

Tools & Frameworks Referenced

BM25, SPLADE, ColBERT-style multi-vector retrieval, vector stores.
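
The MaxSim operator behind ColBERT-style late interaction can be sketched in a few lines: each query token vector is matched to its most similar document token vector, and those maxima are summed. The two-dimensional vectors and the helper names `cosine` and `maxsim` below are assumptions for illustration, not the output or API of any real encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def maxsim(query_vecs, doc_vecs):
    """Sum, over query token vectors, of the best similarity to any document token vector."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)

# Toy token embeddings (hypothetical values).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # contains an exact match for each query token
doc_b = [[1.0, 1.0]]                          # only a partial match for each query token

score_a = maxsim(query, doc_a)
score_b = maxsim(query, doc_b)
```

Here `doc_a` outranks `doc_b` because every query token finds an exact counterpart in it, whereas a single pooled vector per document would blur that token-level evidence.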

Prerequisites

Module 14 (embeddings) and Module 15 (LangChain for orchestration).
