GenAI Syllabus
A structured curriculum for Production LLM Engineering — Transformer foundations, fine-tuning and alignment, multimodal and speech AI, RAG and retrieval engineering, agentic systems, and prompt/context engineering. Each module lists topics, learning objectives, and the tools and frameworks referenced — concept-first, not a coding tutorial.
Transformers and Architecture
Attention, tokenization, encoder/decoder families, fast inference, and scaling laws.
Module 01: Transformers and Tokenization
Syllabus for the foundational module on Transformer core mechanics and tokenization — embeddings, the attention family, encoder/decoder architectures, and subword tokenization strategies.
Module 02: Hands-On Fine-Tuning of Transformers
Syllabus for hands-on fine-tuning of the three Transformer families — implementing attention conceptually and adapting encoder-only, decoder-only, and encoder–decoder models to custom data.
Module 03: Fast Inference and Scaling Laws
Syllabus covering inference efficiency for Transformers — KV caching and PagedAttention for KV-cache memory management, Flash Attention, the modern attention variants (MQA, GQA, MLA), RoPE positional encoding, and Chinchilla scaling laws.
LLM Fine-Tuning and Alignment
Pretraining lifecycle, SFT, PEFT, preference alignment, quantization, MoE, reasoning models, and SLMs.
Module 04: LLM Lifecycle and Pre-Training
Syllabus covering the two-phase LLM lifecycle — pre-training vs post-training, why base models need adaptation, continued pre-training for domain adaptation, and multi-token prediction.
Module 05: Datasets and Synthetic Data
Syllabus on preparing fine-tuning data — dataset formats, chat templates, loss masking, deduplication, and synthetic instruction and preference dataset generation with self-instruct and LLM-as-judge.
Module 06: SFT, PEFT and Preference Alignment
Syllabus on adapting and aligning LLMs — parameter-efficient fine-tuning (LoRA, QLoRA, DoRA, AdaLoRA, LoRA+), supervised fine-tuning, and preference alignment with RLHF and DPO.
Module 07: Evaluation, Quantization and Deployment
Syllabus covering post-fine-tuning workflows — benchmark and LLM-as-judge evaluation, quantization formats (GPTQ, AWQ, NF4, FP8, GGUF), serving with vLLM/SGLang/llama.cpp, and fine-tuning frameworks.
Module 08: Mixture of Experts
Syllabus on Mixture of Experts — why dense models hit scaling limits, the MoE routing idea, load balancing to avoid expert collapse, training and inference trade-offs, and when MoE beats dense.
Module 09: Reasoning Models and Chain-of-Thought
Syllabus on reasoning models — what distinguishes them from standard LLMs, chain-of-thought training, RL-only reasoning (GRPO, DeepSeek-R1-Zero), and distilling reasoning into smaller models.
Module 10: Small Language Models and Distillation
Syllabus on Small Language Models and knowledge distillation — why SLMs matter for cost, latency, and privacy, the student-teacher paradigm, soft labels, temperature scaling, KL divergence, and attention transfer.
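As a concept illustration of the soft-label idea named above, here is a minimal sketch in plain Python. It assumes the common formulation in which teacher and student logits are both softened with the same temperature and compared with KL divergence, scaled by the squared temperature. The function names are hypothetical and are not taken from the module material.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Soft-label loss: KL between softened teacher and student distributions.

    The temperature**2 factor is the usual convention for keeping gradient
    magnitudes comparable across temperatures (an assumption of this sketch).
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(soft_teacher, soft_student)
```

In practice this term is usually mixed with an ordinary cross-entropy loss on the hard labels; the mixing weight is a tuning choice.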
Vision and Speech
CNNs to ViT, visual language models, and speech-to-text with Whisper.
Module 11: Vision Models — CNNs to ViT
Syllabus on vision foundations for multimodal AI — the conceptual bridge from CNNs to Vision Transformers, patch embedding, CLS token, 2D positional encoding, and pre-trained vision encoders (CLIP, SigLIP, DINOv2).
Module 12: Visual Language Models
Syllabus on Visual Language Models — the three-component VLM architecture (visual encoder, aligner/projector, LLM backbone), how the projector maps visual tokens into LLM space, and vision-language alignment training.
Module 13: Speech-to-Text with Whisper
Syllabus on Speech AI and Speech-to-Text — the STT landscape, Whisper's architecture and API, building production STT pipelines, and fine-tuning Whisper on domain-specific audio.
RAG and Retrieval
Embeddings, LangChain RAG, advanced RAG patterns, vector quantization, multimodal RAG, Graph RAG, and security.
Module 14: Embedding Models and Matryoshka Tuning
Syllabus on embedding models — the taxonomy from dense to binary, multi-vector embeddings, Matryoshka Representation Learning for flexible dimensions, and embedding fine-tuning for domain retrieval.
Module 15: LangChain for Production RAG
Syllabus on LangChain as the orchestration framework for production RAG — LCEL, integrations, prompting and structured output, memory and retrieval, agents and multimodal inputs, observability and security.
Module 16: RAG Basics — Chunking and Retrieval
Syllabus on RAG foundations — building a baseline RAG system, choosing embedding models and chunking strategies, and implementing hybrid retrieval with BM25, SPLADE, and ColBERT-style multi-vector retrieval.
Module 17: Advanced RAG — Rerankers and Adaptive Retrieval
Syllabus on advanced RAG — query transformations, rerankers, self-correcting and adaptive retrieval patterns, contextual retrieval, systematic RAG evaluation, and agentic RAG.
Module 18: Vector Quantization and Multimodal RAG
Scale vector search with quantization (scalar, binary, product) and retrieve over visually rich documents with the ColPali multimodal RAG paradigm — no OCR required.
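To make the scalar and binary cases concrete, here is a minimal sketch in plain Python. It assumes a fixed value range for scalar quantization and a sign threshold for binary quantization; production vector stores typically calibrate ranges from data and pack bits into bytes. All names are hypothetical.

```python
def scalar_quantize(vec, lo, hi):
    """Map floats in [lo, hi] to integer codes 0..255 (one byte per dimension)."""
    scale = (hi - lo) / 255
    # Clamp out-of-range values before mapping them onto the 0..255 grid.
    return [round((min(max(x, lo), hi) - lo) / scale) for x in vec]

def binary_quantize(vec):
    """Keep only the sign of each dimension (one bit per dimension)."""
    return [1 if x > 0 else 0 for x in vec]

def hamming_distance(a, b):
    """Count differing bits; a cheap stand-in for distance on binary codes."""
    return sum(x != y for x, y in zip(a, b))
```

Relative to 32-bit floats, the scalar codes use a quarter of the memory and the binary codes one thirty-second, at some cost in recall. A common mitigation is to rescore the top candidates with the original vectors.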
Module 19: Graph RAG, Caching and RAG Security
Round out a production RAG system — knowledge-graph retrieval with Graph RAG, vectorless patterns, semantic caching, and the security layer (PII masking, input/output guardrails, prompt-injection defence).
Agents and Multi-Agent Systems
Function calling, MCP, LangGraph, A2A protocol, observability, and Bedrock AgentCore deployment.
Module 20: Agent Basics — Function Calling and MCP
Syllabus on agent foundations — validating and structuring outputs with Pydantic, cross-provider function calling, building tool executor loops, and standardising tool integration with the Model Context Protocol (MCP).
Module 21: LangGraph for Multi-Agent Workflows
Syllabus on LangGraph — graph-based stateful agent workflows, the ReAct pattern, human-in-the-loop checkpoints, memory and persistence, multi-agent orchestration, and the LangGraph Platform.
Module 22: Agent Observability and A2A Protocol
Make agents production-grade — observability with LangSmith and Logfire, plus the A2A (Agent-to-Agent) protocol for cross-framework interoperability.
Module 23: Deploying Agents with Bedrock AgentCore
Deploy secure, isolated agents with Amazon Bedrock AgentCore — Runtime, Memory, Gateway, Identity, Browser/Code Interpreter, Observability, and Cedar-based policy guardrails.
Prompting, Context and Evaluation
Prompt engineering, context engineering, and evaluation harnesses with agent CI/CD.
Module 24: Prompt Engineering
Syllabus on prompt engineering — prompt anatomy, few-shot and chain-of-thought techniques, system prompt design, structured output, robustness testing, chaining, and self-refinement.
Module 25: Context Engineering
Syllabus on context engineering — context window anatomy, RAG as context construction, memory architectures (short-term, episodic, semantic, procedural), and context compression with LLMLingua and RECOMP.
Module 26: Evaluation Harnesses and Agent CI/CD
Syllabus on evaluation and LLMOps CI/CD — structured eval harnesses, LLM benchmarking, agent-native evaluation with Inspect AI, LLM-as-judge, execution sandboxes, prompt regression, and eval-gated agent CI/CD.
Capstone Projects
End-to-end projects integrating fine-tuning, distillation, RAG, multi-agent systems, speech, and LLMOps.
Project 01: ClinicLLM — Medical LLM Fine-Tuning Pipeline
Build a domain-specific medical language model with QLoRA SFT and DPO preference alignment, then serve it with multiple hot-swappable LoRA adapters.
Project 02: TinyReason — Distilling Reasoning to a CPU Model
Compress a larger reasoning teacher into a small student using KL divergence and attention transfer, then quantize to GGUF for cost-efficient CPU inference.
Project 03: LegalRAG — Hybrid Multi-Modal + Graph RAG for Contracts
Build a production legal-document intelligence system combining ColPali multimodal indexing (no OCR), a Neo4j knowledge graph, hybrid retrieval with RRF, cross-encoder reranking, and RAGAS-gated evaluation.
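The RRF step in the hybrid retrieval above can be sketched as follows. This is a minimal illustration that assumes the usual reciprocal rank fusion formula, score(d) = sum over rankers of 1 / (k + rank), with the commonly used constant k = 60. The function name and the document IDs in the usage note are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each ranking is ordered best-first. A document's fused score is the
    sum of 1 / (k + rank) over every ranking in which it appears.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
```

For example, fusing a keyword ranking `["a", "b", "c"]` with a dense ranking `["b", "c", "a"]` puts `"b"` first, because it sits near the top of both lists. RRF uses only ranks, so the two retrievers' scores do not need to be on a comparable scale.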
Project 04: DevOpsCrew — Multi-Agent DevOps with HITL and A2A
Build a production DevOps assistant with a LangGraph supervisor delegating to specialised sub-agents over MCP and A2A, with human-in-the-loop gating every write.
Project 05: EvalShip — Eval-Gated CI/CD with Auto-Rollback
Wrap all prior projects in a production LLMOps shell where every code or prompt change must pass eval-gated CI stages before deployment, with blue/green deployment and auto-rollback.
Project 06: VoiceTrack — Whisper Fine-Tuning and Production STT Pipeline
Fine-tune Whisper on domain-specific audio and ship a production STT service with streaming transcription, diarisation hooks, and an evaluation gate on WER.