Getting Started with Gemini 3 & LangChain
Set up Gemini 3, LangChain, and LangSmith, then explore streaming, multimodal input, tool calling, reasoning, and context caching for AI agents.
Dive into Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Agentic AI Workflows, LangGraph, custom assistants, and production deployment of AI applications.
Master LangChain agents end to end: tools, short- and long-term memory, streaming, middleware, guardrails, human-in-the-loop, and prompt engineering.
Learn the Model Context Protocol and build a hotel search agent that connects an Airbnb MCP server to a Gemini agent for live listings, weather, and web search.
Build a travel planner agent that combines Airbnb and Google Calendar MCP servers with memory to plan trips and add itineraries to your calendar.
Give an AI agent a secure E2B cloud sandbox to run Python, analyze CSV and Excel data, and generate charts without touching your machine.
Connect Google Sheets and Yahoo Finance MCP servers so an agent can read spreadsheets, analyze data, and report financial insights from the terminal.
Combine Gmail, Google Calendar, Yahoo Finance, weather, and web search MCP tools into one agent that delivers a personalized morning briefing.
Serve your agent through FastAPI streaming endpoints and build a Streamlit chat client that consumes the live token stream with memory and PDF export.
Load an e-commerce SQLite database into a cloud MySQL server and connect it to a read-only streaming agent that answers questions over real data.
Get started with Hugging Face Transformers: run pretrained models in one line with pipelines for text, image, and audio tasks like classification, QA, and translation.
Understand the transformer architecture from the ground up: why RNNs fell short, how self-attention works, and how encoders and decoders power modern LLMs.
Learn how BERT's bidirectional encoder works, how masked language modeling pretrains it, and how a classification head adapts it to downstream NLP tasks.
Fine-tune BERT for multi-class emotion classification on tweets using Hugging Face Transformers, the Trainer API, and a custom evaluation function.
Understand knowledge distillation and how DistilBERT, TinyBERT, and MobileBERT compress BERT into smaller, faster models that keep most of its accuracy.
Fine-tune DistilBERT, MobileBERT, and TinyBERT to detect fake news, then benchmark their accuracy and speed against full BERT in a head-to-head comparison.
Fine-tune DistilBERT for named entity recognition on restaurant search queries, extracting cuisines, locations, ratings, and dishes with IOB tagging and seqeval.
Fine-tune the T5 text-to-text model for abstractive dialogue summarization on the SAMSum dataset using Hugging Face's Seq2Seq Trainer and data collator.
Fine-tune a Vision Transformer (ViT) to classify Indian food images with Hugging Face, using an image processor, the Trainer API, and patch-based attention.
Learn the theory behind PEFT, LoRA, and QLoRA, then fine-tune Microsoft's Phi-2 on a custom product dataset with quantization on a single GPU.
Turn the TinyLlama 1.1B base model into a conversational assistant with 4-bit QLoRA, chat templates, and the TRL SFTTrainer on a single GPU.
Set up a complete MCP development environment on Windows: Anaconda, Ollama, Node.js, Claude Desktop, and the uv package manager, with Linux/macOS notes.
Understand the Model Context Protocol: client, host, and server roles, the JSON-RPC foundation, tools/resources/prompts, and stdio vs Streamable HTTP transports.
Build your first MCP server with FastMCP, exposing a math tool and a live weather tool, then call them from both the raw MCP SDK client and the FastMCP client.
Connect local Ollama LLMs to MCP servers as an autonomous agent using mcp-use: run multiple servers, switch between stdio and HTTP transports, and use a server manager.
Register your own and community MCP servers in Claude Desktop: edit claude_desktop_config.json, add stdio servers with uv/npx/uvx, and use them safely.
Turn Claude Desktop into a no-code data analyst: read an Excel support-ticket dataset with an MCP server and auto-generate a PowerPoint analysis report.
Master MCP's three building blocks (tools, resources, and prompts) by building a job-search assistant server with FastMCP, the JSearch API, and a resume resource.
Build an MCP vector-database server with FastMCP, ChromaDB, and Ollama embeddings; ingest PDFs from a file, folder, or URL, then query it from a Streamlit agent.
Build a research assistant that crawls the web with Firecrawl and stores findings in per-topic ChromaDB vector stores, orchestrated by a LangGraph agent over multiple MCP servers.
Deploy a remote MCP server on an Amazon Linux EC2 instance: SSH in, install Anaconda and uv, serve over Streamable HTTP with OpenAI embeddings, and connect Claude Desktop.
Turn your MCP server into a pip-installable package: restructure to a src layout, configure pyproject.toml and an entry point, build with uv, and publish to PyPI.
Install Ollama, master every CLI command, call the REST API, and build a custom persona model using a Modelfile, all locally.
Install LangChain and langchain-ollama, configure environment variables, connect to a local Ollama model, and invoke and stream chat completions in Python.
Master LangChain prompt templates: message roles, SystemMessage, HumanMessage, and ChatPromptTemplate with dynamic variables.
Master LangChain Expression Language (LCEL): sequential, parallel, router, and custom chains with the pipe operator, RunnableParallel, and @chain.
Parse LLM responses into structured Python objects (Pydantic models, JSON dicts, and CSV lists) using LangChain's output parsers and with_structured_output().
Add persistent chat memory to any LangChain chain: store and replay multi-turn conversation history using RunnableWithMessageHistory and SQLChatMessageHistory.
Build a streaming, multi-session chatbot web app with LangChain, Ollama, and Streamlit, using persistent SQL memory and token-by-token streaming output.
Load PDFs, webpages, PowerPoint, Excel, and Word files into LangChain for Q&A and summarization, plus MarkItDown and Docling for advanced extraction.
Build a FAISS vector store from PDF documents using Ollama embeddings. Chunk, embed, index, and retrieve semantically relevant content for RAG applications.
Build a complete RAG chain that loads a persisted FAISS vector store and answers questions grounded strictly in your own documents using LangChain and Ollama.
Define custom tools, bind them to an LLM, call multiple tools in parallel, and generate grounded final answers using LangChain's tool-calling API with Ollama.
Build autonomous LangChain v1 agents with create_agent, wire web search tools, tune model parameters, switch models dynamically, and stream responses.
Turn a FAISS vector store into an agent tool and build Agentic RAG that retrieves document context only when relevant, with a streaming chat loop.
Build a natural language SQL agent with five tools: schema inspection, query generation, validation, execution, and error correction.
Scrape LinkedIn profiles with Selenium and BeautifulSoup, clean the raw HTML, and structure the data as JSON with a two-pass LLM pipeline.
Extract structured data from PDF resumes with PyMuPDF and an LLM, guaranteeing valid JSON via JsonOutputParser in a two-stage pipeline.
Wrap the two-stage LLM resume parser in a Streamlit web app: upload PDFs and view the extracted JSON data in real time.
Build a page-wise PDF ingestion pipeline with Docling, filename metadata, SHA-256 deduplication, and local nomic-embed-text embeddings stored in ChromaDB.
Hybrid retrieval over SEC filings: LLM metadata filters and SEC keywords, MMR search in ChromaDB, and BM25Plus re-ranking, packaged as reusable LangChain tools.
Build a ReAct agent in LangGraph that wraps retrieval as a tool, decides when to call it, decomposes comparison questions, and answers SEC filings with citations.
Build a self-correcting CRAG workflow in LangGraph that grades retrieved documents, rewrites weak queries, and falls back to web search before answering.
Build a Reflexion agent in LangGraph that drafts an answer, reflects on missing information, retrieves to fill gaps, and revises iteratively until complete.
Build Self-RAG in LangGraph with document grading plus hallucination and answer-quality gates that regenerate or rewrite until the answer is grounded and useful.
Build Adaptive RAG in LangGraph that routes each query to a vector store, a SQL employee database, or live web search, with SQLite short-term memory.
Learn RAGWire's production RAG architecture, configure ingestion and retrieval pipelines, and build a filter-aware agent in a Jupyter notebook.
Swap between Ollama, OpenAI, Gemini, Groq, and Hugging Face with a single config change. Use Qdrant Cloud with MMR retrieval.
Build a domain-specific RAG pipeline for health supplement research papers with custom metadata extraction and hybrid retrieval.
Build an end-to-end conversational RAG chatbot in a single file with RAGWire, LangChain agent, and Chainlit UI.
Step-by-step guide to deploying RAGWire, an OpenAI-compatible FastAPI RAG server, on Railway, Render, AWS ECS, GCP Cloud Run, and Azure Container Apps.
Build a production Chainlit frontend with SSE streaming, persistent SQLite chat history, password auth, and PDF export.
Build a parallel multi-agent RAG workflow with Microsoft Agent Framework: four specialists, a fan-in aggregator, and a synthesizer.
Full API reference for RAGWire's OpenAI-compatible FastAPI endpoints covering health checks, model listing, chat completions, and document ingestion.
Build custom LangChain agents with SQLite memory checkpointers, structured outputs, stream updates, and custom middleware for PII and todo planning.
Build a specialized financial research agent using the Model Context Protocol (MCP) and Yahoo Finance server tools for real-time stock analysis.
Design and implement a complete financial document ingestion pipeline that extracts text, tables, and charts with Docling and stores them in Qdrant with hybrid vector embeddings.
Master advanced retrieval strategies by combining dense embeddings, sparse BM25 tokenizers, dynamic metadata filtering, and Cross-Encoder reranking.
Build a dual-tool multimodal financial research agent with persistent SQLite session memory, historical hybrid search, and live Yahoo Finance MCP integration.
Implement a task-planning agentic RAG system that decomposes complex financial comparison queries into checklists using TodoListMiddleware and SummarizationMiddleware.
Implement a hierarchical multi-agent research team coordinating Orchestrator, Researcher, and Editor agents on a real filesystem using LangGraph.
Deploy a production-ready LangChain DeepAgent financial research system using sandboxed filesystem backends, LangGraph CLI servers, and a custom UI frontend.
Learn the fundamentals of LangGraph. Master State, Nodes, and Edges to build stateful, multi-agent applications using a Finite State Machine model in Python.
Learn how to implement conditional routing in LangGraph. Use Pydantic to structure LLM outputs and route execution dynamically based on sentiment analysis.
Learn how to build a stateful ReAct agent in LangGraph. Define custom tools for weather and math calculations, bind them to an LLM, and loop executions.
Learn how to implement thread-based conversation memory with MemorySaver checkpointer and stream graph outputs in LangGraph with local Ollama models.
Production-ready short-term memory using SQLite and PostgreSQL checkpointers: conversations that survive server restarts with full thread isolation.
Build agents with cross-thread long-term memory using PostgresStore: store user preferences, search semantically, and personalize responses across sessions.
Pause agent execution for human approval using interrupt(), resume with the Command API, and protect users with a regex PII guardrail node.
Build a self-improving research agent with a researcher node, critique node, and iterative reflection loop: search the web, evaluate, revise, repeat.
Build a natural language SQL agent with LangGraph, using dedicated tools for schema inspection, query generation, validation, execution, and error fixing.
Connect LangGraph agents to external MCP servers: build an Airbnb search agent using langchain-mcp-adapters and the @openbnb/mcp-server-airbnb package.
A complete, hands-on walkthrough for building a free customer support AI agent with open-source Hexabot and a free OpenRouter model, deploying it on Railway with Postgres, grounding it in your own content with RAG, and embedding the live-chat widget on your live website.
A hands-on test of Claude Fable 5 on a live financial consulting project: PDF analysis, legacy code migration, reconciliation debugging, and autonomous reports.
A comprehensive guide on progressive benchmarking for Large Language Models using structured prompt architectures and high-quality browser-based tests.
A detailed technical guide comparing dense, sparse MoE, and hybrid SSM local LLM architectures for optimal deployment on 24GB VRAM hardware systems in 2026.
Step-by-step guide to integrating Hunyuan3D-2 with Blender MCP so Claude Desktop can generate and place 3D assets inside Blender scenes via natural language.