Ollama Setup Guide
Install Ollama, master every CLI command, call the REST API, and build a custom persona model using a Modelfile — all locally.
Dive into Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Agentic AI Workflows, LangGraph, custom assistants, and production deployment of AI applications.
Install LangChain and langchain-ollama, configure environment variables, connect to a local Ollama model, and invoke and stream chat completions in Python.
Master LangChain prompt templates — message roles, SystemMessage, HumanMessage, and ChatPromptTemplate with dynamic variables.
Master LangChain Expression Language (LCEL) — sequential, parallel, router, and custom chains with the pipe operator, RunnableParallel, and @chain.
Parse LLM responses into structured Python objects — Pydantic models, JSON dicts, and CSV lists — using LangChain's output parsers and with_structured_output().
Add persistent chat memory to any LangChain chain — store and replay multi-turn conversation history using RunnableWithMessageHistory and SQLChatMessageHistory.
Build a streaming, multi-session chatbot web app with LangChain, Ollama, and Streamlit — using persistent SQL memory and token-by-token streaming output.
Load PDFs, webpages, PowerPoint, Excel, and Word files into LangChain for Q&A and summarization, plus MarkItDown and Docling for advanced extraction.
Build a FAISS vector store from PDF documents using Ollama embeddings — chunk, embed, index, and retrieve semantically relevant content for RAG applications.
Build a complete RAG chain that loads a persisted FAISS vector store and answers questions grounded strictly in your own documents using LangChain and Ollama.
Define custom tools, bind them to an LLM, call multiple tools in parallel, and generate grounded final answers using LangChain's tool-calling API with Ollama.
Build autonomous LangChain v1 agents with create_agent — wire web search tools, tune model parameters, switch models dynamically, and stream responses.
Turn a FAISS vector store into an agent tool and build Agentic RAG that retrieves document context only when relevant, with a streaming chat loop.
Build a natural language SQL agent with five tools — schema inspection, query generation, validation, execution, and error correction.
Scrape LinkedIn profiles with Selenium and BeautifulSoup, clean the raw HTML, and structure the data as JSON with a two-pass LLM pipeline.
Extract structured data from PDF resumes with PyMuPDF and an LLM, guaranteeing valid JSON via JsonOutputParser in a two-stage pipeline.
Wrap the two-stage LLM resume parser in a Streamlit web app — upload PDFs and view extracted JSON data in real time.
A hands-on test of Claude Fable 5 on a live financial consulting project — PDF analysis, legacy code migration, reconciliation debugging, and autonomous reports.
A comprehensive guide on progressive benchmarking for Large Language Models using structured prompt architectures and high-quality browser-based tests.
A detailed technical guide comparing dense, sparse MoE, and hybrid SSM architectures for local LLMs, aimed at deployment on hardware with 24 GB of VRAM in 2026.
Step-by-step guide to deploying RAGWire, an OpenAI-compatible FastAPI RAG server, on Railway, Render, AWS ECS, GCP Cloud Run, and Azure Container Apps.
Full API reference for RAGWire's OpenAI-compatible FastAPI endpoints covering health checks, model listing, chat completions, and document ingestion.
Step-by-step guide to integrating Hunyuan3D-2 with Blender MCP so Claude Desktop can generate and place 3D assets inside Blender scenes via natural language.