SQL Agent with LangGraph Workflows

Build a natural language SQL agent with LangGraph — dedicated tools for schema inspection, query generation, validation, execution, and error fixing.

Jun 15, 202613 min readFollow

Topics You Will Master

Connecting to a SQLite database using SQLDatabase.from_uri() and retrieving schema information
Building four SQL tools: get_database_schema, generate_sql_query, execute_sql_query, and fix_sql_error
Inline query validation that blocks INSERT, UPDATE, DELETE, ALTER, DROP, CREATE, and TRUNCATE
Using regex to strip markdown SQL code block markers from LLM-generated queries

Text-to-SQL lets users query databases in plain English. Instead of writing SELECT AVG(salary) FROM salaries, a user asks "What is the average salary?" and the agent generates, validates, and executes the query automatically. This lesson builds that agent with a safety-first approach: destructive operations are blocked by regex validation, and failed queries are automatically corrected.

The dataset is the employees database — a SQLite port of the classic MySQL employees test database with 300,024 employee records across 6 tables.

Prerequisites: langgraph, langchain-ollama, langchain-community, langchain-core, python-dotenv installed. Ollama running with qwen3. The employees_db-full-1.0.6.db SQLite file in a db/ subdirectory.

BASH
pip install -U langgraph langchain-ollama langchain-community langchain-core python-dotenv
ollama pull qwen3

LangGraph & Ollama — AI Agent Development

Build production-ready AI agents with persistent memory, tool calling, and multi-model workflows using LangGraph and local LLMs.

Enroll on Udemy →

Setup and Database Connection

PYTHON
from typing_extensions import TypedDict, Annotated
from typing import List
import os
import re 
import operator

from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode
from langchain_ollama import ChatOllama
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage

from langchain_community.utilities import SQLDatabase

from dotenv import load_dotenv
load_dotenv()

LLM Configuration

PYTHON
LLM_MODEL = "qwen3"
BASE_URL = "http://localhost:11434"

llm = ChatOllama(model=LLM_MODEL, base_url=BASE_URL)

response = llm.invoke("Hello, how are you?")
response.pretty_print()

Database Setup

Connect to the SQLite employees database and load the full schema:

PYTHON
db = SQLDatabase.from_uri('sqlite:///db/employees_db-full-1.0.6.db')

tables = db.get_usable_table_names()


SCHEMA = db.get_table_info()

SCHEMA contains CREATE TABLE statements with sample rows for all 6 tables — it is passed to the LLM as context when generating queries.


SQL Tools

Diagram showing the five SQL tools: inspect schema, generate, validate, execute, and fix queries

Tool 1 — get_database_schema

Retrieves schema information for a specific table or all tables at once. The agent calls this first to understand table structure before generating queries:

PYTHON
@tool
def get_database_schema(table_name: str = None):
    """Get database schema information for SQL query generation.
    Use this first to understand table structure before creating queries."""

    if table_name:
        tables = db.get_usable_table_names()
        if table_name.lower() in [t.lower() for t in tables]:
            result = db.get_table_info([table_name])
            return result
        
        else:
            return f"Error: Table '{table_name}' not found. Available tables: '{', '.join(tables)}'"
    else:
        return SCHEMA

Tool 2 — generate_sql_query

Generates a SELECT query from a natural language question using the database schema as context:

PYTHON
@tool
def generate_sql_query(question: str, schema_info: str=None):
    """Generate a SQL SELECT query from a natural language question using database schema.
        Always use this after getting schema information."""
    
    schema_to_use = schema_info if schema_info else SCHEMA

    prompt = f"""Based on this database schema:
                {schema_to_use}

                Generate a SQL query to answer this question: {question}

                Rules:
                - Use only SELECT statements
                - Include only existing columns and tables
                - Add appropriate WHERE, GROUP BY, ORDER BY clauses as needed
                - Limit results to 10 rows unless specified otherwise
                - Use proper SQL syntax for SQLite

                Return only the SQL query, nothing else."""
    
    response = llm.invoke(prompt)
    sql_query = response.content.strip()
    print(f"[TOOL] Generated SQL Query: {sql_query[:30]}...")
    return sql_query

Tool 3 — validate_sql_query

Validates generated SQL for safety before execution. Strips markdown code block markers and blocks all destructive operations:

Diagram showing validation stripping markdown, enforcing SELECT-only, and blocking seven destructive keywords

PYTHON
@tool
def validate_sql_query(query: str):
    """Validate SQL query for safety and syntax before execution.
        Returns 'Valid: <query>' if safe or 'Error: <message>' if unsafe."""
    
    clean_query = query.strip()

    # remove sql code block
    clean_query = re.sub(r'```sql\s*', '', clean_query, flags=re.IGNORECASE)
    clean_query = re.sub(r'```\s*', '', clean_query, flags=re.IGNORECASE)

    clean_query = clean_query.strip().rstrip(';')

    if not clean_query.lower().startswith('select'):
        return "Error: only 'select' statements are allowed."
    
    # Check 2: Block dangerous SQL keywords
    dangerous_keywords = ['INSERT', 'UPDATE', 'DELETE', 'ALTER', 'DROP', 'CREATE', 'TRUNCATE']
    query_upper = clean_query

    for keyword in dangerous_keywords:
        if keyword in query_upper:
            return f"Error: {keyword} operations are not allowed."
        
    print("[TOOL] Your sql query is validated. Passed!")
    return clean_query

Tool 4 — execute_sql_query

Executes a validated query against the database. Calls validate_sql_query internally before running:

PYTHON
@tool
def execute_sql_query(sql_query: str):
    """Execute a validated SQL query and return results.
    Only use this after validating the query for safety."""


    query = validate_sql_query.invoke(sql_query)
    if query.startswith('Error:'):
        return f"Query '{sql_query}' validation failed with Error: {query}"
    
    result = db.run(query)

    if result:
        return  f"Query Results: {result}"
    
    else:
        return f"Query Executed Sucessfully but No Result was Found!"

Tool 5 — fix_sql_error

When a query fails validation or execution, this tool generates a corrected version using the error message and original question as context:

PYTHON
@tool
def fix_sql_error(original_query: str, error_message: str, question):
    """Fix a failed SQL query by analyzing the error and generating a corrected version.
        Use this when validation or execution fails."""
    
    fix_prompt = f"""The following SQL query failed:
                    Query: {original_query}
                    Error: {error_message}
                    Original Question: {question}

                    Database Schema:
                    {SCHEMA}

                    Analyze the error and provide a corrected SQL query that:
                    1. Fixes the specific error mentioned
                    2. Still answers the original question
                    3. Uses only valid table and column names from the schema
                    4. Follows SQLite syntax rules

                    Return only the corrected SQL query, nothing else."""
    
    response = llm.invoke(fix_prompt)
    query = response.content.strip()

    print(f"[TOOL] Generated fixed SQL Query.")

    return query

Agent Node and State

PYTHON
class AgentState(TypedDict):
    messages: Annotated[list, operator.add]

tools = [
    get_database_schema,
    generate_sql_query,
    execute_sql_query,
    fix_sql_error
]

llm_with_tools = llm.bind_tools(tools)

The agent node prepends a system prompt that defines the explicit SQL workflow — schema retrieval, query generation, execution, and error fixing:

PYTHON
def agent_node(state: AgentState):
    
    system_prompt = f"""You are an expert SQL analyst working with an employees database.

                    Database Schema:
                    {SCHEMA}

                    Your workflow for answering questions:
                    1. Use `get_database_schema` first to understand available tables and columns (if needed)
                    2. Use `generate_sql_query` to create SQL based on the question
                    3. Use `execute_sql_query` to run the validated query
                    4. If there's an error, use `fix_sql_error` to correct it and try again (up to 3 times)

                    Rules:
                    - Always follow the workflow step by step
                    - If a query fails, use the fix tool and try again
                    - Provide clear, informative answers
                    - Be precise with table and column names
                    - Handle errors gracefully and try to fix them
                    - If you fail after 3 attempts, explain what went wrong

                    Available tools:
                    - get_database_schema: Get table structure info
                    - generate_sql_query: Create SQL from question
                    - execute_sql_query: Run the query
                    - fix_sql_error: Fix failed queries

                    Remember: Always validate queries before executing them for safety."""
    

    messages = [SystemMessage(system_prompt)] + state['messages']

    response = llm_with_tools.invoke(messages)

    return {'messages': [response]}

Router Logic

PYTHON
def should_continue(state: AgentState):

    last_message = state['messages'][-1]

    if hasattr(last_message, 'tool_calls') and last_message.tool_calls:
        print("[TOOL] Calling the tool")
        for tc in last_message.tool_calls:
            print(f"Callling tool: '{tc['name']}' with Args: '{tc['args']}'")

        return 'tools'

    else:
        print("[AGENT] Agent is processing your request...")
        return END

Build the Graph

Diagram showing the LangGraph loop where a conditional edge cycles the agent and tools until the query resolves

PYTHON
def create_sql_agent():

    builder = StateGraph(AgentState)

    # add nodes
    builder.add_node('agent', agent_node)
    builder.add_node('tools', ToolNode(tools))

    # add edges
    builder.add_edge(START, 'agent')
    builder.add_edge('tools', 'agent')

    # add router
    builder.add_conditional_edges('agent', should_continue, ['tools', END])

    graph = builder.compile()

    return graph

agent = create_sql_agent()
agent

End-to-End Demos

Employee Count

PYTHON
query = "how many employees are there"
result = agent.invoke({'messages': [query]})

Verify the result directly:

PYTHON
db.run('SELECT COUNT(*) FROM employees;')
PYTHON
result['messages'][-1].pretty_print()

The agent called generate_sql_query to produce SELECT COUNT(*) FROM employees;, validated it, executed it, and reported 300,024 employees.

Average Salary by Department

PYTHON
query = "What is the average salary of each department show me top 5 department?"
result = agent.invoke({'messages': [query]})
result['messages'][-1].pretty_print()

The agent generates a multi-table JOIN query across employees, dept_emp, departments, and salaries tables, groups by department, orders by average salary descending, and limits to 5 results.

Top Paid Employees

PYTHON
query = "Show me the top 5 highest paid employees with their title and salaries?"
result = agent.invoke({'messages': [query]})
result['messages'][-1].pretty_print()

This query requires joining employees, salaries, and titles — the agent generates the correct three-table join with ORDER BY salary DESC LIMIT 5.


The SQL Agent Workflow

Diagram showing the SQL workflow: schema, generate, and execute, with failed queries fixed and retried automatically


What You Built

In this lesson you built a Text-to-SQL agent with LangGraph:

Tool Role Safety
get_database_schema Inspect table structure before generating queries Read-only
generate_sql_query LLM generates SELECT query from natural language SELECT only (instructed)
validate_sql_query Regex validation — blocks 7 destructive keywords Enforced by code, not LLM
execute_sql_query Runs validated query against SQLite Calls validate internally
fix_sql_error LLM rewrites failed queries with error context SELECT only (instructed)

Key design decisions:

  • Safety is enforced in codevalidate_sql_query uses regex, not LLM judgment
  • Self-healing — the fix_sql_error + retry loop handles LLM SQL generation mistakes automatically
  • Schema in system prompt — the agent always has full schema context, reducing hallucinated column names
  • LangGraph workflow — the StateGraph manages the agent-tool loop cleanly with conditional edges

Found this useful? Keep building with me.

New tutorials every week on YouTube — or go deeper with a full structured course.

Find this tutorial useful?

Subscribe to our YouTube channels for more practical production walk-throughs.

Discussion & Comments