Output parsers bridge the gap between an LLM's free-form text response and the structured Python objects your application actually needs. Instead of writing regex or json.loads() manually, you define a schema once and let the parser handle formatting instructions, parsing, and validation.
Every LangChain output parser implements two core methods:
- get_format_instructions(): returns a string of instructions telling the model exactly how to format its response
- parse(): takes the model's raw string response and returns a structured Python object
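To make the two-method contract concrete, here is a minimal hand-rolled sketch. `YesNoParser` is a hypothetical class written only for illustration: it does not subclass any LangChain base class and simply mirrors the two method names described above.

```python
class YesNoParser:
    """Hypothetical parser that turns a one-word model answer into a bool."""

    def get_format_instructions(self) -> str:
        # Text you would paste into the prompt so the model knows the format.
        return "Answer with exactly one word: yes or no."

    def parse(self, text: str) -> bool:
        # Normalize whitespace, trailing punctuation, and case before matching.
        cleaned = text.strip().strip(".!").lower()
        if cleaned == "yes":
            return True
        if cleaned == "no":
            return False
        raise ValueError(f"Expected 'yes' or 'no', got: {text!r}")


parser_sketch = YesNoParser()
answer = parser_sketch.parse(" Yes.\n")  # raw model text in, Python object out
```

The built-in parsers covered below follow the same shape: one method produces prompt text, the other converts the reply into a Python value or raises when the reply does not fit.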
Prerequisites: LangChain installed with langchain-ollama and python-dotenv, and Ollama running with the qwen3 model available. Familiarity with LangChain chains is assumed; see the Chains guide.
Available Output Parsers
| Parser | Output Type | Notes |
|---|---|---|
| StrOutputParser | str | Extracts .content from AIMessage as plain text |
| PydanticOutputParser | Pydantic BaseModel instance | Validates against a schema; raises on invalid JSON |
| JsonOutputParser | dict | Returns a raw Python dict; less strict than Pydantic |
| CommaSeparatedListOutputParser | list[str] | Splits comma-separated model output into a Python list |
| DatetimeOutputParser | n/a | Not available in LangChain v1 |
Setup
from dotenv import load_dotenv
load_dotenv('.env')
True
On Linux/macOS: use load_dotenv('./../.env') if your .env is in a parent directory.
from langchain_ollama import ChatOllama
from langchain_core.prompts import (
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    ChatPromptTemplate,
    PromptTemplate,
)
base_url = "http://localhost:11434"
model = 'qwen3'
llm = ChatOllama(base_url=base_url, model=model)
Pydantic Output Parser
PydanticOutputParser validates the model's JSON response against a Pydantic schema. If the model returns malformed JSON or omits a required field, parsing fails: the parser raises an OutputParserException that wraps the underlying JSON or Pydantic validation error.
Defining the Schema
from typing import Optional
from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
class Joke(BaseModel):
    """Joke to tell user"""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")
    rating: Optional[int] = Field(description="The rating of the joke is from 1 to 10", default=None)
Creating the Parser
parser = PydanticOutputParser(pydantic_object=Joke)
Inspecting Format Instructions
get_format_instructions() generates the JSON schema description that you inject into the prompt so the model knows what format to produce:
instruction = parser.get_format_instructions()
print(instruction)
The output should be formatted as a JSON instance that conforms to the JSON schema below.
As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
Here is the output schema:
{"description": "Joke to tell user", "properties": {"setup": {"description": "The setup of the joke", "title": "Setup", "type": "string"}, "punchline": {"description": "The punchline of the joke", "title": "Punchline", "type": "string"}, "rating": {"anyOf": [{"type": "integer"}, {"type": "null"}], "default": null, "description": "The rating of the joke is from 1 to 10", "title": "Rating"}}, "required": ["setup", "punchline"]}
Building the Prompt with Injected Instructions
Use partial_variables to bake the format instructions into the prompt template so they are always included automatically:
prompt = PromptTemplate(
template='''
Answer the user query with a joke. Here is your formatting instruction.
{format_instruction}
Query: {query}
Answer:''',
    input_variables=['query'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)
chain = prompt | llm
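The idea behind partial_variables can be illustrated without LangChain at all. The sketch below is a standard-library analogy, not how PromptTemplate is implemented: `render`, `TEMPLATE`, and `FORMAT_INSTRUCTION` are hypothetical names, and `functools.partial` stands in for the pre-filled variable.

```python
from functools import partial

TEMPLATE = (
    "Answer the user query with a joke. Here is your formatting instruction.\n"
    "{format_instruction}\n"
    "Query: {query}\n"
    "Answer:"
)

# Stand-in for parser.get_format_instructions(); note the literal braces.
FORMAT_INSTRUCTION = 'Return JSON like {"setup": "...", "punchline": "..."}'


def render(template: str, **variables: str) -> str:
    # str.format only interprets braces in the template itself,
    # so braces inside a substituted value are inserted verbatim.
    return template.format(**variables)


# Fix format_instruction once; callers only supply the query afterwards.
render_joke_prompt = partial(render, TEMPLATE, format_instruction=FORMAT_INSTRUCTION)

prompt_text = render_joke_prompt(query="Tell me a joke about cats")
```

Each call now needs only the query, which mirrors how the chain above is invoked with a single 'query' key.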
Checking Raw LLM Output (Without Parser)
Run the chain without the parser first to see the raw JSON string the model produces:
output = chain.invoke({'query': 'Tell me a joke about the cat'})
print(output.content)
{
"setup": "Why did the cat bring a ladder to the party?",
"punchline": "They wanted to climb the social ladder!",
"rating": 7
}
Adding the Parser to the Chain
Append the parser to convert the raw JSON string into a validated Joke instance:
chain = prompt | llm | parser
output = chain.invoke({'query': 'Tell me a joke about the dogs'})
print(output)
setup='Why did the dog bring a ladder to the park?' punchline='He heard the treats were on a high shelf!' rating=8
output is now a Joke Pydantic object. Access fields directly:
output.setup # "Why did the dog bring a ladder to the park?"
output.punchline # "He heard the treats were on a high shelf!"
output.rating # 8
Structured Output with .with_structured_output()
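.with_structured_output() is a simpler alternative that skips the manual prompt injection step. You attach the schema directly to the LLM, and the integration passes it to the model for you, typically through the provider's tool-calling or JSON-schema support rather than through text added to your prompt.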
Baseline (Unstructured)
output = llm.invoke('Tell me a joke about the cat')
print(output.content)
Why did the cat join a band?
Because it wanted to be in the **meow-sic** industry! 🎵🐾
*(Bonus: The band's name was "Whisker & The Purr-fectionists.")* 😸
With Structured Output
structured_llm = llm.with_structured_output(Joke)
output = structured_llm.invoke('Tell me a joke about the cat')
print(output)
setup='Why did the cat sit on the computer?' punchline='To keep the mouse away!' rating=None
Note
rating is None here because the field is Optional and the model didn't include it. PydanticOutputParser with explicit format instructions tends to produce more complete output including optional fields. .with_structured_output() is faster to set up but gives the model less explicit guidance.
Tip
.with_structured_output() also accepts a TypedDict class or a JSON Schema dict as its argument — not just Pydantic models.
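As a rough sketch of those two alternatives, the Joke schema could be written as a TypedDict or as a JSON Schema dict. `JokeDict` and `joke_schema` are hypothetical names, and the `Annotated[type, default, description]` layout follows the convention used in LangChain's documentation; the call to the model is left as a comment because it needs a running LLM.

```python
from typing import Annotated, Optional, TypedDict

# Note: on Python versions before 3.12, Pydantic-based tooling may ask for
# typing_extensions.TypedDict instead of typing.TypedDict.


class JokeDict(TypedDict):
    """Joke to tell user"""
    setup: Annotated[str, ..., "The setup of the joke"]
    punchline: Annotated[str, ..., "The punchline of the joke"]
    rating: Annotated[Optional[int], None, "The rating of the joke is from 1 to 10"]


# The same structure expressed as a plain JSON Schema dict.
joke_schema = {
    "title": "Joke",
    "description": "Joke to tell user",
    "type": "object",
    "properties": {
        "setup": {"type": "string", "description": "The setup of the joke"},
        "punchline": {"type": "string", "description": "The punchline of the joke"},
        "rating": {"type": "integer", "description": "The rating of the joke is from 1 to 10"},
    },
    "required": ["setup", "punchline"],
}

# Usage, assuming `llm` from the Setup section:
#   structured_llm = llm.with_structured_output(JokeDict)   # or joke_schema
#   output = structured_llm.invoke('Tell me a joke about the cat')
```

With either of these schemas the result is expected to be a plain dict rather than a Joke instance, so fields are read with output['setup'] instead of output.setup.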
JSON Output Parser
JsonOutputParser returns a raw Python dict without Pydantic validation. Use it when you want structured output but don't need field-level type enforcement.
from langchain_core.output_parsers import JsonOutputParser
parser = JsonOutputParser(pydantic_object=Joke)
print(parser.get_format_instructions())
The output should be formatted as a JSON instance that conforms to the JSON schema below.
As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
Here is the output schema:
{"description": "Joke to tell user", "properties": {"setup": {...}, "punchline": {...}, "rating": {...}}, "required": ["setup", "punchline"]}
Building and Running the JSON Chain
prompt = PromptTemplate(
template='''
Answer the user query with a joke. Here is your formatting instruction.
{format_instruction}
Query: {query}
Answer:''',
    input_variables=['query'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)
chain = prompt | llm
output = chain.invoke({'query': 'Tell me a joke about the cat'})
print(output.content)
{
"setup": "Why did the cat sit on the computer?",
"punchline": "Because it heard there was a mouse in the keyboard!",
"rating": 7
}
Append the JsonOutputParser to get a Python dict directly:
chain = prompt | llm | parser
output = chain.invoke({'query': 'Tell me a joke about the cat'})
print(output)
{'setup': 'Why did the cat sit on the keyboard?', 'punchline': 'Because it wanted to keep the mouse in the house.', 'rating': 8}
Access values like any Python dict:
output['setup'] # 'Why did the cat sit on the keyboard?'
output['rating'] # 8
Note
JsonOutputParser also supports streaming — it incrementally yields partial dicts as tokens arrive, unlike PydanticOutputParser which must wait for the complete response to validate.
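To give a feel for what "partial dicts" means, here is a simplified standard-library sketch of the idea. It is not LangChain's implementation: `close_partial_json` and `stream_partial_dicts` are hypothetical helpers that balance a truncated JSON prefix and re-parse it after every token.

```python
import json
from typing import Iterable, Iterator


def close_partial_json(text: str) -> str:
    """Append the quote and brackets needed to balance a truncated JSON prefix."""
    closers = []       # closing brackets still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()
    suffix = '"' if in_string else ""
    return text + suffix + "".join(reversed(closers))


def stream_partial_dicts(tokens: Iterable[str]) -> Iterator[dict]:
    """Yield a dict each time the accumulated text parses to something new."""
    buffer = ""
    last = None
    for token in tokens:
        buffer += token
        try:
            candidate = json.loads(close_partial_json(buffer))
        except json.JSONDecodeError:
            continue  # e.g. a half-written key; wait for more tokens
        if isinstance(candidate, dict) and candidate != last:
            last = candidate
            yield candidate


# Simulated token stream from a model.
tokens = ['{"setup": "Why', ' did the cat', ' nap?", "rat', 'ing": 7', '}']
partials = list(stream_partial_dicts(tokens))
```

Here `partials` grows from a dict with a truncated setup string to the complete object. With the real parser you would iterate over chain.stream(...) instead of a hard-coded token list.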
CSV Output Parser
CommaSeparatedListOutputParser is the simplest parser for list-type outputs. The model is instructed to return a comma-separated string which the parser splits into a Python list[str].
from langchain_core.output_parsers import CommaSeparatedListOutputParser
parser = CommaSeparatedListOutputParser()
print(parser.get_format_instructions())
Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`
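The splitting step can be approximated with Python's csv module. This is a standard-library sketch rather than the parser's actual code, and `split_comma_separated` is a hypothetical helper; it shows why CSV-style parsing is more robust than a bare text.split(',') when a value itself contains a comma.

```python
import csv
from io import StringIO


def split_comma_separated(text: str) -> list[str]:
    """Split comma-separated text into trimmed items, honoring double quotes."""
    # skipinitialspace lets a quoted value follow ", " and still be recognized.
    reader = csv.reader(StringIO(text), skipinitialspace=True)
    return [item.strip() for row in reader for item in row]


simple = split_comma_separated("foo, bar, baz")

# A quoted value keeps its internal comma instead of being split in two.
quoted = split_comma_separated('NLP, "large, pretrained models", LLM')
```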
Building the CSV Chain
format_instruction = parser.get_format_instructions()
prompt = PromptTemplate(
template='''
Answer the user query with a list of values. Here is your formatting instruction.
{format_instruction}
Query: {query}
Answer:''',
    input_variables=['query'],
    partial_variables={'format_instruction': format_instruction}
)
chain = prompt | llm | parser
output = chain.invoke({'query': 'generate my website seo keywords. I have content about the NLP and LLM.'})
print(output)
The output is a Python list of strings, for example:
[
"NLP",
"LLM",
"natural language processing",
"large language models",
"AI",
"machine learning",
"text processing",
"deep learning",
"transformers",
"chatbots"
]
On Linux/macOS: all code above runs identically — no OS-specific differences.
Quick Reference
Parser Comparison
| Parser | Import | Output type | Best for |
|---|---|---|---|
| StrOutputParser | langchain_core.output_parsers | str | Plain text extraction |
| PydanticOutputParser | langchain_core.output_parsers | Pydantic model | Validated typed structs |
| JsonOutputParser | langchain_core.output_parsers | dict | Structured output with streaming |
| CommaSeparatedListOutputParser | langchain_core.output_parsers | list[str] | Simple enumeration |
Two Ways to Get Structured Output
# Method 1 — PydanticOutputParser (explicit, more control)
parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
    template="...\n{format_instruction}\n\nQuery: {query}\nAnswer:",
    input_variables=['query'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)
chain = prompt | llm | parser
output = chain.invoke({'query': 'Tell me a joke about cats'})
# output is a Joke instance
# Method 2 — with_structured_output (simpler, less control)
structured_llm = llm.with_structured_output(Joke)
output = structured_llm.invoke('Tell me a joke about cats')
# output is a Joke instance
Key Imports
from langchain_core.output_parsers import (
    StrOutputParser,
    PydanticOutputParser,
    JsonOutputParser,
    CommaSeparatedListOutputParser,
)
from pydantic import BaseModel, Field
from typing import Optional
What You Built
In this lesson you went from receiving the raw text of an AIMessage to getting typed, validated Python objects directly out of an LCEL chain.
Here is what each parser gives you:
- PydanticOutputParser: the most explicit approach. It injects a JSON schema into the prompt via get_format_instructions(), then validates and deserializes the response into a Pydantic model. Best when field-level validation matters.
- .with_structured_output(): the fastest path. Bind the schema to the LLM directly and skip manual prompt injection. Best for quick prototyping or when the model reliably follows implicit instructions.
- JsonOutputParser: returns a raw Python dict without Pydantic validation. Supports streaming, making it the right choice when you want partial JSON results as tokens arrive.
- CommaSeparatedListOutputParser: the simplest option. It instructs the model to return a comma-separated string and splits it into a list[str]. Ideal for keyword generation, tag lists, or any enumeration.
The partial_variables pattern — baking get_format_instructions() into the PromptTemplate once — means your chain always sends the correct schema to the model without you needing to pass it manually on each .invoke() call.