This lesson builds a complete chatbot UI on top of the RunnableWithMessageHistory pattern from the Chat Message Memory guide.
The app adds:
- A Streamlit web interface with a real chat layout (
st.chat_message,st.chat_input) - Token-by-token streaming so users see the response as it is generated
- A user ID input to support multiple independent sessions
- A "Start New Conversation" button to wipe history and begin fresh
Prerequisites: All previous lessons completed. Install Streamlit: pip install streamlit. Ollama running with qwen3.
Full Application Code
Save this as chat_stream.py and run with streamlit run chat_stream.py.
# chat_stream.py
import streamlit as st
from dotenv import load_dotenv
from langchain_ollama import ChatOllama
from langchain_core.prompts import (
SystemMessagePromptTemplate,
HumanMessagePromptTemplate,
ChatPromptTemplate,
MessagesPlaceholder
)
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import SQLChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
load_dotenv('./../.env')
st.title("Make Your Own Chatbot")
st.write("Chat with me! Catch me at https://youtube.com/kgptalkie")
base_url = "http://localhost:11434"
model = 'qwen3'
user_id = st.text_input("Enter your user id", "default_user")
def get_session_history(session_id):
return SQLChatMessageHistory(session_id, connection="sqlite:///chat_history.db")
if "chat_history" not in st.session_state:
st.session_state.chat_history = []
if st.button("Start New Conversation"):
st.session_state.chat_history = []
history = get_session_history(user_id)
history.clear()
for message in st.session_state.chat_history:
with st.chat_message(message['role']):
st.markdown(message['content'])
### LLM Setup
llm = ChatOllama(base_url=base_url, model=model)
system = SystemMessagePromptTemplate.from_template("You are helpful assistant.")
human = HumanMessagePromptTemplate.from_template("{input}")
messages = [system, MessagesPlaceholder(variable_name='history'), human]
prompt = ChatPromptTemplate(messages=messages)
chain = prompt | llm | StrOutputParser()
runnable_with_history = RunnableWithMessageHistory(
chain,
get_session_history,
input_messages_key='input',
history_messages_key='history'
)
def chat_with_llm(session_id, input):
for output in runnable_with_history.stream(
{'input': input},
config={'configurable': {'session_id': session_id}}
):
yield output
prompt = st.chat_input("What is up?")
if prompt:
st.session_state.chat_history.append({'role': 'user', 'content': prompt})
with st.chat_message("user"):
st.markdown(prompt)
with st.chat_message("assistant"):
response = st.write_stream(chat_with_llm(user_id, prompt))
st.session_state.chat_history.append({'role': 'assistant', 'content': response})
Code Walkthrough
1. Load Environment Variables
load_dotenv('./../.env')
Loads LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, or any other keys from .env. On Windows, adjust the path to '.env' if your .env file is in the same directory as the script.
2. Page Header and User ID Input
st.title("Make Your Own Chatbot")
st.write("Chat with me! Catch me at https://youtube.com/kgptalkie")
user_id = st.text_input("Enter your user id", "default_user")
user_id is the session_id passed to SQLChatMessageHistory. Different users get completely isolated conversation histories stored in separate rows in chat_history.db.
3. History Factory
def get_session_history(session_id):
return SQLChatMessageHistory(session_id, connection="sqlite:///chat_history.db")
Creates (or opens) a SQLite database named chat_history.db in the script's directory. Returns a history object scoped to the given session_id.
4. Streamlit Session State for Display History
if "chat_history" not in st.session_state:
st.session_state.chat_history = []
Streamlit reruns the entire script on every user action. st.session_state persists values across reruns. chat_history is a Python list of {'role': ..., 'content': ...} dicts used only to redraw the chat bubbles — it is separate from the SQL history.
5. Start New Conversation Button
if st.button("Start New Conversation"):
st.session_state.chat_history = []
history = get_session_history(user_id)
history.clear()
Two things happen on click:
- Display history (
st.session_state.chat_history) is cleared — the chat bubbles disappear - SQL history (
history.clear()) is wiped — the model loses all context for thissession_id
Note
If the user changes the user_id text input after a conversation, a new empty session starts automatically — no button click needed. The old session's SQL history remains intact in the database.
6. Redrawing Existing Chat Bubbles
for message in st.session_state.chat_history:
with st.chat_message(message['role']):
st.markdown(message['content'])
On every rerun, this loop renders all prior messages from session_state into the chat layout. Without this, the conversation would disappear on every new message.
7. LLM and Chain Setup
llm = ChatOllama(base_url=base_url, model=model)
system = SystemMessagePromptTemplate.from_template("You are helpful assistant.")
human = HumanMessagePromptTemplate.from_template("{input}")
messages = [system, MessagesPlaceholder(variable_name='history'), human]
prompt = ChatPromptTemplate(messages=messages)
chain = prompt | llm | StrOutputParser()
MessagesPlaceholder(variable_name='history')reserves the slot where conversation history is injected between the system message and the current user input- The chain is
prompt | llm | StrOutputParser()— same as all previous LCEL examples
8. Wrapping with Memory
runnable_with_history = RunnableWithMessageHistory(
chain,
get_session_history,
input_messages_key='input',
history_messages_key='history'
)
input_messages_key='input'— the{'input': ...}dict key that holds the user's current messagehistory_messages_key='history'— must matchMessagesPlaceholder'svariable_name
9. Streaming Generator
def chat_with_llm(session_id, input):
for output in runnable_with_history.stream(
{'input': input},
config={'configurable': {'session_id': session_id}}
):
yield output
.stream() (not .invoke()) returns an iterator that yields string chunks as tokens arrive from the LLM. The function is a generator (yield) so it can be consumed lazily by st.write_stream().
Tip
This is the key difference from the notebook version: .stream() instead of .invoke(). The SQL history is still written after the full response is assembled — streaming only affects what the user sees in real time.
10. Chat Input and Response
prompt = st.chat_input("What is up?")
if prompt:
st.session_state.chat_history.append({'role': 'user', 'content': prompt})
with st.chat_message("user"):
st.markdown(prompt)
with st.chat_message("assistant"):
response = st.write_stream(chat_with_llm(user_id, prompt))
st.session_state.chat_history.append({'role': 'assistant', 'content': response})
Step by step:
st.chat_inputrenders the text box at the bottom of the page and returns the submitted text (orNoneif nothing submitted)- The user message is appended to
session_state.chat_historyand displayed immediately st.write_stream()consumes the generator and renders tokens one-by-one into the assistant chat bubble as they arriveresponseis the fully assembled string returned byst.write_stream()after streaming completes — it is then saved tosession_statefor redrawing on the next rerun
Running the App
# Windows
streamlit run chat_stream.py
# Linux / macOS
streamlit run chat_stream.py
Open http://localhost:8501 in your browser. The app starts with an empty chat. Type a message, press Enter, and watch the assistant stream its reply token by token.
Important
The user_id text input is evaluated before any chat is rendered. If you change user_id mid-conversation, the chat_history in session_state still shows the old messages visually — but the model will use the new session's SQL history. Click "Start New Conversation" after changing user_id to sync them.
Architecture Overview
Browser (Streamlit UI)
│
├── st.text_input(user_id) ← selects the session
├── st.button("Start New") ← clears SQL + display history
├── st.chat_message (loop) ← redraws prior messages
├── st.chat_input ← captures new user message
│
▼
chat_with_llm(user_id, prompt) ← generator using .stream()
│
▼
RunnableWithMessageHistory
├── get_session_history(user_id) ← loads from SQLite
│ └── SQLChatMessageHistory ← chat_history.db
│
├── ChatPromptTemplate
│ ├── SystemMessage
│ ├── MessagesPlaceholder ← history injected here
│ └── HumanMessage {input}
│
├── ChatOllama (qwen3) ← streams tokens
└── StrOutputParser ← yields string chunks
│
▼
st.write_stream() ← renders chunks in real-time
What You Built
You now have a complete, production-style chatbot application:
| Feature | Implementation |
|---|---|
| Streaming output | runnable_with_history.stream() + st.write_stream() |
| Persistent memory | SQLChatMessageHistory → chat_history.db |
| Multi-user sessions | session_id from st.text_input |
| Conversation reset | history.clear() + session_state.chat_history = [] |
| History display on rerun | Loop over st.session_state.chat_history |
| Prompt template with history | MessagesPlaceholder in ChatPromptTemplate |
The same pattern — RunnableWithMessageHistory + SQLChatMessageHistory + a Streamlit streaming UI — scales directly to production chatbots backed by PostgreSQL, Redis, or any other BaseChatMessageHistory implementation LangChain supports.