Deploy Resume Parsing with Streamlit

Wrap the two-stage LLM resume parsing pipeline into an interactive web application using Streamlit, enabling users to upload PDFs and view extracted JSON data in real time.

Jun 4, 2026 at 10:30 AM · 4 min read

Topics You Will Master

Understanding the difference between script-based PDF processing and handling uploaded files in memory
Reading byte arrays from a Streamlit file_uploader directly into PyMuPDF without saving to disk
Integrating the modular two-stage LLM pipeline (ask_llm and validate_json) into a web app
Enhancing UX with Streamlit's st.spinner() to provide feedback during long LLM calls
Rendering the final validated JSON directly in the browser

Best For

Developers who have built LLM pipelines in Jupyter Notebooks and need to expose them as interactive, production-ready web interfaces for end users.

Expected Outcome

A complete, functioning app.py Streamlit application that accepts PDF uploads, runs the LangChain resume parsing pipeline, and displays the structured JSON extraction in the browser.

In the previous lesson, we built a highly reliable two-stage LLM pipeline that extracts structured JSON data from PDF resumes. However, running a Jupyter Notebook is not a viable production deployment.

In this lesson, we wrap that exact pipeline in a Streamlit web application. Because we separated our LLM logic into scripts/llm.py, deploying the app requires writing fewer than 40 lines of UI code.

Prerequisites: Ensure you have completed the Resume Parsing lesson. You will need streamlit, pymupdf, and the scripts/llm.py module from the previous section.

BASH
pip install streamlit

The Application Code (app.py)

Create an app.py file in the same directory as your scripts/ folder.

1. Imports and Basic Setup

PYTHON
import streamlit as st
import pymupdf
from scripts.llm import ask_llm, validate_json

st.title("Resume Parsing")
st.write("Upload a resume in PDF format to extract information")

We import streamlit for the UI, pymupdf to handle the PDF text extraction, and our two custom LangChain functions (ask_llm and validate_json) from the scripts.llm module.

2. Handling File Uploads in Memory

In a notebook, we loaded PDFs from a hardcoded file path on disk. In a web app, users upload files directly from their browser. To maximize performance and security, we process the uploaded file in memory as a byte stream rather than saving it to the server's hard drive.

PYTHON
# Restrict the picker to PDF files
uploaded_file = st.file_uploader("Choose a file", type="pdf")

if uploaded_file is not None:
    # Get the full contents as bytes, regardless of the current read position
    pdf_bytes = uploaded_file.getvalue()

    # Open the byte stream with PyMuPDF; the with-block closes the document
    with pymupdf.open(stream=pdf_bytes, filetype="pdf") as pdf:
        # Extract text from every page, separated by blank lines
        context = "\n\n".join(page.get_text() for page in pdf)
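
Before handing uploaded bytes to PyMuPDF, it can help to reject input that is obviously not a PDF. The following is a minimal sketch, not part of the lesson code: looks_like_pdf is a hypothetical helper, and the 10 MB cap is an illustrative assumption.

```python
PDF_MAGIC = b"%PDF-"  # PDF files begin with this header
MAX_BYTES = 10 * 1024 * 1024  # illustrative cap (assumption), not a Streamlit default


def looks_like_pdf(data: bytes, max_bytes: int = MAX_BYTES) -> bool:
    """Cheap sanity check on uploaded bytes (hypothetical helper)."""
    # Reject empty uploads and anything above the size cap
    if not data or len(data) > max_bytes:
        return False
    # Some generators emit a few stray bytes before the header,
    # so search the first 1 KB instead of only the first 5 bytes
    return PDF_MAGIC in data[:1024]
```

In app.py, a check like this could run right after the bytes are read, calling st.error() and st.stop() when it fails. It is only a cheap filter: a file that passes may still be malformed, so PyMuPDF can still raise when opening it.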

3. Executing the Pipeline with UX Feedback

LLM calls take time, especially local ones running on Ollama. If the app appears frozen while processing, users will assume it broke and refresh the page.

We wrap our pipeline calls in st.spinner() blocks to provide visual feedback while the user waits.

PYTHON
question = """You are tasked with parsing a job resume. Your goal is to extract relevant information in a valid structured 'JSON' format. 
                Do not write preambles or explanations."""

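# Show the button only after a file is uploaded, so `context` is always defined
if uploaded_file is not None and st.button("Parse Resume"):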
    # Run the first LLM pass (Semantic Extraction)
    with st.spinner("Parsing Resume..."):
        response = ask_llm(context=context, question=question)

    # Run the second LLM pass (JSON Validation)
    with st.spinner("Validating JSON..."):
        response = validate_json(response)
    
    # Display the final output
    st.write("**Extracted Information**")
    st.write(response)

    st.write("You can copy the JSON output and use it in your application.")

    # Show a celebration animation on success!
    st.balloons()
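
To hand the parsed result to another program, it usually needs to be serialized to a JSON string first. The sketch below assumes the value returned by validate_json is either a dict (as a JSON output parser would typically produce) or JSON text. to_pretty_json is a hypothetical helper, not part of the lesson code.

```python
import json
from typing import Any


def to_pretty_json(result: Any) -> str:
    """Return an indented JSON string for a dict/list or for JSON text."""
    if isinstance(result, str):
        # Re-parse text so malformed JSON fails here with json.JSONDecodeError
        result = json.loads(result)
    # ensure_ascii=False keeps accented names readable instead of \u escapes
    return json.dumps(result, indent=2, ensure_ascii=False)
```

The returned string could then be passed to st.download_button("Download JSON", data=to_pretty_json(response), file_name="resume.json", mime="application/json") so users can save the output instead of copying it by hand.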

Running the Application

To start the server, run the following command in your terminal from the directory containing app.py:

BASH
streamlit run app.py

Streamlit will launch a local web server (typically at http://localhost:8501) and automatically open it in your default browser.

The User Flow

  1. The user clicks "Browse files" and selects a PDF resume.
  2. The user clicks the "Parse Resume" button.
  3. The UI shows a spinning "Parsing Resume..." indicator while the StrOutputParser chain extracts the text.
  4. The UI changes to "Validating JSON..." while the JsonOutputParser chain strictly formats the output.
  5. Balloons animate on the screen, and the structured JSON dictionary is rendered cleanly on the page.

Deployment Considerations

While this app runs perfectly on your local machine using Ollama, deploying it to a public cloud (like AWS, Render, or Streamlit Community Cloud) requires a few adjustments:

  1. Local vs. Cloud LLMs: Ollama runs locally on your machine. If you deploy this Streamlit app to the cloud, you must either deploy Ollama to a cloud server (which requires expensive GPU instances) or change the model in scripts/llm.py to a managed API like OpenAI (ChatOpenAI), Anthropic (ChatAnthropic), or AWS Bedrock.
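  2. State Management: Streamlit runs the script separately for each browser session, so ordinary variables are not shared between users. Use st.session_state to keep a parsed result across reruns within one session, and remember that values cached with @st.cache_data are shared across sessions, so use caching only for data that is safe to share.
  3. File Size Limits: Streamlit caps uploads at 200 MB by default (the server.maxUploadSize setting). If the app sits behind a reverse proxy such as Nginx, make sure the proxy also accepts files large enough for typical PDF resumes (e.g., 5-10 MB).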
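
On the state management point: Streamlit re-runs app.py from top to bottom on every interaction, so a parsed result disappears on the next rerun unless it is stored somewhere. One possible pattern is to keep results in a per-session mapping keyed by a hash of the uploaded bytes. In this sketch, get_or_parse and fake_parse are hypothetical names, and a plain dict stands in for st.session_state, which supports the same membership test and item assignment.

```python
import hashlib
from typing import Any, Callable, MutableMapping


def get_or_parse(
    state: MutableMapping[str, Any],
    pdf_bytes: bytes,
    parse: Callable[[bytes], Any],
) -> Any:
    """Run parse(pdf_bytes) once per distinct file and reuse the stored result."""
    key = "parsed:" + hashlib.sha256(pdf_bytes).hexdigest()
    if key not in state:
        state[key] = parse(pdf_bytes)
    return state[key]


# Demo with a plain dict and a fake parser that records how often it runs
calls = []


def fake_parse(data: bytes) -> dict:
    calls.append(len(data))
    return {"size": len(data)}


session = {}
first = get_or_parse(session, b"same file", fake_parse)
second = get_or_parse(session, b"same file", fake_parse)
```

In the app, state would be st.session_state and parse would wrap the two pipeline calls, so the same resume is not sent through the LLM again on later reruns of that session.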

What You Built

In this final lesson, you completed the full journey from raw data to a working web application:

  • You designed a modular LLM architecture, separating backend LangChain logic (llm.py) from frontend UI logic (app.py).
  • You processed in-memory file uploads using PyMuPDF instead of reading from disk.
  • You added visual progress feedback using st.spinner() to keep users informed during long LLM inferences.
  • You ran a working Streamlit app locally that parses unstructured resumes into structured, machine-readable JSON in real time.
