Gemini 3 and LangChain Bootcamp

Google's Gemini 3 model family offers state-of-the-art reasoning, million-token context windows, multimodal understanding, and built-in execution tools. Through its LangChain integration, developers can use these capabilities to build structured, private, and highly adaptive agent systems.

This bootcamp walks through API key initialization, environment configuration, basic and streamed prompt execution, multimodal file analysis, tool calling, reasoning configuration, and context caching.

API Key and Environment Setup

To deploy Gemini 3 and track its calls, you must configure API access for both Google AI Studio and LangSmith.

Part 1: Creating a Gemini API Key

Go to Google AI Studio and sign in with your Google account.
Click "Get API Key" in the left sidebar, then click the "Create API Key" button.
Select an existing Google Cloud project or create a new one.
Generate the key, copy it immediately, and save it in a .env file in your project root:

PYTHON

GOOGLE_API_KEY=your_gemini_api_key_here

Part 2: Setting up LangSmith Tracing

Go to LangSmith and register an account.
Click your profile icon (top-right), navigate to "Settings", and click "API Keys" in the left sidebar.
Click "Create API Key" and give it a descriptive name (e.g., "Multi-Agent Deep RAG").
Save the generated key and trace parameters into your .env file:

PYTHON

LANGSMITH_API_KEY="your_api_key_here"
LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_PROJECT="multi-agent-deep-rag"

Testing the Configuration

Install the required packages:

BASH

pip install python-dotenv langchain-google-genai google-genai langsmith

The command is the same on Windows, Linux, and macOS.

Verify that the keys are loaded in your Python session:

PYTHON

import os
from dotenv import load_dotenv

load_dotenv()

google_api_key = os.getenv("GOOGLE_API_KEY")
langchain_api_key = os.getenv("LANGSMITH_API_KEY")

if google_api_key:
    print("Gemini API Key loaded successfully")
else:
    print("Gemini API Key not found")

if langchain_api_key:
    print("Langsmith API Key loaded successfully")
else:
    print("Langsmith API Key not found")

OUTPUT

Gemini API Key loaded successfully
Langsmith API Key loaded successfully
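
The two checks above can also be folded into one helper that reports every missing variable at once. This is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical, and it accepts any mapping so it can be exercised without touching the real environment.

```python
import os
from typing import Iterable, Mapping

REQUIRED_VARS = ("GOOGLE_API_KEY", "LANGSMITH_API_KEY")


def missing_env_vars(
    required: Iterable[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set")
```

Call `load_dotenv()` first, as in the snippet above, so that values from the .env file are visible in `os.environ`.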

Test connection to the model:

PYTHON

from google import genai

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Explain how AI works in a few words",
)

print(response.text)

OUTPUT

It processes massive amounts of data to **recognize patterns** and **make predictions**.

In short: **Data + Math = Predictions.**

Verify the LangSmith connection:

PYTHON

from langsmith import Client

client = Client()

try:
    projects = list(client.list_projects(limit=1))
    print("Langsmith connection successful!")
    print(f"Connected to project: {os.getenv('LANGCHAIN_PROJECT')}")
except Exception as e:
    print(f"Langsmith connection failed: {e}")

OUTPUT

Langsmith connection successful!
Connected to project: multi-agent-deep-rag

Google Gemini 3 Model Lineup

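This bootcamp uses three models, each tuned for a different latency, cost, and reasoning profile. Two belong to the Gemini 3 family; gemini-2.5-flash is the faster, cheaper model from the previous generation: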

Model	Context Window (In/Out)	Recommended Use Case
`gemini-3-pro-preview`	1M / 64k	Complex reasoning, code generation, structural analysis
`gemini-3-pro-image-preview`	65k / 32k	Image generation, image-to-image editing, graphic rendering
`gemini-2.5-flash`	1M / 8k	Fast, cost-efficient processing, standard text operations

Key Improvements

Thinking Level: Reasoning depth configuration (low or high).
Media Resolution: Configurable processing resolution per media asset (low, medium, high, ultra_high).
Thought Signatures: Automatic internal recording and preservation of the reasoning chain.
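
The allowed values listed above can be checked before a request is built. The following is a minimal sketch, assuming a hypothetical helper named `build_gemini_options`; the dictionary keys `thinking_level` and `media_resolution` simply mirror the feature names above and are not confirmed parameter names for any particular SDK version.

```python
from typing import Optional

# Allowed values as listed in the feature summary above.
THINKING_LEVELS = {"low", "high"}
MEDIA_RESOLUTIONS = {"low", "medium", "high", "ultra_high"}


def build_gemini_options(
    thinking_level: Optional[str] = None,
    media_resolution: Optional[str] = None,
) -> dict:
    """Validate the two settings and return only the ones that were provided."""
    options = {}
    if thinking_level is not None:
        if thinking_level not in THINKING_LEVELS:
            raise ValueError(f"thinking_level must be one of {sorted(THINKING_LEVELS)}")
        options["thinking_level"] = thinking_level
    if media_resolution is not None:
        if media_resolution not in MEDIA_RESOLUTIONS:
            raise ValueError(f"media_resolution must be one of {sorted(MEDIA_RESOLUTIONS)}")
        options["media_resolution"] = media_resolution
    return options


print(build_gemini_options(thinking_level="low", media_resolution="high"))
```

Rejecting an unsupported value locally gives a clearer error than waiting for the API to refuse the request.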

Basic Invocation and Token Streaming

Basic Message Structure

Invoke the model using system and user message formats:

PYTHON

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.messages import HumanMessage, SystemMessage

gemini3 = 'gemini-3-pro-preview'
gemini2 = 'gemini-2.5-flash'

system_msg = SystemMessage("You are a helpful AI Assistant")
query = HumanMessage("Explain the theory of relativity in simple terms")

messages = [system_msg, query]

model = ChatGoogleGenerativeAI(model=gemini3)
response = model.invoke(messages)

print(response.content)

OUTPUT

The Theory of Relativity, developed by Albert Einstein, is split into two parts:

1. **Special Relativity**: Time and space are relative and depend on how fast you are moving. The speed of light is the universal constant. As speed increases, time slows down (time dilation) and space contracts.
2. **General Relativity**: Gravity is not a pulling force, but rather the bending and warping of spacetime fabric caused by mass and energy. Heavy objects create a "dent" in spacetime that smaller objects roll along.

Real-Time Token Streaming

Stream the response chunk by chunk to improve responsiveness in interactive UIs:

PYTHON

model = ChatGoogleGenerativeAI(model=gemini2)
query = "Explain the theory of relativity in simple terms."

for chunk in model.stream(query):
    print(chunk.text, end="", flush=True)

OUTPUT

Okay, let's break down the theory of relativity...
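
A UI usually needs the complete answer once streaming finishes, for example to store it in chat history. The sketch below prints chunks as they arrive and returns the joined text. The names `stream_to_text` and `fake_stream` are hypothetical, and `fake_stream` only stands in for a real model stream so the example runs offline.

```python
from typing import Iterable, Iterator


def stream_to_text(chunks: Iterable[str]) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for text in chunks:
        print(text, end="", flush=True)
        parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)


def fake_stream() -> Iterator[str]:
    """Stand-in for model.stream(); yields a few text fragments."""
    yield from ["Okay, ", "let's break down ", "the theory of relativity..."]


full_text = stream_to_text(fake_stream())
```

With a real model, the same helper could be fed `(chunk.text for chunk in model.stream(query))`.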

Multimodal Document Analysis

Gemini 3 processes images, audio, video, and PDF files natively.

Image Analysis (URL/Local File)

You can pass image URLs directly to the model inside a content dictionary:

PYTHON

model = ChatGoogleGenerativeAI(model=gemini3)

human_msg = HumanMessage(
    [
        {'type': 'text', 'text': 'Describe the image provided'},
        {'type': 'image',
         'url': 'https://www.shutterstock.com/image-vector/vector-cute-baby-panda-cartoon-600nw-2427356853.jpg'}
    ]
)

response = model.invoke([system_msg, human_msg])
print(response.text)

OUTPUT

The image features an adorable cartoon baby panda sitting upright and smiling, framed by green grass on a white background.

For local images, base64-encode the file and specify its MIME type:

PYTHON

import base64

mime_type = "image/png"

# Read the image with a context manager so the file handle is closed
with open("data/images/panda.png", 'rb') as f:
    image_bytes = f.read()
bytes_base64 = base64.b64encode(image_bytes).decode('utf-8')

human_msg = HumanMessage(
    [
        {'type': 'text', 'text': 'Describe the image provided'},
        {'type': 'image',
         'base64': bytes_base64,
         "mime_type": mime_type}
    ]
)

response = model.invoke([system_msg, human_msg])
print(response.text)

OUTPUT

This is a cartoon illustration of a baby panda cub with round pink ears sitting in green grass.
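
Hard-coding the MIME type is easy to get wrong when the file extension changes. As a sketch, the content block can be built by a small helper that infers the type with the standard library's `mimetypes` module. The helper name `image_block` is hypothetical; the block layout copies the dictionary used above.

```python
import base64
import mimetypes


def image_block(data: bytes, filename: str) -> dict:
    """Build an image content block, inferring the MIME type from the file name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type from {filename!r}")
    return {
        "type": "image",
        "base64": base64.b64encode(data).decode("utf-8"),
        "mime_type": mime_type,
    }


# Placeholder bytes; in practice read them from disk with open(path, "rb").
block = image_block(b"\x89PNG\r\n\x1a\n", "panda.png")
print(block["mime_type"])
```

The returned dictionary can be placed in the `HumanMessage` content list next to the text block.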

PDF Document Analysis

Analyzing multi-page documents (such as financial SEC filings) follows the same base64 encoding strategy:

PYTHON

# Forward slashes work on Windows, Linux, and macOS
with open('data/rag-data/pdfs/apple/apple 10-q q1 2024.pdf', 'rb') as f:
    pdf_bytes = f.read()
pdf_base64 = base64.b64encode(pdf_bytes).decode('utf-8')

mime_type = "application/pdf"

human_msg = HumanMessage(
    [
        {'type': 'text', 'text': 'summarize the key financial highlights from this quarterly report.'},
        {'type': 'file',
         'base64': pdf_base64,
         'mime_type': mime_type}
    ]
)

model = ChatGoogleGenerativeAI(model=gemini2)
response = model.invoke([system_msg, human_msg])
print(response.text)

OUTPUT

Key Financial Highlights:
- Total Net Sales: Decreased by 4% to $90.75 billion.
- Services Sales: Increased by 14% to $23.87 billion.
- Diluted EPS: Increased slightly to $1.53.

Note

For large PDF inputs, setting media_resolution parameters can improve document text clarity during extraction.

Tool Calling and Native Functions

Gemini models support custom tool bindings as well as native Google-provided execution tools.

Binding Custom Tools

Define a tool as a schema or a Python function and bind it to the model:

PYTHON

from scripts import base_tools

model = ChatGoogleGenerativeAI(model=gemini2)
model_with_tools = model.bind_tools([base_tools.web_search, base_tools.get_weather])

response = model_with_tools.invoke("What is the weather in mumbai?")
print(response.tool_calls)

OUTPUT

[{'name': 'get_weather', 'args': {'location': 'Mumbai'}, 'id': '...'}]
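
Binding a tool only lets the model request a call; the application still has to run it. The sketch below dispatches requests of the shape shown above to local functions. The registry, the stub `get_weather`, and the result layout are hypothetical and stand in for the bootcamp's `base_tools` module, whose code is not shown here.

```python
from typing import Any, Callable


def get_weather(location: str) -> str:
    """Stub tool; a real implementation would call a weather service."""
    return f"(stub) weather report for {location}"


# Maps the tool name the model emits to the function that implements it.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Execute each requested tool and pair its result with the call id."""
    results = []
    for call in tool_calls:
        tool = TOOL_REGISTRY.get(call["name"])
        if tool is None:
            raise KeyError(f"Model requested an unknown tool: {call['name']}")
        results.append({"tool_call_id": call["id"], "content": tool(**call["args"])})
    return results


# Same shape as the output shown above, with a made-up id.
calls = [{"name": "get_weather", "args": {"location": "Mumbai"}, "id": "call-1"}]
print(run_tool_calls(calls))
```

In a LangChain conversation, each result would typically be sent back to the model as a `ToolMessage` carrying the matching `tool_call_id`.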

Google Search and Python Code Interpreter

Enable native Google Search grounding or sandboxed Python code execution by binding the built-in tools:

PYTHON

model = ChatGoogleGenerativeAI(model=gemini2)
model_with_tools = model.bind_tools([{'google_search': {}}, {'code_execution': {}}])

query = "When is the next total solar eclipse in the US and what is 3 + 2?"
response = model_with_tools.invoke(query)
print(response.text)

OUTPUT

The answer to 3 + 2 is 5. The next total solar eclipse in the US will be on March 30, 2033, visible only in Alaska.

Warning

Output structures may vary between runs. Validate server-returned schemas before parsing them in production pipelines.
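
One way to validate a response before parsing it is a plain shape check. The sketch below covers tool-call dictionaries only; the function name `validate_tool_call` and the exact rules are assumptions chosen for illustration, not a schema published by Google or LangChain.

```python
def validate_tool_call(call: object) -> dict:
    """Return the call unchanged if it has the expected shape, else raise ValueError."""
    if not isinstance(call, dict):
        raise ValueError("tool call must be a dict")
    if not isinstance(call.get("name"), str) or not call["name"]:
        raise ValueError("tool call needs a non-empty string 'name'")
    if not isinstance(call.get("args"), dict):
        raise ValueError("tool call needs a dict 'args'")
    return call


good = {"name": "get_weather", "args": {"location": "Mumbai"}, "id": "..."}
validate_tool_call(good)  # passes silently
```

A production pipeline might replace these hand-written checks with a schema library, but the principle is the same: reject unexpected structures before acting on them.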

Context Caching

When performing consecutive queries against large files, you can cache target documents in Gemini's server memory to minimize token costs and network latency. Caching requires a minimum threshold of 2,048 tokens.

PYTHON

import time
from google import genai
from google.genai.types import CreateCachedContentConfig, Content, Part

client = genai.Client()

file_paths = [
    "data/rag-data/pdfs/apple/apple 10-q q1 2024.pdf",
    "data/rag-data/pdfs/apple/apple 10-q q2 2024.pdf"
]

uploaded_files = []
for path in file_paths:
    file = client.files.upload(file=path)
    while file.state.name == "PROCESSING":
        time.sleep(2)
        file = client.files.get(name=file.name)
    uploaded_files.append(file)
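
# The polling loop above waits indefinitely and does not check how processing
# ended. Below is a minimal sketch of a bounded wait. The helper name
# wait_until_done is hypothetical, and treating "ACTIVE" as the only success
# state is an assumption about the Files API.

```python
import time
from typing import Callable


def wait_until_done(
    get_state: Callable[[], str],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it stops returning "PROCESSING" or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    state = get_state()
    while state == "PROCESSING":
        if time.monotonic() >= deadline:
            raise TimeoutError("file was still PROCESSING when the timeout expired")
        sleep(poll_seconds)
        state = get_state()
    if state != "ACTIVE":
        raise RuntimeError(f"file processing ended in state {state}")
    return state


# Simulated state sequence standing in for repeated client.files.get() calls.
states = iter(["PROCESSING", "PROCESSING", "ACTIVE"])
print(wait_until_done(lambda: next(states), sleep=lambda s: None))
```

# With the real client, the state could be supplied as
# lambda: client.files.get(name=file.name).state.name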

Create the server-side cache container:

PYTHON

parts = [Part.from_uri(file_uri=f.uri, mime_type=f.mime_type) for f in uploaded_files]
contents = [Content(role='user', parts=parts)]

cache = client.caches.create(
    model=gemini2,
    config=CreateCachedContentConfig(
        display_name='Apple Q1 Q2 2024 reports',
        system_instruction="You are a financial analyst. Use these Apple quarterly reports to answer questions.",
        contents=contents,
        ttl='1800s'  # Cache lives for 30 minutes
    )
)
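
The `ttl` value is a number of seconds followed by `s`. To avoid hand-computing that string, it can be derived from a `timedelta`. This is a small sketch, and the helper name `ttl_string` is hypothetical.

```python
from datetime import timedelta


def ttl_string(duration: timedelta) -> str:
    """Format a positive duration as whole seconds followed by 's'."""
    seconds = int(duration.total_seconds())
    if seconds <= 0:
        raise ValueError("TTL must be at least one second")
    return f"{seconds}s"


print(ttl_string(timedelta(minutes=30)))  # 1800s
```

The result can be passed as `ttl=ttl_string(timedelta(minutes=30))` in the cache configuration above.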

Invoke queries referencing the active cache:

PYTHON

model = ChatGoogleGenerativeAI(
    model=gemini2,
    cached_content=cache.name
)

query = "Compare the revenue growth between Q1 and Q2 2024."
response = model.invoke(query)

print(response.usage_metadata)

OUTPUT

{'input_tokens': 14482, 'output_tokens': 2128, 'total_tokens': 16610, 'input_token_details': {'cache_read': 14465}}

Note the 'cache_read': 14465 entry in usage_metadata: 14,465 of the 14,482 input tokens were served from the server-side cache, so the documents did not have to be sent and processed again for this query.
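
To monitor how well the cache is working across many queries, the share of cached input tokens can be computed from `usage_metadata`. The sketch below uses the dictionary from the output above; the helper name `cache_hit_ratio` is hypothetical.

```python
def cache_hit_ratio(usage: dict) -> float:
    """Fraction of input tokens that were read from the cache."""
    input_tokens = usage.get("input_tokens", 0)
    cached = usage.get("input_token_details", {}).get("cache_read", 0)
    return cached / input_tokens if input_tokens else 0.0


usage = {
    "input_tokens": 14482,
    "output_tokens": 2128,
    "total_tokens": 16610,
    "input_token_details": {"cache_read": 14465},
}
print(f"{cache_hit_ratio(usage):.1%}")  # 99.9%
```

A ratio near zero on a query that should hit the cache suggests the cache has expired or the model was created without `cached_content`.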

Image Generation with Grounding

Generate graphics up to 4K resolution using gemini-3-pro-image-preview. Aspect ratios supported include "16:9", "1:1", "2:3", and "21:9".

PYTHON

from langchain_google_genai import Modality

image_model = ChatGoogleGenerativeAI(model="gemini-3-pro-image-preview")
image_content = "Create a professional infographic of Apple Q2 earnings."

image_response = image_model.invoke(
    image_content, response_modalities=[Modality.TEXT, Modality.IMAGE]
)

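# Extract and save the first generated image block (text blocks may come first)
image_block = next(b for b in image_response.content_blocks if b['type'] == 'image')
with open("data/images/apple_info.png", 'wb') as f:
    f.write(base64.b64decode(image_block['base64']))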
