Fasttrack · 03 · 20 min

Structured Output with Pydantic

What you'll learn
  • Get typed Python objects from LLM responses
  • Define schemas with Pydantic BaseModel
  • Use nested models and Literal enums for strict validation

The problem: raw strings are useless for code

When you ask an LLM "What is the weather in Paris?" you might get back "The temperature is about 22 degrees and it is partly cloudy." That is a fine sentence, but it is hard to use in code. Pulling out the number 22 means parsing free text that may be phrased differently on every call, and until you have done that you cannot compare it to a threshold or store it in a typed database column. Most production LLM applications need typed data, not prose. LangChain's with_structured_output() addresses this by instructing the model to return data that matches your Pydantic schema and by validating the response against that schema.

Diagram showing raw LLM text being converted to a typed Pydantic object
with_structured_output() transforms raw text into typed Python objects

Define a Pydantic model

A Pydantic BaseModel defines both the shape of the data and the descriptions the LLM sees. Each Field() description is included in the schema sent to the model, so it effectively acts as part of the prompt: it tells the model what each field means and how to fill it in. Clear, precise field descriptions tend to improve extraction accuracy noticeably.

structured_output.py
from pydantic import BaseModel, Field

class WeatherReport(BaseModel):
    city: str = Field(description='The name of the city')
    temperature_c: float = Field(description='Temperature in degrees Celsius')
    conditions: str = Field(description='Brief weather description, e.g. sunny, rainy')
    humidity_percent: int = Field(description='Relative humidity as an integer 0-100')
  • BaseModel from Pydantic: all structured output schemas inherit from this
  • Field(description=...): this text is sent to the LLM as part of the schema
  • Python type hints (str, float, int) enforce the data types after extraction
  • The LLM sees the field names and descriptions and fills them in like a form
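To see what the model is actually given, it can help to print the JSON schema that Pydantic generates from the class. The sketch below assumes Pydantic v2, where the method is called model_json_schema(). How LangChain forwards this schema (tool calling, a JSON mode, or something else) depends on the provider, so treat this as an illustration of the content, not of the exact wire format.

```python
from pydantic import BaseModel, Field


class WeatherReport(BaseModel):
    city: str = Field(description='The name of the city')
    temperature_c: float = Field(description='Temperature in degrees Celsius')
    conditions: str = Field(description='Brief weather description, e.g. sunny, rainy')
    humidity_percent: int = Field(description='Relative humidity as an integer 0-100')


# Pydantic v2: build the JSON schema for the class (no LLM call involved)
schema = WeatherReport.model_json_schema()

# Each property carries its JSON type and the description written in Field()
for name, spec in schema['properties'].items():
    print(f"{name}: {spec['type']} | {spec['description']}")
```

Every description typed into Field() appears verbatim in the schema, which is why vague descriptions lead to vague extractions.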

Attach it to the LLM

structured_output.py
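# `model` is assumed to be the LangChain chat model created earlier in the course
structured_model = model.with_structured_output(WeatherReport)

result = structured_model.invoke('What is the weather like in Paris today?')

# result is a WeatherReport instance, fully typed.
# The values in the comments below are illustrative; actual output varies per call.
print(result.city)             # e.g. "Paris"
print(result.temperature_c)    # e.g. 18.5
print(result.conditions)       # e.g. "partly cloudy"
print(result.humidity_percent) # e.g. 65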

with_structured_output() wraps your model so that invoke() returns an instance of your Pydantic class instead of an AIMessage. Every field is validated against its type hint. If the LLM returns a value of the wrong type, Pydantic coerces it when the conversion is lossless (65.0 becomes 65 for an int field) and raises a validation error otherwise (65.5 is rejected for an int field). The result is a regular Python object you can pass to any function, serialize to JSON, or store in a database.
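The validation step can be tried without calling an LLM at all. The sketch below feeds hand-written dictionaries (standing in for a model response, which is an assumption made for illustration) into the same WeatherReport class, assuming Pydantic v2 and its default, non-strict validation mode.

```python
import json

from pydantic import BaseModel, Field, ValidationError


class WeatherReport(BaseModel):
    city: str = Field(description='The name of the city')
    temperature_c: float = Field(description='Temperature in degrees Celsius')
    conditions: str = Field(description='Brief weather description, e.g. sunny, rainy')
    humidity_percent: int = Field(description='Relative humidity as an integer 0-100')


# Hypothetical payload: humidity arrives as the float 65.0
ok = WeatherReport.model_validate({
    'city': 'Paris',
    'temperature_c': 18.5,
    'conditions': 'partly cloudy',
    'humidity_percent': 65.0,
})
# 65.0 has no fractional part, so it is coerced to the int 65
print(type(ok.humidity_percent).__name__, ok.humidity_percent)

# The validated object serializes straight to JSON
as_json = ok.model_dump_json()
print(as_json)

# 65.5 cannot become an int without losing data, so validation fails
rejected_fields = []
try:
    WeatherReport.model_validate({
        'city': 'Paris',
        'temperature_c': 18.5,
        'conditions': 'partly cloudy',
        'humidity_percent': 65.5,
    })
except ValidationError as exc:
    rejected_fields = [err['loc'] for err in exc.errors()]
    print('Rejected fields:', rejected_fields)
```

Note that the type hint alone does not enforce the 0-100 range mentioned in the description; that would need an explicit constraint such as Field(ge=0, le=100).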

Nested schemas and enums

Diagram showing a Team Pydantic model containing a list of Person objects
Pydantic models can nest — a Team contains a list of Person objects
nested_schema.py
from pydantic import BaseModel, Field
from typing import Literal

class Person(BaseModel):
    name: str = Field(description='Full name of the team member')
    role: Literal['engineer', 'designer', 'pm'] = Field(
        description='Job role: engineer, designer, or pm'
    )
    years_experience: int = Field(description='Years of professional experience')

class Team(BaseModel):
    team_name: str = Field(description='Name of the team')
    members: list[Person] = Field(description='List of team members')

structured_model = model.with_structured_output(Team)
result = structured_model.invoke(
    'Extract the team: Alice is a senior engineer with 8 years, '
    'Bob is a UX designer with 3 years.'
)
for member in result.members:
    print(f'{member.name} ({member.role}) — {member.years_experience} years')
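What the nested model and the Literal field buy you can be checked locally, again without an LLM. The sketch below assumes Pydantic v2; the dictionaries and the team name "Platform" are made-up stand-ins for a model response.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Person(BaseModel):
    name: str = Field(description='Full name of the team member')
    role: Literal['engineer', 'designer', 'pm'] = Field(
        description='Job role: engineer, designer, or pm'
    )
    years_experience: int = Field(description='Years of professional experience')


class Team(BaseModel):
    team_name: str = Field(description='Name of the team')
    members: list[Person] = Field(description='List of team members')


# Nested dicts are validated recursively and become Person instances
team = Team.model_validate({
    'team_name': 'Platform',  # hypothetical value for illustration
    'members': [
        {'name': 'Alice', 'role': 'engineer', 'years_experience': 8},
        {'name': 'Bob', 'role': 'designer', 'years_experience': 3},
    ],
})
print([type(m).__name__ for m in team.members])

# The Literal shows up in the JSON schema as a fixed list of allowed values
allowed_roles = Person.model_json_schema()['properties']['role']['enum']
print(allowed_roles)

# A role outside the Literal is rejected instead of silently accepted
bad_fields = []
try:
    Person.model_validate({'name': 'Carol', 'role': 'manager', 'years_experience': 5})
except ValidationError as exc:
    bad_fields = [err['loc'] for err in exc.errors()]
    print('Rejected fields:', bad_fields)
```

Because the allowed values are part of the schema, the model is steered toward them up front, and anything else is caught at validation time.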
Field descriptions are part of the prompt the LLM sees — make them precise and descriptive. A vague description like description="the number" produces unpredictable results. A precise description like description="Temperature in degrees Celsius, as a float" tells the model exactly what to extract and in what unit.
Knowledge Check
What does model.with_structured_output(MyModel).invoke() return?
Recap — what you just learned
  • with_structured_output(MyModel) wraps a LangChain chat model that supports structured output so that it returns typed Pydantic objects
  • Field(description=...) is sent to the LLM — precise descriptions improve extraction accuracy
  • Pydantic models can nest — use list[SubModel] to extract arrays of objects
  • Literal types constrain fields to a fixed set of allowed values
Next up: 04 — Memory: Why LLMs Forget