pydantic-ai
├── LICENSE
├── Makefile
├── README.md
├── docs
│ ├── _worker.js
│ ├── agents.md
│ ├── api
│ │ ├── agent.md
│ │ ├── exceptions.md
│ │ ├── format_as_xml.md
│ │ ├── messages.md
│ │ ├── models
│ │ │ ├── anthropic.md
│ │ │ ├── base.md
│ │ │ ├── function.md
│ │ │ ├── gemini.md
│ │ │ ├── groq.md
│ │ │ ├── mistral.md
│ │ │ ├── ollama.md
│ │ │ ├── openai.md
│ │ │ ├── test.md
│ │ │ └── vertexai.md
│ │ ├── result.md
│ │ ├── settings.md
│ │ ├── tools.md
│ │ └── usage.md
│ ├── contributing.md
│ ├── dependencies.md
│ ├── examples
│ │ ├── bank-support.md
│ │ ├── chat-app.md
│ │ ├── flight-booking.md
│ │ ├── index.md
│ │ ├── pydantic-model.md
│ │ ├── rag.md
│ │ ├── sql-gen.md
│ │ ├── stream-markdown.md
│ │ ├── stream-whales.md
│ │ └── weather-agent.md
│ ├── extra
│ │ └── tweaks.css
│ ├── favicon.ico
│ ├── help.md
│ ├── img
│ │ ├── logfire-monitoring-pydanticai.png
│ │ ├── logfire-weather-agent.png
│ │ ├── logo-white.svg
│ │ ├── pydantic-ai-dark.svg
│ │ └── pydantic-ai-light.svg
│ ├── index.md
│ ├── install.md
│ ├── logfire.md
│ ├── message-history.md
│ ├── models.md
│ ├── multi-agent-applications.md
│ ├── results.md
│ ├── testing-evals.md
│ ├── tools.md
│ └── troubleshooting.md
├── examples
│ ├── README.md
│ ├── pydantic_ai_examples
│ │ ├── __main__.py
│ │ ├── bank_support.py
│ │ ├── chat_app.html
│ │ ├── chat_app.py
│ │ ├── chat_app.ts
│ │ ├── flight_booking.py
│ │ ├── pydantic_model.py
│ │ ├── rag.py
│ │ ├── roulette_wheel.py
│ │ ├── sql_gen.py
│ │ ├── stream_markdown.py
│ │ ├── stream_whales.py
│ │ └── weather_agent.py
│ └── pyproject.toml
├── mkdocs.insiders.yml
├── mkdocs.yml
├── pydantic_ai_slim
│ ├── README.md
│ ├── pydantic_ai
│ │ ├── __init__.py
│ │ ├── _griffe.py
│ │ ├── _pydantic.py
│ │ ├── _result.py
│ │ ├── _system_prompt.py
│ │ ├── _utils.py
│ │ ├── agent.py
│ │ ├── exceptions.py
│ │ ├── format_as_xml.py
│ │ ├── messages.py
│ │ ├── models
│ │ │ ├── __init__.py
│ │ │ ├── anthropic.py
│ │ │ ├── function.py
│ │ │ ├── gemini.py
│ │ │ ├── groq.py
│ │ │ ├── mistral.py
│ │ │ ├── ollama.py
│ │ │ ├── openai.py
│ │ │ ├── test.py
│ │ │ └── vertexai.py
│ │ ├── py.typed
│ │ ├── result.py
│ │ ├── settings.py
│ │ ├── tools.py
│ │ └── usage.py
│ └── pyproject.toml
├── pyproject.toml
├── requirements.txt
├── tests
│ ├── __init__.py
│ ├── conftest.py
│ ├── example_modules
│ │ ├── README.md
│ │ ├── bank_database.py
│ │ ├── fake_database.py
│ │ └── weather_service.py
│ ├── import_examples.py
│ ├── models
│ │ ├── __init__.py
│ │ ├── test_anthropic.py
│ │ ├── test_gemini.py
│ │ ├── test_groq.py
│ │ ├── test_mistral.py
│ │ ├── test_model.py
│ │ ├── test_model_function.py
│ │ ├── test_model_test.py
│ │ ├── test_ollama.py
│ │ ├── test_openai.py
│ │ └── test_vertexai.py
│ ├── test_agent.py
│ ├── test_deps.py
│ ├── test_examples.py
│ ├── test_format_as_xml.py
│ ├── test_live.py
│ ├── test_logfire.py
│ ├── test_streaming.py
│ ├── test_tools.py
│ ├── test_usage_limits.py
│ ├── test_utils.py
│ └── typed_agent.py
├── uprev.py
└── uv.lock
Documentation: ai.pydantic.dev
PydanticAI is a Python agent framework designed to make it less painful to build production-grade applications with Generative AI.
FastAPI revolutionized web development by offering an innovative and ergonomic design, built on the foundation of Pydantic.
Similarly, virtually every agent framework and LLM library in Python uses Pydantic, yet when we began to use LLMs in Pydantic Logfire, we couldn't find anything that gave us the same feeling.
We built PydanticAI with one simple aim: to bring that FastAPI feeling to GenAI app development.
- Built by the Pydantic Team: built by the team behind Pydantic (the validation layer of the OpenAI SDK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more).
- Model-agnostic: supports OpenAI, Anthropic, Gemini, Ollama, Groq, and Mistral, and there is a simple interface to implement support for other models.
- Pydantic Logfire Integration: seamlessly integrates with Pydantic Logfire for real-time debugging, performance monitoring, and behavior tracking of your LLM-powered applications.
- Type-safe: designed to make type checking as useful as possible for you, so it integrates well with static type checkers like mypy and pyright.
- Python-centric Design: leverages Python's familiar control flow and agent composition to build your AI-driven projects, making it easy to apply standard Python best practices you'd use in any other (non-AI) project.
- Structured Responses: harnesses the power of Pydantic to validate and structure model outputs, ensuring responses are consistent across runs.
- Dependency Injection System: offers an optional dependency injection system to provide data and services to your agent's system prompts, tools and result validators. This is useful for testing and eval-driven iterative development.
- Streamed Responses: provides the ability to stream LLM outputs continuously, with immediate validation, ensuring rapid and accurate results.
PydanticAI is in early beta: the API is still subject to change and there's a lot more to do. Feedback is very welcome!
Here's a minimal example of PydanticAI:
from pydantic_ai import Agent
# Define a very simple agent, including the model to use; you can also set the model when running the agent.
agent = Agent(
'gemini-1.5-flash',
# Register a static system prompt using a keyword argument to the agent.
# For more complex dynamically-generated system prompts, see the example below.
system_prompt='Be concise, reply with one sentence.',
)
# Run the agent synchronously, conducting a conversation with the LLM.
# Here the exchange should be very short: PydanticAI will send the system prompt and the user query to the LLM,
# and the model will return a text response. See below for a more complex run.
result = agent.run_sync('Where does "hello world" come from?')
print(result.data)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
(This example is complete, it can be run "as is")
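If you're already inside an async context, you can use the asynchronous run method instead of run_sync. Here's a minimal sketch of the equivalent async call (the agent definition is repeated from the example above):
import asyncio

from pydantic_ai import Agent

agent = Agent(
    'gemini-1.5-flash',
    system_prompt='Be concise, reply with one sentence.',
)

async def main():
    # `run` is the awaitable counterpart of `run_sync` used above.
    result = await agent.run('Where does "hello world" come from?')
    print(result.data)

asyncio.run(main())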
Not very interesting yet, but we can easily add "tools", dynamic system prompts, and structured responses to build more powerful agents.
Here is a concise example using PydanticAI to build a support agent for a bank:
(Better documented example in the docs)
from dataclasses import dataclass
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext
from bank_database import DatabaseConn
# SupportDependencies is used to pass data, connections, and logic into the model that will be needed when running
# system prompt and tool functions. Dependency injection provides a type-safe way to customise the behavior of your agents.
@dataclass
class SupportDependencies:
customer_id: int
db: DatabaseConn
# This pydantic model defines the structure of the result returned by the agent.
class SupportResult(BaseModel):
support_advice: str = Field(description='Advice returned to the customer')
block_card: bool = Field(description="Whether to block the customer's card")
risk: int = Field(description='Risk level of query', ge=0, le=10)
# This agent will act as first-tier support in a bank.
# Agents are generic in the type of dependencies they accept and the type of result they return.
# In this case, the support agent has type `Agent[SupportDependencies, SupportResult]`.
support_agent = Agent(
'openai:gpt-4o',
deps_type=SupportDependencies,
# The response from the agent will be guaranteed to be a SupportResult;
# if validation fails, the agent is prompted to try again.
result_type=SupportResult,
system_prompt=(
'You are a support agent in our bank, give the '
'customer support and judge the risk level of their query.'
),
)
# Dynamic system prompts can make use of dependency injection.
# Dependencies are carried via the `RunContext` argument, which is parameterized with the `deps_type` from above.
# If the type annotation here is wrong, static type checkers will catch it.
@support_agent.system_prompt
async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str:
customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id)
return f"The customer's name is {customer_name!r}"
# The `tool` decorator lets you register functions which the LLM may call while responding to a user.
# Again, dependencies are carried via `RunContext`; any other arguments become the tool schema passed to the LLM.
# Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry.
@support_agent.tool
async def customer_balance(
ctx: RunContext[SupportDependencies], include_pending: bool
) -> float:
"""Returns the customer's current account balance."""
# The docstring of a tool is also passed to the LLM as the description of the tool.
# Parameter descriptions are extracted from the docstring and added to the parameter schema sent to the LLM.
balance = await ctx.deps.db.customer_balance(
id=ctx.deps.customer_id,
include_pending=include_pending,
)
return balance
... # In a real use case, you'd add more tools and a longer system prompt
async def main():
deps = SupportDependencies(customer_id=123, db=DatabaseConn())
# Run the agent asynchronously, conducting a conversation with the LLM until a final response is reached.
# Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve a result.
result = await support_agent.run('What is my balance?', deps=deps)
# The result will be validated with Pydantic to guarantee it is a `SupportResult`; since the agent is generic,
# it'll also be typed as a `SupportResult` to aid with static type checking.
print(result.data)
"""
support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1
"""
result = await support_agent.run('I just lost my card!', deps=deps)
print(result.data)
"""
support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8
"""
To try PydanticAI yourself, follow the instructions in the examples.
Read the docs to learn more about building applications with PydanticAI.
Read the API Reference to understand PydanticAI's interface.
PydanticAI core logic with minimal required dependencies.
For more information on how to use this package see ai.pydantic.dev/install.
This directory is added to sys.path in tests/test_examples.py::test_docs_examples to augment some of the examples.
Examples of how to use PydanticAI and what it can do.
For full documentation of these examples and how to run them, see ai.pydantic.dev/examples/.
PydanticAI is available on PyPI as pydantic-ai, so installation is as simple as:
pip/uv-add pydantic-ai
(Requires Python 3.9+)
This installs the pydantic_ai package, core dependencies, and libraries required to use all the models included in PydanticAI. If you want to use a specific model, you can install the "slim" version of PydanticAI.
PydanticAI has an excellent (but completely optional) integration with Pydantic Logfire to help you view and understand agent runs.
To use Logfire with PydanticAI, install pydantic-ai or pydantic-ai-slim with the logfire optional group:
pip/uv-add 'pydantic-ai[logfire]'
From there, follow the Logfire setup docs to configure Logfire.
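As a rough sketch of what that setup typically looks like in code (assuming you've already created a Logfire project and authenticated locally; the Logfire setup docs are the authoritative reference):
import logfire

from pydantic_ai import Agent

# Configure Logfire before running agents; credentials come from your local Logfire setup.
logfire.configure()

agent = Agent('openai:gpt-4o', system_prompt='Be concise, reply with one sentence.')
result = agent.run_sync('Hello!')
print(result.data)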
We distribute the pydantic_ai_examples directory as a separate PyPI package (pydantic-ai-examples) to make examples extremely easy to customize and run.
To install examples, use the examples optional group:
pip/uv-add 'pydantic-ai[examples]'
To run the examples, follow instructions in the examples docs.
If you know which model you're going to use and want to avoid installing superfluous packages, you can use the pydantic-ai-slim package.
For example, if you're using just [OpenAIModel][pydantic_ai.models.openai.OpenAIModel], you would run:
pip/uv-add 'pydantic-ai-slim[openai]'
See the models documentation for information on which optional dependencies are required for each model.
You can also install dependencies for multiple models and use cases, for example:
pip/uv-add 'pydantic-ai-slim[openai,vertexai,logfire]'
Function tools provide a mechanism for models to retrieve extra information to help them generate a response.
They're useful when it is impractical or impossible to put all the context an agent might need into the system prompt, or when you want to make agents' behavior more deterministic or reliable by deferring some of the logic required to generate a response to another (not necessarily AI-powered) tool.
!!! info "Function tools vs. RAG" Function tools are basically the "R" of RAG (Retrieval-Augmented Generation) — they augment what the model can do by letting it request extra information.
The main semantic difference between PydanticAI Tools and RAG is RAG is synonymous with vector search, while PydanticAI tools are more general-purpose. (Note: we may add support for vector search functionality in the future, particularly an API for generating embeddings. See [#58](https://github.com/pydantic/pydantic-ai/issues/58))
There are a number of ways to register tools with an agent:
- via the [@agent.tool][pydantic_ai.Agent.tool] decorator — for tools that need access to the agent [context][pydantic_ai.tools.RunContext]
- via the [@agent.tool_plain][pydantic_ai.Agent.tool_plain] decorator — for tools that do not need access to the agent [context][pydantic_ai.tools.RunContext]
- via the [tools][pydantic_ai.Agent.init] keyword argument to Agent, which can take either plain functions or instances of [Tool][pydantic_ai.tools.Tool]
@agent.tool is considered the default decorator since in the majority of cases tools will need access to the agent context.
Here's an example using both:
import random
from pydantic_ai import Agent, RunContext
agent = Agent(
'gemini-1.5-flash', # (1)!
deps_type=str, # (2)!
system_prompt=(
"You're a dice game, you should roll the die and see if the number "
"you get back matches the user's guess. If so, tell them they're a winner. "
"Use the player's name in the response."
),
)
@agent.tool_plain # (3)!
def roll_die() -> str:
"""Roll a six-sided die and return the result."""
return str(random.randint(1, 6))
@agent.tool # (4)!
def get_player_name(ctx: RunContext[str]) -> str:
"""Get the player's name."""
return ctx.deps
dice_result = agent.run_sync('My guess is 4', deps='Anne') # (5)!
print(dice_result.data)
#> Congratulations Anne, you guessed correctly! You're a winner!
- This is a pretty simple task, so we can use the fast and cheap Gemini flash model.
- We pass the user's name as the dependency; to keep things simple we use just the name as a string.
- This tool doesn't need any context; it just returns a random number. You could probably use a dynamic system prompt in this case.
- This tool needs the player's name, so it uses RunContext to access dependencies, which are just the player's name in this case.
- Run the agent, passing the player's name as the dependency.
(This example is complete, it can be run "as is")
Let's print the messages from that game to see what happened:
from dice_game import dice_result
print(dice_result.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.",
part_kind='system-prompt',
),
UserPromptPart(
content='My guess is 4',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
),
ModelResponse(
parts=[
ToolCallPart(
tool_name='roll_die',
args=ArgsDict(args_dict={}),
tool_call_id=None,
part_kind='tool-call',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
ToolReturnPart(
tool_name='roll_die',
content='4',
tool_call_id=None,
timestamp=datetime.datetime(...),
part_kind='tool-return',
)
],
kind='request',
),
ModelResponse(
parts=[
ToolCallPart(
tool_name='get_player_name',
args=ArgsDict(args_dict={}),
tool_call_id=None,
part_kind='tool-call',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
ToolReturnPart(
tool_name='get_player_name',
content='Anne',
tool_call_id=None,
timestamp=datetime.datetime(...),
part_kind='tool-return',
)
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
content="Congratulations Anne, you guessed correctly! You're a winner!",
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
We can represent this with a diagram:
sequenceDiagram
participant Agent
participant LLM
Note over Agent: Send prompts
Agent ->> LLM: System: "You're a dice game..."<br>User: "My guess is 4"
activate LLM
Note over LLM: LLM decides to use<br>a tool
LLM ->> Agent: Call tool<br>roll_die()
deactivate LLM
activate Agent
Note over Agent: Rolls a six-sided die
Agent -->> LLM: ToolReturn<br>"4"
deactivate Agent
activate LLM
Note over LLM: LLM decides to use<br>another tool
LLM ->> Agent: Call tool<br>get_player_name()
deactivate LLM
activate Agent
Note over Agent: Retrieves player name
Agent -->> LLM: ToolReturn<br>"Anne"
deactivate Agent
activate LLM
Note over LLM: LLM constructs final response
LLM ->> Agent: ModelResponse<br>"Congratulations Anne, ..."
deactivate LLM
Note over Agent: Game session complete
As well as using the decorators, we can register tools via the tools argument to the [Agent constructor][pydantic_ai.Agent.init]. This is useful when you want to re-use tools, and can also give more fine-grained control over the tools.
import random
from pydantic_ai import Agent, RunContext, Tool
def roll_die() -> str:
"""Roll a six-sided die and return the result."""
return str(random.randint(1, 6))
def get_player_name(ctx: RunContext[str]) -> str:
"""Get the player's name."""
return ctx.deps
agent_a = Agent(
'gemini-1.5-flash',
deps_type=str,
tools=[roll_die, get_player_name], # (1)!
)
agent_b = Agent(
'gemini-1.5-flash',
deps_type=str,
tools=[ # (2)!
Tool(roll_die, takes_ctx=False),
Tool(get_player_name, takes_ctx=True),
],
)
dice_result = agent_b.run_sync('My guess is 4', deps='Anne')
print(dice_result.data)
#> Congratulations Anne, you guessed correctly! You're a winner!
- The simplest way to register tools via the Agent constructor is to pass a list of functions; the function signature is inspected to determine if the tool takes [RunContext][pydantic_ai.tools.RunContext].
- agent_a and agent_b are identical — but we can use [Tool][pydantic_ai.tools.Tool] to reuse tool definitions and give more fine-grained control over how tools are defined, e.g. setting their name or description, or using a custom prepare method.
(This example is complete, it can be run "as is")
As the name suggests, function tools use the model's "tools" or "functions" API to let the model know what is available to call. Tools or functions are also used to define the schema(s) for structured responses, thus a model might have access to many tools, some of which call function tools while others end the run and return a result.
Function parameters are extracted from the function signature, and all parameters except RunContext are used to build the schema for that tool call.
Even better, PydanticAI extracts the docstring from functions and (thanks to griffe) extracts parameter descriptions from the docstring and adds them to the schema.
Griffe supports extracting parameter descriptions from google, numpy and sphinx style docstrings, and PydanticAI will infer the format to use based on the docstring. We plan to add support in the future to explicitly set the style to use, and warn/error if not all parameters are documented; see #59.
To demonstrate a tool's schema, here we use [FunctionModel][pydantic_ai.models.function.FunctionModel] to print the schema a model would receive:
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse
from pydantic_ai.models.function import AgentInfo, FunctionModel
agent = Agent()
@agent.tool_plain
def foobar(a: int, b: str, c: dict[str, list[float]]) -> str:
"""Get me foobar.
Args:
a: apple pie
b: banana cake
c: carrot smoothie
"""
return f'{a} {b} {c}'
def print_schema(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
tool = info.function_tools[0]
print(tool.description)
#> Get me foobar.
print(tool.parameters_json_schema)
"""
{
'properties': {
'a': {'description': 'apple pie', 'title': 'A', 'type': 'integer'},
'b': {'description': 'banana cake', 'title': 'B', 'type': 'string'},
'c': {
'additionalProperties': {'items': {'type': 'number'}, 'type': 'array'},
'description': 'carrot smoothie',
'title': 'C',
'type': 'object',
},
},
'required': ['a', 'b', 'c'],
'type': 'object',
'additionalProperties': False,
}
"""
return ModelResponse.from_text(content='foobar')
agent.run_sync('hello', model=FunctionModel(print_schema))
(This example is complete, it can be run "as is")
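Since the docstring format is inferred, the same tool could equally be documented with a sphinx-style (or numpy-style) docstring. A minimal sketch, which should produce an equivalent parameter schema:
from pydantic_ai import Agent

agent = Agent()

@agent.tool_plain
def foobar_sphinx(a: int, b: str) -> str:
    """Get me foobar.

    :param a: apple pie
    :param b: banana cake
    """
    return f'{a} {b}'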
The return type of a tool can be anything which Pydantic can serialize to JSON. Some models (e.g. Gemini) support semi-structured return values, while some expect text (OpenAI) but seem to be just as good at extracting meaning from the data. If a Python object is returned and the model expects a string, the value will be serialized to JSON.
If a tool has a single parameter that can be represented as an object in JSON schema (e.g. dataclass, TypedDict, pydantic model), the schema for the tool is simplified to be just that object.
Here's an example: we use [TestModel.agent_model_function_tools][pydantic_ai.models.test.TestModel.agent_model_function_tools] to inspect the tool schema that would be passed to the model.
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel
agent = Agent()
class Foobar(BaseModel):
"""This is a Foobar"""
x: int
y: str
z: float = 3.14
@agent.tool_plain
def foobar(f: Foobar) -> str:
return str(f)
test_model = TestModel()
result = agent.run_sync('hello', model=test_model)
print(result.data)
#> {"foobar":"x=0 y='a' z=3.14"}
print(test_model.agent_model_function_tools)
"""
[
ToolDefinition(
name='foobar',
description='This is a Foobar',
parameters_json_schema={
'properties': {
'x': {'title': 'X', 'type': 'integer'},
'y': {'title': 'Y', 'type': 'string'},
'z': {'default': 3.14, 'title': 'Z', 'type': 'number'},
},
'required': ['x', 'y'],
'title': 'Foobar',
'type': 'object',
},
outer_typed_dict_key=None,
)
]
"""
(This example is complete, it can be run "as is")
Tools can optionally be defined with another function: prepare, which is called at each step of a run to customize the definition of the tool passed to the model, or omit the tool completely from that step.
A prepare method can be registered via the prepare kwarg to any of the tool registration mechanisms:
- [@agent.tool][pydantic_ai.Agent.tool] decorator
- [@agent.tool_plain][pydantic_ai.Agent.tool_plain] decorator
- [Tool][pydantic_ai.tools.Tool] dataclass
The prepare method should be of type [ToolPrepareFunc][pydantic_ai.tools.ToolPrepareFunc], a function which takes [RunContext][pydantic_ai.tools.RunContext] and a pre-built [ToolDefinition][pydantic_ai.tools.ToolDefinition], and should either return that ToolDefinition with or without modifying it, return a new ToolDefinition, or return None to indicate this tool should not be registered for that step.
Here's a simple prepare method that only includes the tool if the value of the dependency is 42.
As with the previous example, we use [TestModel][pydantic_ai.models.test.TestModel] to demonstrate the behavior without calling a real model.
from typing import Union
from pydantic_ai import Agent, RunContext
from pydantic_ai.tools import ToolDefinition
agent = Agent('test')
async def only_if_42(
ctx: RunContext[int], tool_def: ToolDefinition
) -> Union[ToolDefinition, None]:
if ctx.deps == 42:
return tool_def
@agent.tool(prepare=only_if_42)
def hitchhiker(ctx: RunContext[int], answer: str) -> str:
return f'{ctx.deps} {answer}'
result = agent.run_sync('testing...', deps=41)
print(result.data)
#> success (no tool calls)
result = agent.run_sync('testing...', deps=42)
print(result.data)
#> {"hitchhiker":"42 a"}
(This example is complete, it can be run "as is")
Here's a more complex example where we change the description of the name parameter based on the value of deps.
For the sake of variation, we create this tool using the [Tool][pydantic_ai.tools.Tool] dataclass.
from __future__ import annotations
from typing import Literal
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.test import TestModel
from pydantic_ai.tools import Tool, ToolDefinition
def greet(name: str) -> str:
return f'hello {name}'
async def prepare_greet(
ctx: RunContext[Literal['human', 'machine']], tool_def: ToolDefinition
) -> ToolDefinition | None:
d = f'Name of the {ctx.deps} to greet.'
tool_def.parameters_json_schema['properties']['name']['description'] = d
return tool_def
greet_tool = Tool(greet, prepare=prepare_greet)
test_model = TestModel()
agent = Agent(test_model, tools=[greet_tool], deps_type=Literal['human', 'machine'])
result = agent.run_sync('testing...', deps='human')
print(result.data)
#> {"greet":"hello a"}
print(test_model.agent_model_function_tools)
"""
[
ToolDefinition(
name='greet',
description='',
parameters_json_schema={
'properties': {
'name': {
'title': 'Name',
'type': 'string',
'description': 'Name of the human to greet.',
}
},
'required': ['name'],
'type': 'object',
'additionalProperties': False,
},
outer_typed_dict_key=None,
)
]
"""
(This example is complete, it can be run "as is")
PydanticAI uses a dependency injection system to provide data and services to your agent's system prompts, tools and result validators.
Matching PydanticAI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic"; this should make dependencies type-safe, understandable, easier to test and ultimately easier to deploy in production.
Dependencies can be any Python type. While in simple cases you might be able to pass a single object as a dependency (e.g. an HTTP connection), [dataclasses][] are generally a convenient container when your dependencies include multiple objects.
Here's an example of defining an agent that requires dependencies.
(Note: dependencies aren't actually used in this example, see Accessing Dependencies below)
from dataclasses import dataclass
import httpx
from pydantic_ai import Agent
@dataclass
class MyDeps: # (1)!
api_key: str
http_client: httpx.AsyncClient
agent = Agent(
'openai:gpt-4o',
deps_type=MyDeps, # (2)!
)
async def main():
async with httpx.AsyncClient() as client:
deps = MyDeps('foobar', client)
result = await agent.run(
'Tell me a joke.',
deps=deps, # (3)!
)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
- Define a dataclass to hold dependencies.
- Pass the dataclass type to the deps_type argument of the [Agent constructor][pydantic_ai.Agent.init]. Note: we're passing the type here, NOT an instance; this parameter is not actually used at runtime, it's here so we can get full type checking of the agent.
- When running the agent, pass an instance of the dataclass to the deps parameter.
(This example is complete, it can be run "as is")
Dependencies are accessed through the [RunContext][pydantic_ai.tools.RunContext] type; this should be the first parameter of system prompt functions etc.
from dataclasses import dataclass
import httpx
from pydantic_ai import Agent, RunContext
@dataclass
class MyDeps:
api_key: str
http_client: httpx.AsyncClient
agent = Agent(
'openai:gpt-4o',
deps_type=MyDeps,
)
@agent.system_prompt # (1)!
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)!
response = await ctx.deps.http_client.get( # (3)!
'https://example.com',
headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, # (4)!
)
response.raise_for_status()
return f'Prompt: {response.text}'
async def main():
async with httpx.AsyncClient() as client:
deps = MyDeps('foobar', client)
result = await agent.run('Tell me a joke.', deps=deps)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
- [RunContext][pydantic_ai.tools.RunContext] may optionally be passed to a [system_prompt][pydantic_ai.Agent.system_prompt] function as the only argument.
- [RunContext][pydantic_ai.tools.RunContext] is parameterized with the type of the dependencies; if this type is incorrect, static type checkers will raise an error.
- Access dependencies through the [.deps][pydantic_ai.tools.RunContext.deps] attribute.
- Access dependencies through the [.deps][pydantic_ai.tools.RunContext.deps] attribute.
(This example is complete, it can be run "as is")
System prompt functions, function tools and result validators are all run in the async context of an agent run.
If these functions are not coroutines (i.e. defined with plain def rather than async def) they are called with [run_in_executor][asyncio.loop.run_in_executor] in a thread pool, so it's marginally preferable to use async methods where dependencies perform IO, although synchronous dependencies should work fine too.
!!! note "run
vs. run_sync
and Asynchronous vs. Synchronous dependencies"
Whether you use synchronous or asynchronous dependencies, is completely independent of whether you use run
or run_sync
— run_sync
is just a wrapper around run
and agents are always run in an async context.
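As a rough sketch of that relationship (run_sync blocks until the underlying coroutine completes; the asyncio.run call below illustrates the idea rather than the exact implementation):
import asyncio

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# Blocking call: run_sync wraps the async run for you.
result = agent.run_sync('Tell me a joke.')

# Roughly equivalent, driving the async run yourself.
result = asyncio.run(agent.run('Tell me a joke.'))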
Here's the same example as above, but with a synchronous dependency:
from dataclasses import dataclass
import httpx
from pydantic_ai import Agent, RunContext
@dataclass
class MyDeps:
api_key: str
http_client: httpx.Client # (1)!
agent = Agent(
'openai:gpt-4o',
deps_type=MyDeps,
)
@agent.system_prompt
def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)!
response = ctx.deps.http_client.get(
'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}
)
response.raise_for_status()
return f'Prompt: {response.text}'
async def main():
deps = MyDeps('foobar', httpx.Client())
result = await agent.run(
'Tell me a joke.',
deps=deps,
)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
- Here we use a synchronous httpx.Client instead of an asynchronous httpx.AsyncClient.
- To match the synchronous dependency, the system prompt function is now a plain function, not a coroutine.
(This example is complete, it can be run "as is")
As well as system prompts, dependencies can be used in tools and result validators.
from dataclasses import dataclass
import httpx
from pydantic_ai import Agent, ModelRetry, RunContext
@dataclass
class MyDeps:
api_key: str
http_client: httpx.AsyncClient
agent = Agent(
'openai:gpt-4o',
deps_type=MyDeps,
)
@agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
response = await ctx.deps.http_client.get('https://example.com')
response.raise_for_status()
return f'Prompt: {response.text}'
@agent.tool # (1)!
async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str:
response = await ctx.deps.http_client.get(
'https://example.com#jokes',
params={'subject': subject},
headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
)
response.raise_for_status()
return response.text
@agent.result_validator # (2)!
async def validate_result(ctx: RunContext[MyDeps], final_response: str) -> str:
response = await ctx.deps.http_client.post(
'https://example.com#validate',
headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
params={'query': final_response},
)
if response.status_code == 400:
raise ModelRetry(f'invalid response: {response.text}')
response.raise_for_status()
return final_response
async def main():
async with httpx.AsyncClient() as client:
deps = MyDeps('foobar', client)
result = await agent.run('Tell me a joke.', deps=deps)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
- To pass RunContext to a tool, use the [tool][pydantic_ai.Agent.tool] decorator.
- RunContext may optionally be passed to a [result_validator][pydantic_ai.Agent.result_validator] function as the first argument.
(This example is complete, it can be run "as is")
When testing agents, it's useful to be able to customise dependencies.
While this can sometimes be done by calling the agent directly within unit tests, we can also override dependencies while calling application code which in turn calls the agent.
This is done via the [override][pydantic_ai.Agent.override] method on the agent.
from dataclasses import dataclass
import httpx
from pydantic_ai import Agent, RunContext
@dataclass
class MyDeps:
api_key: str
http_client: httpx.AsyncClient
async def system_prompt_factory(self) -> str: # (1)!
response = await self.http_client.get('https://example.com')
response.raise_for_status()
return f'Prompt: {response.text}'
joke_agent = Agent('openai:gpt-4o', deps_type=MyDeps)
@joke_agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
return await ctx.deps.system_prompt_factory() # (2)!
async def application_code(prompt: str) -> str: # (3)!
...
...
# now deep within application code we call our agent
async with httpx.AsyncClient() as client:
app_deps = MyDeps('foobar', client)
result = await joke_agent.run(prompt, deps=app_deps) # (4)!
return result.data
- Define a method on the dependency to make the system prompt easier to customise.
- Call the system prompt factory from within the system prompt function.
- Application code that calls the agent; in a real application this might be an API endpoint.
- Call the agent from within the application code; in a real application this call might be deep within a call stack. Note that app_deps here will NOT be used when deps are overridden.
(This example is complete, it can be run "as is")
from joke_app import MyDeps, application_code, joke_agent
class TestMyDeps(MyDeps): # (1)!
async def system_prompt_factory(self) -> str:
return 'test prompt'
async def test_application_code():
test_deps = TestMyDeps('test_key', None) # (2)!
with joke_agent.override(deps=test_deps): # (3)!
joke = await application_code('Tell me a joke.') # (4)!
assert joke.startswith('Did you hear about the toothpaste scandal?')
- Define a subclass of MyDeps in tests to customise the system prompt factory.
- Create an instance of the test dependency; we don't need to pass an http_client here as it's not used.
- Override the dependencies of the agent for the duration of the with block; test_deps will be used when the agent is run.
- Now we can safely call our application code; the agent will use the overridden dependencies.
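A note on running this test: test_application_code is a coroutine, so it needs an async-aware test runner. A minimal sketch assuming pytest with the anyio plugin (pytest-asyncio would look similar):
import pytest

@pytest.mark.anyio  # assumes the anyio pytest plugin; adjust for your async test setup
async def test_application_code():
    ...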
The following examples demonstrate how to use dependencies in PydanticAI:
There are roughly four levels of complexity when building applications with PydanticAI:
- Single agent workflows — what most of the pydantic_ai documentation covers
- Agent delegation — agents using another agent via tools
- Programmatic agent hand-off — one agent runs, then application code calls another agent
- Graph based control flow — for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents
Of course, you can combine multiple strategies in a single application.
"Agent delegation" refers to the scenario where an agent delegates work to another agent, then takes back control when the delegate agent (the agent called from within a tool) finishes.
Since agents are stateless and designed to be global, you do not need to include the agent itself in agent dependencies.
You'll generally want to pass [ctx.usage][pydantic_ai.RunContext.usage] to the [usage][pydantic_ai.Agent.run] keyword argument of the delegate agent run so usage within that run counts towards the total usage of the parent agent run.
!!! note "Multiple models"
    Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final [result.usage()][pydantic_ai.result.RunResult.usage] of the run will not be possible, but you can still use [UsageLimits][pydantic_ai.usage.UsageLimits] to avoid unexpected costs.
from pydantic_ai import Agent, RunContext
from pydantic_ai.usage import UsageLimits
joke_selection_agent = Agent( # (1)!
'openai:gpt-4o',
system_prompt=(
'Use the `joke_factory` to generate some jokes, then choose the best. '
'You must return just a single joke.'
),
)
joke_generation_agent = Agent('gemini-1.5-flash', result_type=list[str]) # (2)!
@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[None], count: int) -> list[str]:
r = await joke_generation_agent.run( # (3)!
f'Please generate {count} jokes.',
usage=ctx.usage, # (4)!
)
return r.data # (5)!
result = joke_selection_agent.run_sync(
'Tell me a joke.',
usage_limits=UsageLimits(request_limit=5, total_tokens_limit=300),
)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
print(result.usage())
"""
Usage(
requests=3, request_tokens=204, response_tokens=24, total_tokens=228, details=None
)
"""
- The "parent" or controlling agent.
- The "delegate" agent, which is called from within a tool of the parent agent.
- Call the delegate agent from within a tool of the parent agent.
- Pass the usage from the parent agent to the delegate agent so the final [result.usage()][pydantic_ai.result.RunResult.usage] includes the usage from both agents.
- Since the function returns #!python list[str], and the result_type of joke_generation_agent is also #!python list[str], we can simply return #!python r.data from the tool.
(This example is complete, it can be run "as is")
The control flow for this example is pretty simple and can be summarised as follows:
graph TD
START --> joke_selection_agent
joke_selection_agent --> joke_factory["joke_factory (tool)"]
joke_factory --> joke_generation_agent
joke_generation_agent --> joke_factory
joke_factory --> joke_selection_agent
joke_selection_agent --> END
Generally the delegate agent needs to either have the same dependencies as the calling agent, or dependencies which are a subset of the calling agent's dependencies.
!!! info "Initializing dependencies" We say "generally" above since there's nothing to stop you initializing dependencies within a tool call and therefore using interdependencies in a delegate agent that are not available on the parent, this should often be avoided since it can be significantly slower than reusing connections etc. from the parent agent.
from dataclasses import dataclass
import httpx
from pydantic_ai import Agent, RunContext
@dataclass
class ClientAndKey: # (1)!
http_client: httpx.AsyncClient
api_key: str
joke_selection_agent = Agent(
'openai:gpt-4o',
deps_type=ClientAndKey, # (2)!
system_prompt=(
'Use the `joke_factory` tool to generate some jokes on the given subject, '
'then choose the best. You must return just a single joke.'
),
)
joke_generation_agent = Agent(
'gemini-1.5-flash',
deps_type=ClientAndKey, # (4)!
result_type=list[str],
system_prompt=(
'Use the "get_jokes" tool to get some jokes on the given subject, '
'then extract each joke into a list.'
),
)
@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]:
r = await joke_generation_agent.run(
f'Please generate {count} jokes.',
deps=ctx.deps, # (3)!
usage=ctx.usage,
)
return r.data
@joke_generation_agent.tool # (5)!
async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str:
response = await ctx.deps.http_client.get(
'https://example.com',
params={'count': count},
headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
)
response.raise_for_status()
return response.text
async def main():
async with httpx.AsyncClient() as client:
deps = ClientAndKey(client, 'foobar')
result = await joke_selection_agent.run('Tell me a joke.', deps=deps)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
print(result.usage()) # (6)!
"""
Usage(
requests=4,
request_tokens=310,
response_tokens=32,
total_tokens=342,
details=None,
)
"""
- Define a dataclass to hold the client and API key dependencies.
- Set the deps_type of the calling agent — joke_selection_agent here.
- Pass the dependencies to the delegate agent's run method within the tool call.
- Also set the deps_type of the delegate agent — joke_generation_agent here.
- Define a tool on the delegate agent that uses the dependencies to make an HTTP request.
- Usage now includes 4 requests — 2 from the calling agent and 2 from the delegate agent.
(This example is complete, it can be run "as is")
This example shows how even a fairly simple agent delegation can lead to a complex control flow:
graph TD
START --> joke_selection_agent
joke_selection_agent --> joke_factory["joke_factory (tool)"]
joke_factory --> joke_generation_agent
joke_generation_agent --> get_jokes["get_jokes (tool)"]
get_jokes --> http_request["HTTP request"]
http_request --> get_jokes
get_jokes --> joke_generation_agent
joke_generation_agent --> joke_factory
joke_factory --> joke_selection_agent
joke_selection_agent --> END
"Programmatic agent hand-off" refers to the scenario where multiple agents are called in succession, with application code and/or a human in the loop responsible for deciding which agent to call next.
Here agents don't need to use the same deps.
Here we show two agents used in succession, the first to find a flight and the second to extract the user's seat preference.
from typing import Literal, Union
from pydantic import BaseModel, Field
from rich.prompt import Prompt
from pydantic_ai import Agent, RunContext
from pydantic_ai.messages import ModelMessage
from pydantic_ai.usage import Usage, UsageLimits
class FlightDetails(BaseModel):
flight_number: str
class Failed(BaseModel):
"""Unable to find a satisfactory choice."""
flight_search_agent = Agent[None, Union[FlightDetails, Failed]]( # (1)!
'openai:gpt-4o',
result_type=Union[FlightDetails, Failed], # type: ignore
system_prompt=(
'Use the "flight_search" tool to find a flight '
'from the given origin to the given destination.'
),
)
@flight_search_agent.tool # (2)!
async def flight_search(
ctx: RunContext[None], origin: str, destination: str
) -> Union[FlightDetails, None]:
# in reality, this would call a flight search API or
# use a browser to scrape a flight search website
return FlightDetails(flight_number='AK456')
usage_limits = UsageLimits(request_limit=15) # (3)!
async def find_flight(usage: Usage) -> Union[FlightDetails, None]: # (4)!
message_history: Union[list[ModelMessage], None] = None
for _ in range(3):
prompt = Prompt.ask(
'Where would you like to fly from and to?',
)
result = await flight_search_agent.run(
prompt,
message_history=message_history,
usage=usage,
usage_limits=usage_limits,
)
if isinstance(result.data, FlightDetails):
return result.data
else:
message_history = result.all_messages(
result_tool_return_content='Please try again.'
)
class SeatPreference(BaseModel):
row: int = Field(ge=1, le=30)
seat: Literal['A', 'B', 'C', 'D', 'E', 'F']
# This agent is responsible for extracting the user's seat selection
seat_preference_agent = Agent[None, Union[SeatPreference, Failed]]( # (5)!
'openai:gpt-4o',
result_type=Union[SeatPreference, Failed], # type: ignore
system_prompt=(
"Extract the user's seat preference. "
'Seats A and F are window seats. '
'Row 1 is the front row and has extra leg room. '
'Rows 14, and 20 also have extra leg room. '
),
)
async def find_seat(usage: Usage) -> SeatPreference: # (6)!
message_history: Union[list[ModelMessage], None] = None
while True:
answer = Prompt.ask('What seat would you like?')
result = await seat_preference_agent.run(
answer,
message_history=message_history,
usage=usage,
usage_limits=usage_limits,
)
if isinstance(result.data, SeatPreference):
return result.data
else:
print('Could not understand seat preference. Please try again.')
message_history = result.all_messages()
async def main(): # (7)!
usage: Usage = Usage()
opt_flight_details = await find_flight(usage)
if opt_flight_details is not None:
print(f'Flight found: {opt_flight_details.flight_number}')
#> Flight found: AK456
seat_preference = await find_seat(usage)
print(f'Seat preference: {seat_preference}')
#> Seat preference: row=1 seat='A'
- Define the first agent, which finds a flight. We use an explicit type annotation until PEP-747 lands, see structured results. We use a union as the result type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool.
- Define a tool on the agent to find a flight. In this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary.
- Define usage limits for the entire app.
- Define a function to find a flight, which asks the user for their preferences and then calls the agent to find a flight.
- As with flight_search_agent above, we use an explicit type annotation to define the agent.
- Define a function to find the user's seat preference, which asks the user for their seat preference and then calls the agent to extract the seat preference.
- Now that we've put our logic for running each agent into separate functions, our main app becomes very simple.
(This example is complete, it can be run "as is")
The control flow for this example can be summarised as follows:
graph TB
START --> ask_user_flight["ask user for flight"]
subgraph find_flight
flight_search_agent --> ask_user_flight
ask_user_flight --> flight_search_agent
end
flight_search_agent --> ask_user_seat["ask user for seat"]
flight_search_agent --> END
subgraph find_seat
seat_preference_agent --> ask_user_seat
ask_user_seat --> seat_preference_agent
end
seat_preference_agent --> END
!!! example "Work in progress" This is a work in progress and not yet documented, see #528 and #539
The following examples demonstrate how to use multiple agents in PydanticAI:
If you need help getting started with PydanticAI or with advanced usage, the following sources may be useful.
Join the #pydantic-ai channel in the Pydantic Slack to ask questions, get help, and chat about PydanticAI. There are also channels for Pydantic, Logfire, and FastUI.
If you're on a Logfire Pro plan, you can also get a dedicated private Slack collab channel with us.
The PydanticAI GitHub Issues are a great place to ask questions and give us feedback.
We'd love you to contribute to PydanticAI!
Clone your fork and cd into the repo directory
git clone [email protected]:<your username>/pydantic-ai.git
cd pydantic-ai
Install uv (version 0.4.30 or later) and pre-commit.
We use pipx here; for other options, see the uv and pre-commit installation docs. To get pipx itself, see these docs.
pipx install uv pre-commit
Install pydantic-ai, all dependencies and pre-commit hooks:
make install
We use make to manage most commands you'll need to run.
For details on available commands, run:
make help
To run code formatting, linting, static type checks, and tests with coverage report generation, run:
make
To run the documentation page locally, run:
uv run mkdocs serve
To avoid an excessive workload for the maintainers of PydanticAI, we can't accept all model contributions, so we're setting the following rules for when we'll accept new models and when we won't. This should hopefully reduce the chances of disappointment and wasted work.
- To add a new model with an extra dependency, that dependency needs > 500k monthly downloads from PyPI consistently over 3 months or more
- To add a new model which uses another model's logic internally and has no extra dependencies, that model's GitHub org needs > 20k stars in total
- For any other model that's just a custom URL and API key, we're happy to add a one-paragraph description with a link and instructions on the URL to use
- For any other model that requires more logic, we recommend you release your own Python package pydantic-ai-xxx, which depends on pydantic-ai-slim and implements a model that inherits from our [Model][pydantic_ai.models.Model] ABC
If you're unsure about adding a model, please create an issue.
PydanticAI is model-agnostic and has built-in support for the following model providers:
- OpenAI
- Anthropic
- Gemini via two different APIs: Generative Language API and VertexAI API
- Ollama
- Groq
- Mistral
You can also add support for other models.
PydanticAI also comes with TestModel and FunctionModel for testing and development.
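For example, a minimal sketch of swapping in TestModel during tests so no real provider is called (mirroring the TestModel usage shown earlier):
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-4o')

# In a test, override the model at run time so no API key or network access is needed.
result = agent.run_sync('hello', model=TestModel())
print(result.data)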
To use each model provider, you need to configure your local environment and make sure you have the right packages installed.
To use OpenAI models, you need to either install pydantic-ai, or install pydantic-ai-slim with the openai optional group:
pip/uv-add 'pydantic-ai-slim[openai]'
To use [OpenAIModel
][pydantic_ai.models.openai.OpenAIModel] through their main API, go to platform.openai.com and follow your nose until you find the place to generate an API key.
Once you have the API key, you can set it as an environment variable:
export OPENAI_API_KEY='your-api-key'
You can then use [OpenAIModel
][pydantic_ai.models.openai.OpenAIModel] by name:
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o')
...
Or initialise the model directly with just the model name:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
model = OpenAIModel('gpt-4o')
agent = Agent(model)
...
If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key
argument][pydantic_ai.models.openai.OpenAIModel.init]:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
model = OpenAIModel('gpt-4o', api_key='your-api-key')
agent = Agent(model)
...
To use another OpenAI-compatible API, such as OpenRouter, you can make use of the [base_url
argument][pydantic_ai.models.openai.OpenAIModel.init]:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
model = OpenAIModel(
'anthropic/claude-3.5-sonnet',
base_url='https://openrouter.ai/api/v1',
api_key='your-api-key',
)
agent = Agent(model)
...
OpenAIModel also accepts a custom AsyncOpenAI client via the [openai_client parameter][pydantic_ai.models.openai.OpenAIModel.init], so you can customise the organization, project, base_url etc. as defined in the OpenAI API docs.
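For example, a rough sketch of passing a pre-configured AsyncOpenAI client (the organization and project values are placeholders):
from openai import AsyncOpenAI

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

client = AsyncOpenAI(
    api_key='your-api-key',
    organization='your-org-id',  # placeholder
    project='your-project-id',  # placeholder
)
model = OpenAIModel('gpt-4o', openai_client=client)
agent = Agent(model)
...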
You could also use the AsyncAzureOpenAI client to use the Azure OpenAI API.
from openai import AsyncAzureOpenAI
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
client = AsyncAzureOpenAI(
azure_endpoint='...',
api_version='2024-07-01-preview',
api_key='your-api-key',
)
model = OpenAIModel('gpt-4o', openai_client=client)
agent = Agent(model)
...
To use [AnthropicModel][pydantic_ai.models.anthropic.AnthropicModel] models, you need to either install pydantic-ai, or install pydantic-ai-slim with the anthropic optional group:
pip/uv-add 'pydantic-ai-slim[anthropic]'
To use Anthropic through their API, go to console.anthropic.com/settings/keys to generate an API key.
[AnthropicModelName
][pydantic_ai.models.anthropic.AnthropicModelName] contains a list of available Anthropic models.
Once you have the API key, you can set it as an environment variable:
export ANTHROPIC_API_KEY='your-api-key'
You can then use [AnthropicModel
][pydantic_ai.models.anthropic.AnthropicModel] by name:
from pydantic_ai import Agent
agent = Agent('claude-3-5-sonnet-latest')
...
Or initialise the model directly with just the model name:
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
model = AnthropicModel('claude-3-5-sonnet-latest')
agent = Agent(model)
...
If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key
argument][pydantic_ai.models.anthropic.AnthropicModel.init]:
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
model = AnthropicModel('claude-3-5-sonnet-latest', api_key='your-api-key')
agent = Agent(model)
...
!!! warning "For prototyping only" Google themselves refer to this API as the "hobby" API, I've received 503 responses from it a number of times. The API is easy to use and useful for prototyping and simple demos, but I would not rely on it in production.
If you want to run Gemini models in production, you should use the [VertexAI API](#gemini-via-vertexai) described below.
To use [GeminiModel][pydantic_ai.models.gemini.GeminiModel] models, you just need to install pydantic-ai or pydantic-ai-slim; no extra dependencies are required.
[GeminiModel][pydantic_ai.models.gemini.GeminiModel] lets you use Google's Gemini models through their Generative Language API, generativelanguage.googleapis.com.
[GeminiModelName
][pydantic_ai.models.gemini.GeminiModelName] contains a list of available Gemini models that can be used through this interface.
To use GeminiModel
, go to aistudio.google.com and follow your nose until you find the place to generate an API key.
Once you have the API key, you can set it as an environment variable:
export GEMINI_API_KEY=your-api-key
You can then use [GeminiModel
][pydantic_ai.models.gemini.GeminiModel] by name:
from pydantic_ai import Agent
agent = Agent('gemini-1.5-flash')
...
Or initialise the model directly with just the model name:
from pydantic_ai import Agent
from pydantic_ai.models.gemini import GeminiModel
model = GeminiModel('gemini-1.5-flash')
agent = Agent(model)
...
If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key
argument][pydantic_ai.models.gemini.GeminiModel.init]:
from pydantic_ai import Agent
from pydantic_ai.models.gemini import GeminiModel
model = GeminiModel('gemini-1.5-flash', api_key='your-api-key')
agent = Agent(model)
...
To run Google's Gemini models in production, you should use [VertexAIModel
][pydantic_ai.models.vertexai.VertexAIModel] which uses the *-aiplatform.googleapis.com
API.
[GeminiModelName
][pydantic_ai.models.gemini.GeminiModelName] contains a list of available Gemini models that can be used through this interface.
To use [VertexAIModel][pydantic_ai.models.vertexai.VertexAIModel], you need to either install pydantic-ai, or install pydantic-ai-slim with the vertexai optional group:
pip/uv-add 'pydantic-ai-slim[vertexai]'
This interface has a number of advantages over generativelanguage.googleapis.com documented above:
- The VertexAI API is more reliable and has marginally lower latency in our experience.
- You can purchase provisioned throughput with VertexAI to guarantee capacity.
- If you're running PydanticAI inside GCP, you don't need to set up authentication; it should "just work".
- You can decide which region to use, which might be important from a regulatory perspective, and might improve latency.
The big disadvantage is that for local development you may need to create and configure a "service account", which I've found extremely painful to get right in the past.
Whichever way you authenticate, you'll need to have VertexAI enabled in your GCP account.
Luckily, if you're running PydanticAI inside GCP, or you have the gcloud CLI installed and configured, you should be able to use VertexAIModel without any additional setup.
To use VertexAIModel with application default credentials configured (e.g. with gcloud), you can simply use:
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel('gemini-1.5-flash')
agent = Agent(model)
...
Internally this uses google.auth.default()
from the google-auth
package to obtain credentials.
!!! note "Won't fail until agent.run()
"
Because google.auth.default()
requires network requests and can be slow, it's not run until you call agent.run()
. Meaning any configuration or permissions error will only be raised when you try to use the model. To for this check to be run, call [await model.ainit()
][pydantic_ai.models.vertexai.VertexAIModel.ainit].
You may also need to pass the [project_id argument to VertexAIModel][pydantic_ai.models.vertexai.VertexAIModel.init] if application default credentials don't set a project. If you pass project_id and it conflicts with the project set by application default credentials, an error is raised.
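A minimal sketch of passing an explicit project (the project ID is a placeholder):
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel

model = VertexAIModel('gemini-1.5-flash', project_id='my-gcp-project')  # placeholder project ID
agent = Agent(model)
...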
If instead of application default credentials, you want to authenticate with a service account, you'll need to create a service account, add it to your GCP project (note: AFAIK this step is necessary even if you created the service account within the project), give that service account the "Vertex AI Service Agent" role, and download the service account JSON file.
Once you have the JSON file, you can use it thus:
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel(
'gemini-1.5-flash',
service_account_file='path/to/service-account.json',
)
agent = Agent(model)
...
Whichever way you authenticate, you can specify which region requests will be sent to via the [region
argument][pydantic_ai.models.vertexai.VertexAIModel.init].
Using a region close to your application can improve latency and might be important from a regulatory perspective.
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel('gemini-1.5-flash', region='asia-east1')
agent = Agent(model)
...
[VertexAiRegion
][pydantic_ai.models.vertexai.VertexAiRegion] contains a list of available regions.
To use [OllamaModel
][pydantic_ai.models.ollama.OllamaModel], you need to either install pydantic-ai
, or install pydantic-ai-slim
with the openai
optional group:
pip/uv-add 'pydantic-ai-slim[openai]'
This is because internally, OllamaModel
uses the OpenAI API.
To use Ollama, you must first download the Ollama client, and then download a model using the Ollama model library.
You must also ensure the Ollama server is running when trying to make requests to it. For more information, please see the Ollama documentation.
For detailed setup and example, please see the Ollama setup documentation.
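As a minimal sketch, assuming you have already pulled a model such as llama3.2 and the local Ollama server is running:
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel
model = OllamaModel('llama3.2')  # the name must match a model you've pulled locally
agent = Agent(model)
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)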
To use [GroqModel
][pydantic_ai.models.groq.GroqModel], you need to either install pydantic-ai
, or install pydantic-ai-slim
with the groq
optional group:
pip/uv-add 'pydantic-ai-slim[groq]'
To use Groq through their API, go to console.groq.com/keys and follow your nose until you find the place to generate an API key.
[GroqModelName
][pydantic_ai.models.groq.GroqModelName] contains a list of available Groq models.
Once you have the API key, you can set it as an environment variable:
export GROQ_API_KEY='your-api-key'
You can then use [GroqModel
][pydantic_ai.models.groq.GroqModel] by name:
from pydantic_ai import Agent
agent = Agent('groq:llama-3.1-70b-versatile')
...
Or initialise the model directly with just the model name:
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel
model = GroqModel('llama-3.1-70b-versatile')
agent = Agent(model)
...
If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key
argument][pydantic_ai.models.groq.GroqModel.init]:
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel
model = GroqModel('llama-3.1-70b-versatile', api_key='your-api-key')
agent = Agent(model)
...
To use [MistralModel
][pydantic_ai.models.mistral.MistralModel], you need to either install pydantic-ai
, or install pydantic-ai-slim
with the mistral
optional group:
pip/uv-add 'pydantic-ai-slim[mistral]'
To use Mistral through their API, go to console.mistral.ai/api-keys/ and follow your nose until you find the place to generate an API key.
[NamedMistralModels
][pydantic_ai.models.mistral.NamedMistralModels] contains a list of the most popular Mistral models.
Once you have the API key, you can set it as an environment variable:
export MISTRAL_API_KEY='your-api-key'
You can then use [MistralModel
][pydantic_ai.models.mistral.MistralModel] by name:
from pydantic_ai import Agent
agent = Agent('mistral:mistral-large-latest')
...
Or initialise the model directly with just the model name:
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel
model = MistralModel('mistral-small-latest')
agent = Agent(model)
...
If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key
argument][pydantic_ai.models.mistral.MistralModel.init]:
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel
model = MistralModel('mistral-small-latest', api_key='your-api-key')
agent = Agent(model)
...
To implement support for models not already supported, you will need to subclass the [Model
][pydantic_ai.models.Model] abstract base class.
This in turn will require you to implement the following other abstract base classes:
- [AgentModel][pydantic_ai.models.AgentModel]
- [StreamTextResponse][pydantic_ai.models.StreamTextResponse]
- [StreamStructuredResponse][pydantic_ai.models.StreamStructuredResponse]
The best place to start is to review the source code for existing implementations, e.g. OpenAIModel
.
For details on when we'll accept contributions adding new models to PydanticAI, see the contributing guidelines.
--8<-- "docs/.partials/index-header.html"
PydanticAI is a Python Agent Framework designed to make it less painful to build production grade applications with Generative AI.
FastAPI revolutionized web development by offering an innovative and ergonomic design, built on the foundation of Pydantic.
Similarly, virtually every agent framework and LLM library in Python uses Pydantic, yet when we began to use LLMs in Pydantic Logfire, we couldn't find anything that gave us the same feeling.
We built PydanticAI with one simple aim: to bring that FastAPI feeling to GenAI app development.
:material-account-group:{ .md .middle .team-blue } Built by the Pydantic Team
Built by the team behind Pydantic (the validation layer of the OpenAI SDK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more).
:fontawesome-solid-shapes:{ .md .middle .shapes-orange } Model-agnostic
Supports OpenAI, Anthropic, Gemini, Ollama, Groq, and Mistral, and there is a simple interface to implement support for other models.
:logfire-logo:{ .md .middle } Pydantic Logfire Integration
Seamlessly integrates with Pydantic Logfire for real-time debugging, performance monitoring, and behavior tracking of your LLM-powered applications.
:material-shield-check:{ .md .middle .secure-green } Type-safe
Designed to make type checking as useful as possible for you, so it integrates well with static type checkers, like mypy
and pyright
.
🐍{ .md .middle } Python-centric Design
Leverages Python’s familiar control flow and agent composition to build your AI-driven projects, making it easy to apply standard Python best practices you'd use in any other (non-AI) project
:simple-pydantic:{ .md .middle .pydantic-pink } Structured Responses
Harnesses the power of Pydantic to validate and structure model outputs, ensuring responses are consistent across runs.
:material-puzzle-plus:{ .md .middle .puzzle-purple } Dependency Injection System
Offers an optional dependency injection system to provide data and services to your agent's system prompts, tools and result validators.
This is useful for testing and eval-driven iterative development.
:material-sine-wave:{ .md .middle } Streamed Responses
Provides the ability to stream LLM outputs continuously, with immediate validation, ensuring rapid and accurate results.
!!! example "In Beta" PydanticAI is in early beta, the API is still subject to change and there's a lot more to do. Feedback is very welcome!
Here's a minimal example of PydanticAI:
from pydantic_ai import Agent
agent = Agent( # (1)!
'gemini-1.5-flash',
system_prompt='Be concise, reply with one sentence.', # (2)!
)
result = agent.run_sync('Where does "hello world" come from?') # (3)!
print(result.data)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
- We configure the agent to use Gemini 1.5's Flash model, but you can also set the model when running the agent.
- Register a static system prompt using a keyword argument to the agent.
- Run the agent synchronously, conducting a conversation with the LLM.
(This example is complete, it can be run "as is")
The exchange should be very short: PydanticAI will send the system prompt and the user query to the LLM, the model will return a text response.
Not very interesting yet, but we can easily add "tools", dynamic system prompts, and structured responses to build more powerful agents.
Here is a concise example using PydanticAI to build a support agent for a bank:
from dataclasses import dataclass
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext
from bank_database import DatabaseConn
@dataclass
class SupportDependencies: # (3)!
customer_id: int
db: DatabaseConn # (12)!
class SupportResult(BaseModel): # (13)!
support_advice: str = Field(description='Advice returned to the customer')
block_card: bool = Field(description="Whether to block the customer's card")
risk: int = Field(description='Risk level of query', ge=0, le=10)
support_agent = Agent( # (1)!
'openai:gpt-4o', # (2)!
deps_type=SupportDependencies,
result_type=SupportResult, # (9)!
system_prompt=( # (4)!
'You are a support agent in our bank, give the '
'customer support and judge the risk level of their query.'
),
)
@support_agent.system_prompt # (5)!
async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str:
customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id)
return f"The customer's name is {customer_name!r}"
@support_agent.tool # (6)!
async def customer_balance(
ctx: RunContext[SupportDependencies], include_pending: bool
) -> float:
"""Returns the customer's current account balance.""" # (7)!
return await ctx.deps.db.customer_balance(
id=ctx.deps.customer_id,
include_pending=include_pending,
)
... # (11)!
async def main():
deps = SupportDependencies(customer_id=123, db=DatabaseConn())
result = await support_agent.run('What is my balance?', deps=deps) # (8)!
print(result.data) # (10)!
"""
support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1
"""
result = await support_agent.run('I just lost my card!', deps=deps)
print(result.data)
"""
support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8
"""
- This agent will act as first-tier support in a bank. Agents are generic in the type of dependencies they accept and the type of result they return. In this case, the support agent has type
#!python Agent[SupportDependencies, SupportResult]
. - Here we configure the agent to use OpenAI's GPT-4o model, you can also set the model when running the agent.
- The
SupportDependencies
dataclass is used to pass data, connections, and logic into the model that will be needed when running system prompt and tool functions. PydanticAI's system of dependency injection provides a type-safe way to customise the behavior of your agents, and can be especially useful when running unit tests and evals. - Static system prompts can be registered with the [
system_prompt
keyword argument][pydantic_ai.Agent.init] to the agent. - Dynamic system prompts can be registered with the [
@agent.system_prompt
][pydantic_ai.Agent.system_prompt] decorator, and can make use of dependency injection. Dependencies are carried via the [RunContext
][pydantic_ai.tools.RunContext] argument, which is parameterized with the deps_type
from above. If the type annotation here is wrong, static type checkers will catch it. - The [tool][pydantic_ai.Agent.tool] decorator lets you register functions which the LLM may call while responding to a user. Again, dependencies are carried via [RunContext
][pydantic_ai.tools.RunContext], any other arguments become the tool schema passed to the LLM. Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry. - The docstring of a tool is also passed to the LLM as the description of the tool. Parameter descriptions are extracted from the docstring and added to the parameter schema sent to the LLM.
- Run the agent asynchronously, conducting a conversation with the LLM until a final response is reached. Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve a result.
- The response from the agent will be guaranteed to be a
SupportResult
, if validation fails reflection will mean the agent is prompted to try again. - The result will be validated with Pydantic to guarantee it is a
SupportResult
, since the agent is generic, it'll also be typed as aSupportResult
to aid with static type checking. - In a real use case, you'd add more tools and a longer system prompt to the agent to extend the context it's equipped with and support it can provide.
- This is a simple sketch of a database connection, used to keep the example short and readable. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers.
- This Pydantic model is used to constrain the structured data returned by the agent. From this simple definition, Pydantic builds the JSON Schema that tells the LLM how to return the data, and performs validation to guarantee the data is correct at the end of the run.
!!! tip "Complete bank_support.py
example"
The code included here is incomplete for the sake of brevity (the definition of DatabaseConn
is missing); you can find the complete bank_support.py
example here.
To understand the flow of the above runs, we can watch the agent in action using Pydantic Logfire.
To do this, we need to set up logfire, and add the following to our code:
...
from bank_database import DatabaseConn
import logfire
logfire.configure() # (1)!
logfire.instrument_asyncpg() # (2)!
...
- Configure logfire; this will fail if no project is set up.
- In our demo,
DatabaseConn
usesasyncpg
to connect to a PostgreSQL database, sologfire.instrument_asyncpg()
is used to log the database queries.
That's enough to get the following view of your agent in action:
{{ video('9078b98c4f75d01f912a0368bbbdb97a', 25, 55) }}
See Monitoring and Performance to learn more.
To try PydanticAI yourself, follow the instructions in the examples.
Read the docs to learn more about building applications with PydanticAI.
Read the API Reference to understand PydanticAI's interface.
Results are the final values returned from running an agent.
The result values are wrapped in [RunResult
][pydantic_ai.result.RunResult] and [StreamedRunResult
][pydantic_ai.result.StreamedRunResult] so you can access other data like [usage][pydantic_ai.result.Usage] of the run and message history
Both RunResult
and StreamedRunResult
are generic in the data they wrap, so typing information about the data returned by the agent is preserved.
from pydantic import BaseModel
from pydantic_ai import Agent
class CityLocation(BaseModel):
city: str
country: str
agent = Agent('gemini-1.5-flash', result_type=CityLocation)
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.usage())
"""
Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65, details=None)
"""
(This example is complete, it can be run "as is")
Runs end when either a plain text response is received or the model calls a tool associated with one of the structured result types. We will add limits to make sure a run doesn't go on indefinitely, see #70.
When the result type is str
, or a union including str
, plain text responses are enabled on the model, and the raw text response from the model is used as the response data.
If the result type is a union with multiple members (after removing str
from the members), each member is registered as a separate tool with the model in order to reduce the complexity of the tool schemas and maximise the chances a model will respond correctly.
If the result type schema is not of type "object"
, the result type is wrapped in a single element object, so the schema of all tools registered with the model are object schemas.
Structured results (like tools) use Pydantic to build the JSON schema used for the tool, and to validate the data returned by the model.
!!! note "Bring on PEP-747"
Until PEP-747 "Annotating Type Forms" lands, unions are not valid as type
s in Python.
When creating the agent we need to `# type: ignore` the `result_type` argument, and add a type hint to tell type checkers about the type of the agent.
Here's an example of returning either text or a structured value
from typing import Union
from pydantic import BaseModel
from pydantic_ai import Agent
class Box(BaseModel):
width: int
height: int
depth: int
units: str
agent: Agent[None, Union[Box, str]] = Agent(
'openai:gpt-4o-mini',
result_type=Union[Box, str], # type: ignore
system_prompt=(
"Extract me the dimensions of a box, "
"if you can't extract all data, ask the user to try again."
),
)
result = agent.run_sync('The box is 10x20x30')
print(result.data)
#> Please provide the units for the dimensions (e.g., cm, in, m).
result = agent.run_sync('The box is 10x20x30 cm')
print(result.data)
#> width=10 height=20 depth=30 units='cm'
(This example is complete, it can be run "as is")
Here's an example of using a union return type which registers multiple tools, and wraps non-object schemas in an object:
from typing import Union
from pydantic_ai import Agent
agent: Agent[None, Union[list[str], list[int]]] = Agent(
'openai:gpt-4o-mini',
result_type=Union[list[str], list[int]], # type: ignore
system_prompt='Extract either colors or sizes from the shapes provided.',
)
result = agent.run_sync('red square, blue circle, green triangle')
print(result.data)
#> ['red', 'blue', 'green']
result = agent.run_sync('square size 10, circle size 20, triangle size 30')
print(result.data)
#> [10, 20, 30]
(This example is complete, it can be run "as is")
Some validation is inconvenient or impossible to do in Pydantic validators, in particular when the validation requires IO and is asynchronous. PydanticAI provides a way to add validation functions via the [agent.result_validator
][pydantic_ai.Agent.result_validator] decorator.
Here's a simplified variant of the SQL Generation example:
from typing import Union
from fake_database import DatabaseConn, QueryError
from pydantic import BaseModel
from pydantic_ai import Agent, RunContext, ModelRetry
class Success(BaseModel):
sql_query: str
class InvalidRequest(BaseModel):
error_message: str
Response = Union[Success, InvalidRequest]
agent: Agent[DatabaseConn, Response] = Agent(
'gemini-1.5-flash',
result_type=Response, # type: ignore
deps_type=DatabaseConn,
system_prompt='Generate PostgreSQL flavored SQL queries based on user input.',
)
@agent.result_validator
async def validate_result(ctx: RunContext[DatabaseConn], result: Response) -> Response:
if isinstance(result, InvalidRequest):
return result
try:
await ctx.deps.execute(f'EXPLAIN {result.sql_query}')
except QueryError as e:
raise ModelRetry(f'Invalid query: {e}') from e
else:
return result
result = agent.run_sync(
'get me users who were last active yesterday.', deps=DatabaseConn()
)
print(result.data)
#> sql_query='SELECT * FROM users WHERE last_active::date = today() - interval 1 day'
(This example is complete, it can be run "as is")
There are two main challenges with streamed results:
- Validating structured responses before they're complete, this is achieved by "partial validation" which was recently added to Pydantic in pydantic/pydantic#10748.
- When receiving a response, we don't know if it's the final response without starting to stream it and peeking at the content. PydanticAI streams just enough of the response to sniff out if it's a tool call or a result, then streams the whole thing and calls tools, or returns the stream as a [
StreamedRunResult
][pydantic_ai.result.StreamedRunResult].
Example of streamed text result:
from pydantic_ai import Agent
agent = Agent('gemini-1.5-flash') # (1)!
async def main():
async with agent.run_stream('Where does "hello world" come from?') as result: # (2)!
async for message in result.stream(): # (3)!
print(message)
#> The first known
#> The first known use of "hello,
#> The first known use of "hello, world" was in
#> The first known use of "hello, world" was in a 1974 textbook
#> The first known use of "hello, world" was in a 1974 textbook about the C
#> The first known use of "hello, world" was in a 1974 textbook about the C programming language.
- Streaming works with the standard [
Agent
][pydantic_ai.Agent] class, and doesn't require any special setup, just a model that supports streaming (currently all models support streaming). - The [
Agent.run_stream()
][pydantic_ai.Agent.run_stream] method is used to start a streamed run, this method returns a context manager so the connection can be closed when the stream completes. - Each item yielded by [
StreamedRunResult.stream()
][pydantic_ai.result.StreamedRunResult.stream] is the complete text response, extended as new data is received.
(This example is complete, it can be run "as is")
We can also stream text as deltas rather than the entire text in each item:
from pydantic_ai import Agent
agent = Agent('gemini-1.5-flash')
async def main():
async with agent.run_stream('Where does "hello world" come from?') as result:
async for message in result.stream_text(delta=True): # (1)!
print(message)
#> The first known
#> use of "hello,
#> world" was in
#> a 1974 textbook
#> about the C
#> programming language.
- [
stream_text
][pydantic_ai.result.StreamedRunResult.stream_text] will error if the response is not text
(This example is complete, it can be run "as is")
!!! warning "Result message not included in messages
"
The final result message will NOT be added to result messages if you use .stream_text(delta=True)
,
see Messages and chat history for more information.
Not all types are supported with partial validation in Pydantic, see pydantic/pydantic#10748; generally for model-like structures it's currently best to use TypedDict
.
Here's an example of streaming a user profile as it's built:
from datetime import date
from typing_extensions import TypedDict
from pydantic_ai import Agent
class UserProfile(TypedDict, total=False):
name: str
dob: date
bio: str
agent = Agent(
'openai:gpt-4o',
result_type=UserProfile,
system_prompt='Extract a user profile from the input',
)
async def main():
user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'
async with agent.run_stream(user_input) as result:
async for profile in result.stream():
print(profile)
#> {'name': 'Ben'}
#> {'name': 'Ben'}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
(This example is complete, it can be run "as is")
If you want fine-grained control of validation, particularly catching validation errors, you can use the following pattern:
from datetime import date
from pydantic import ValidationError
from typing_extensions import TypedDict
from pydantic_ai import Agent
class UserProfile(TypedDict, total=False):
name: str
dob: date
bio: str
agent = Agent('openai:gpt-4o', result_type=UserProfile)
async def main():
user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'
async with agent.run_stream(user_input) as result:
async for message, last in result.stream_structured(debounce_by=0.01): # (1)!
try:
profile = await result.validate_structured_result( # (2)!
message,
allow_partial=not last,
)
except ValidationError:
continue
print(profile)
#> {'name': 'Ben'}
#> {'name': 'Ben'}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
#> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
- [
stream_structured
][pydantic_ai.result.StreamedRunResult.stream_structured] streams the data as [ModelResponse
][pydantic_ai.messages.ModelResponse] objects, thus iteration can't fail with aValidationError
. - [
validate_structured_result
][pydantic_ai.result.StreamedRunResult.validate_structured_result] validates the data,allow_partial=True
enables pydantic's [experimental_allow_partial
flag onTypeAdapter
][pydantic.type_adapter.TypeAdapter.validate_json].
(This example is complete, it can be run "as is")
The following examples demonstrate how to use streamed responses in PydanticAI:
With PydanticAI and LLM integrations in general, there are two distinct kinds of test:
- Unit tests — tests of your application code, and whether it's behaving correctly
- Evals — tests of the LLM, and how good or bad its responses are
For the most part, these two kinds of tests have pretty separate goals and considerations.
Unit tests for PydanticAI code are just like unit tests for any other Python code.
Because for the most part they're nothing new, we have pretty well established tools and patterns for writing and running these kinds of tests.
Unless you're really sure you know better, you'll probably want to follow roughly this strategy:
- Use
pytest
as your test harness - If you find yourself typing out long assertions, use inline-snapshot
- Similarly, dirty-equals can be useful for comparing large data structures
- Use [
TestModel
][pydantic_ai.models.test.TestModel] or [FunctionModel
][pydantic_ai.models.function.FunctionModel] in place of your actual model to avoid the usage, latency and variability of real LLM calls - Use [
Agent.override
][pydantic_ai.agent.Agent.override] to replace your model inside your application logic - Set [
ALLOW_MODEL_REQUESTS=False
][pydantic_ai.models.ALLOW_MODEL_REQUESTS] globally to block any requests from being made to non-test models accidentally
The simplest and fastest way to exercise most of your application code is using [TestModel
][pydantic_ai.models.test.TestModel], this will (by default) call all tools in the agent, then return either plain text or a structured response depending on the return type of the agent.
!!! note "TestModel
is not magic"
The "clever" (but not too clever) part of TestModel
is that it will attempt to generate valid structured data for function tools and result types based on the schema of the registered tools.
There's no ML or AI in `TestModel`, it's just plain old procedural Python code that tries to generate data that satisfies the JSON schema of a tool.
The resulting data won't look pretty or relevant, but it should pass Pydantic's validation in most cases.
If you want something more sophisticated, use [`FunctionModel`][pydantic_ai.models.function.FunctionModel] and write your own data generation logic.
Let's write unit tests for the following application code:
import asyncio
from datetime import date
from pydantic_ai import Agent, RunContext
from fake_database import DatabaseConn # (1)!
from weather_service import WeatherService # (2)!
weather_agent = Agent(
'openai:gpt-4o',
deps_type=WeatherService,
system_prompt='Providing a weather forecast at the locations the user provides.',
)
@weather_agent.tool
def weather_forecast(
ctx: RunContext[WeatherService], location: str, forecast_date: date
) -> str:
if forecast_date < date.today(): # (3)!
return ctx.deps.get_historic_weather(location, forecast_date)
else:
return ctx.deps.get_forecast(location, forecast_date)
async def run_weather_forecast( # (4)!
user_prompts: list[tuple[str, int]], conn: DatabaseConn
):
"""Run weather forecast for a list of user prompts and save."""
async with WeatherService() as weather_service:
async def run_forecast(prompt: str, user_id: int):
result = await weather_agent.run(prompt, deps=weather_service)
await conn.store_forecast(user_id, result.data)
# run all prompts in parallel
await asyncio.gather(
*(run_forecast(prompt, user_id) for (prompt, user_id) in user_prompts)
)
- DatabaseConn is a class that holds a database connection
- WeatherService has methods to get weather forecasts and historic data about the weather
- We need to call a different endpoint depending on whether the date is in the past or the future, you'll see why this nuance is important below
- This function is the code we want to test, together with the agent it uses
Here we have a function that takes a list of #!python (user_prompt, user_id)
tuples, gets a weather forecast for each prompt, and stores the result in the database.
We want to test this code without having to mock certain objects or modify our code so we can pass test objects in.
Here's how we would write tests using [TestModel
][pydantic_ai.models.test.TestModel]:
from datetime import timezone
import pytest
from dirty_equals import IsNow
from pydantic_ai import models, capture_run_messages
from pydantic_ai.models.test import TestModel
from pydantic_ai.messages import (
ArgsDict,
ModelResponse,
SystemPromptPart,
TextPart,
ToolCallPart,
ToolReturnPart,
UserPromptPart,
ModelRequest,
)
from fake_database import DatabaseConn
from weather_app import run_weather_forecast, weather_agent
pytestmark = pytest.mark.anyio # (1)!
models.ALLOW_MODEL_REQUESTS = False # (2)!
async def test_forecast():
conn = DatabaseConn()
user_id = 1
with capture_run_messages() as messages:
with weather_agent.override(model=TestModel()): # (3)!
prompt = 'What will the weather be like in London on 2024-11-28?'
await run_weather_forecast([(prompt, user_id)], conn) # (4)!
forecast = await conn.get_forecast(user_id)
assert forecast == '{"weather_forecast":"Sunny with a chance of rain"}' # (5)!
assert messages == [ # (6)!
ModelRequest(
parts=[
SystemPromptPart(
content='Providing a weather forecast at the locations the user provides.',
),
UserPromptPart(
content='What will the weather be like in London on 2024-11-28?',
timestamp=IsNow(tz=timezone.utc), # (7)!
),
]
),
ModelResponse(
parts=[
ToolCallPart(
tool_name='weather_forecast',
args=ArgsDict(
args_dict={
'location': 'a',
'forecast_date': '2024-01-01', # (8)!
}
),
tool_call_id=None,
)
],
timestamp=IsNow(tz=timezone.utc),
),
ModelRequest(
parts=[
ToolReturnPart(
tool_name='weather_forecast',
content='Sunny with a chance of rain',
tool_call_id=None,
timestamp=IsNow(tz=timezone.utc),
),
],
),
ModelResponse(
parts=[
TextPart(
content='{"weather_forecast":"Sunny with a chance of rain"}',
)
],
timestamp=IsNow(tz=timezone.utc),
),
]
- We're using anyio to run async tests.
- This is a safety measure to make sure we don't accidentally make real requests to the LLM while testing, see [
ALLOW_MODEL_REQUESTS
][pydantic_ai.models.ALLOW_MODEL_REQUESTS] for more details. - We're using [
Agent.override
][pydantic_ai.agent.Agent.override] to replace the agent's model with [TestModel
][pydantic_ai.models.test.TestModel]; the nice thing about override is that we can replace the model inside the agent without needing access to the agent run* methods' call site. - Now we call the function we want to test inside the
override
context manager. - But default,
TestModel
will return a JSON string summarising the tools calls made, and what was returned. If you wanted to customise the response to something more closely aligned with the domain, you could add [custom_result_text='Sunny'
][pydantic_ai.models.test.TestModel.custom_result_text] when definingTestModel
. - So far we don't actually know which tools were called and with which values, we can use [
capture_run_messages
][pydantic_ai.capture_run_messages] to inspect messages from the most recent run and assert the exchange between the agent and the model occurred as expected. - The [
IsNow
][dirty_equals.IsNow] helper allows us to use declarative asserts even with data which will contain timestamps that change over time. - TestModel
isn't doing anything clever to extract values from the prompt, so these values are hardcoded.
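Here's the sketch referenced above: a minimal variant of the override that customises TestModel's canned response text (the string is arbitrary):
from pydantic_ai.models.test import TestModel
from weather_app import weather_agent
async def test_forecast_custom_text():
    with weather_agent.override(model=TestModel(custom_result_text='Sunny')):
        ...  # call run_weather_forecast exactly as in the test above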
The above tests are a great start, but careful readers will notice that the WeatherService.get_forecast
is never called since TestModel
calls weather_forecast
with a date in the past.
To fully exercise weather_forecast
, we need to use [FunctionModel
][pydantic_ai.models.function.FunctionModel] to customise how the tool is called.
Here's an example of using FunctionModel
to test the weather_forecast
tool with custom inputs
import re
import pytest
from pydantic_ai import models
from pydantic_ai.messages import (
ModelMessage,
ModelResponse,
ToolCallPart,
)
from pydantic_ai.models.function import AgentInfo, FunctionModel
from fake_database import DatabaseConn
from weather_app import run_weather_forecast, weather_agent
pytestmark = pytest.mark.anyio
models.ALLOW_MODEL_REQUESTS = False
def call_weather_forecast( # (1)!
messages: list[ModelMessage], info: AgentInfo
) -> ModelResponse:
if len(messages) == 1:
# first call, call the weather forecast tool
user_prompt = messages[0].parts[-1]
m = re.search(r'\d{4}-\d{2}-\d{2}', user_prompt.content)
assert m is not None
args = {'location': 'London', 'forecast_date': m.group()} # (2)!
return ModelResponse(
parts=[ToolCallPart.from_raw_args('weather_forecast', args)]
)
else:
# second call, return the forecast
msg = messages[-1].parts[0]
assert msg.part_kind == 'tool-return'
return ModelResponse.from_text(f'The forecast is: {msg.content}')
async def test_forecast_future():
conn = DatabaseConn()
user_id = 1
with weather_agent.override(model=FunctionModel(call_weather_forecast)): # (3)!
prompt = 'What will the weather be like in London on 2032-01-01?'
await run_weather_forecast([(prompt, user_id)], conn)
forecast = await conn.get_forecast(user_id)
assert forecast == 'The forecast is: Rainy with a chance of sun'
- We define a function
call_weather_forecast
that will be called byFunctionModel
in place of the LLM, this function has access to the list of [ModelMessage
][pydantic_ai.messages.ModelMessage]s that make up the run, and [AgentInfo
][pydantic_ai.models.function.AgentInfo] which contains information about the agent and the function tools and return tools. - Our function is slightly intelligent in that it tries to extract a date from the prompt, but just hard codes the location.
- We use [
FunctionModel
][pydantic_ai.models.function.FunctionModel] to replace the agent's model with our custom function.
If you're writing lots of tests that all require the model to be overridden, you can use pytest fixtures to override the model with [TestModel
][pydantic_ai.models.test.TestModel] or [FunctionModel
][pydantic_ai.models.function.FunctionModel] in a reusable way.
Here's an example of a fixture that overrides the model with TestModel
:
import pytest
from weather_app import weather_agent
from pydantic_ai.models.test import TestModel
@pytest.fixture
def override_weather_agent():
with weather_agent.override(model=TestModel()):
yield
async def test_forecast(override_weather_agent: None):
...
# test code here
"Evals" refers to evaluating a models performance for a specific application.
!!! danger "Warning" Unlike unit tests, evals are an emerging art/science; anyone who claims to know for sure exactly how your evals should be defined can safely be ignored.
Evals are generally more like benchmarks than unit tests, they never "pass" although they do "fail"; you care mostly about how they change over time.
Since evals need to be run against the real model, they can be slow and expensive to run, so you generally won't want to run them in CI for every commit.
The hardest part of evals is measuring how well the model has performed.
In some cases (e.g. an agent to generate SQL) there are simple, easy to run tests that can be used to measure performance (e.g. is the SQL valid? Does it return the right results? Does it return just the right results?).
In other cases (e.g. an agent that gives advice on quitting smoking) it can be very hard or impossible to make quantitative measures of performance — in the smoking case you'd really need to run a double-blind trial over months, then wait 40 years and observe health outcomes to know if changes to your prompt were an improvement.
There are a few different strategies you can use to measure performance:
- End to end, self-contained tests — like the SQL example, we can test the final result of the agent near-instantly
- Synthetic self-contained tests — writing unit test style checks that the output is as expected, checks like
#!python 'chewing gum' in response
; while these checks might seem simplistic they can be helpful, and one nice characteristic is that it's easy to tell what's wrong when they fail (see the sketch after this list) - LLMs evaluating LLMs — using another model, or even the same model with a different prompt, to evaluate the performance of the agent (like when the class marks each other's homework because the teacher has a hangover); while the downsides and complexities of this approach are obvious, some think it can be a useful tool in the right circumstances
- Evals in prod — measuring the end results of the agent in production, then creating a quantitative measure of performance, so you can easily measure changes over time as you change the prompt or model used, logfire can be extremely useful in this case since you can write a custom query to measure the performance of your agent
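As the sketch referenced above, a synthetic self-contained check can be nothing more than a plain Python assertion over the agent's output; the helper and agent name here are hypothetical:
def check_advice(response: str) -> bool:
    # unit-test-style check: when it fails, it's easy to tell what went wrong
    return 'chewing gum' in response
# hypothetical usage, assuming `advice_agent` is an agent you've defined elsewhere:
# result = advice_agent.run_sync('What should I do instead of smoking?')
# assert check_advice(result.data)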
The system prompt is the developer's primary tool in controlling an agent's behavior, so it's often useful to be able to customise the system prompt and see how performance changes. This is particularly relevant when the system prompt contains a list of examples and you want to understand how changing that list affects the model's performance.
Let's assume we have the following app for running SQL generated from a user prompt (this example omits a lot of details for brevity; see the SQL gen example for more complete code):
import json
from pathlib import Path
from typing import Union
from pydantic_ai import Agent, RunContext
from fake_database import DatabaseConn
class SqlSystemPrompt: # (1)!
def __init__(
self, examples: Union[list[dict[str, str]], None] = None, db: str = 'PostgreSQL'
):
if examples is None:
# if examples aren't provided, load them from file, this is the default
with Path('examples.json').open('rb') as f:
self.examples = json.load(f)
else:
self.examples = examples
self.db = db
def build_prompt(self) -> str: # (2)!
return f"""\
Given the following {self.db} table of records, your job is to
write a SQL query that suits the user's request.
Database schema:
CREATE TABLE records (
...
);
{''.join(self.format_example(example) for example in self.examples)}
"""
@staticmethod
def format_example(example: dict[str, str]) -> str: # (3)!
return f"""\
<example>
<request>{example['request']}</request>
<sql>{example['sql']}</sql>
</example>
"""
sql_agent = Agent(
'gemini-1.5-flash',
deps_type=SqlSystemPrompt,
)
@sql_agent.system_prompt
async def system_prompt(ctx: RunContext[SqlSystemPrompt]) -> str:
return ctx.deps.build_prompt()
async def user_search(user_prompt: str) -> list[dict[str, str]]:
"""Search the database based on the user's prompts."""
... # (4)!
result = await sql_agent.run(user_prompt, deps=SqlSystemPrompt())
conn = DatabaseConn()
return await conn.execute(result.data)
- The
SqlSystemPrompt
class is used to build the system prompt, it can be customised with a list of examples and a database type. We implement this as a separate class passed as a dep to the agent so we can override both the inputs and the logic during evals via dependency injection. - The
build_prompt
method constructs the system prompt from the examples and the database type. - Some people think that LLMs are more likely to generate good responses if examples are formatted as XML, as it's easy to identify the end of a string, see #93.
- In reality, you would have more logic here, making it impractical to run the agent independently of the wider application.
examples.json
looks something like this:
request: show me error records with the tag "foobar"
response: SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags)
{
"examples": [
{
"request": "Show me all records",
"sql": "SELECT * FROM records;"
},
{
"request": "Show me all records from 2021",
"sql": "SELECT * FROM records WHERE date_trunc('year', date) = '2021-01-01';"
},
{
"request": "show me error records with the tag 'foobar'",
"sql": "SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags);"
},
...
]
}
Now we want a way to quantify the success of the SQL generation so we can judge how changes to the agent affect its performance.
We can use [Agent.override
][pydantic_ai.agent.Agent.override] to replace the system prompt with a custom one that uses a subset of examples, and then run the application code (in this case user_search
). We also run the actual SQL from the examples and compare the "correct" result from the example SQL to the SQL generated by the agent. (We compare the results of running the SQL rather than the SQL itself since the SQL might be semantically equivalent but written in a different way).
To get a quantitative measure of performance, we assign points to each run as follows:
- -100 points if the generated SQL is invalid
- -1 point for each row returned by the agent (so returning lots of results is discouraged)
- +5 points for each row returned by the agent that matches the expected result
We use 5-fold cross-validation to judge the performance of the agent using our existing set of examples.
import json
import statistics
from pathlib import Path
from itertools import chain
from fake_database import DatabaseConn, QueryError
from sql_app import sql_agent, SqlSystemPrompt, user_search
async def main():
with Path('examples.json').open('rb') as f:
examples = json.load(f)
# split examples into 5 folds
fold_size = len(examples) // 5
folds = [examples[i : i + fold_size] for i in range(0, len(examples), fold_size)]
conn = DatabaseConn()
scores = []
for i, fold in enumerate(folds, start=1):
fold_score = 0
# build all other folds into a list of examples
other_folds = list(chain(*(f for j, f in enumerate(folds, start=1) if j != i)))
# create a new system prompt with the other fold examples
system_prompt = SqlSystemPrompt(examples=other_folds)
# override the system prompt with the new one
with sql_agent.override(deps=system_prompt):
for case in fold:
try:
agent_results = await user_search(case['request'])
except QueryError as e:
print(f'Fold {i} {case}: {e}')
fold_score -= 100
else:
# get the expected results using the SQL from this case
expected_results = await conn.execute(case['sql'])
agent_ids = [r['id'] for r in agent_results]
# each returned value has a score of -1
fold_score -= len(agent_ids)
expected_ids = {r['id'] for r in expected_results}
# each returned value that matches the expected value has a score of 5
fold_score += 5 * len(set(agent_ids) & expected_ids)
scores.append(fold_score)
overall_score = statistics.mean(scores)
print(f'Overall score: {overall_score:0.2f}')
#> Overall score: 12.00
We can then change the prompt, the model, or the examples and see how the score changes over time.
Agents are PydanticAI's primary interface for interacting with LLMs.
In some use cases a single Agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows.
The [Agent
][pydantic_ai.Agent] class has full API documentation, but conceptually you can think of an agent as a container for:
Component | Description |
---|---|
System prompt(s) | A set of instructions for the LLM written by the developer. |
Function tool(s) | Functions that the LLM may call to get information while generating a response. |
Structured result type | The structured datatype the LLM must return at the end of a run, if specified. |
Dependency type constraint | System prompt functions, tools, and result validators may all use dependencies when they're run. |
LLM model | Optional default LLM model associated with the agent. Can also be specified when running the agent. |
Model Settings | Optional default model settings to help fine tune requests. Can also be specified when running the agent. |
In typing terms, agents are generic in their dependency and result types, e.g., an agent which required dependencies of type #!python Foobar
and returned results of type #!python list[str]
would have type Agent[Foobar, list[str]]
. In practice, you shouldn't need to care about this, it should just mean your IDE can tell you when you have the right type, and if you choose to use static type checking it should work well with PydanticAI.
Here's a toy example of an agent that simulates a roulette wheel:
from pydantic_ai import Agent, RunContext
roulette_agent = Agent( # (1)!
'openai:gpt-4o',
deps_type=int,
result_type=bool,
system_prompt=(
'Use the `roulette_wheel` function to see if the '
'customer has won based on the number they provide.'
),
)
@roulette_agent.tool
async def roulette_wheel(ctx: RunContext[int], square: int) -> str: # (2)!
"""check if the square is a winner"""
return 'winner' if square == ctx.deps else 'loser'
# Run the agent
success_number = 18 # (3)!
result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number)
print(result.data) # (4)!
#> True
result = roulette_agent.run_sync('I bet five is the winner', deps=success_number)
print(result.data)
#> False
- Create an agent, which expects an integer dependency and returns a boolean result. This agent will have type
#!python Agent[int, bool]
. - Define a tool that checks if the square is a winner. Here [
RunContext
][pydantic_ai.tools.RunContext] is parameterized with the dependency typeint
; if you got the dependency type wrong you'd get a typing error. - In reality, you might want to use a random number here e.g.
random.randint(0, 36)
. result.data
will be a boolean indicating if the square is a winner. Pydantic performs the result validation, it'll be typed as abool
since its type is derived from theresult_type
generic parameter of the agent.
!!! tip "Agents are designed for reuse, like FastAPI Apps" Agents are intended to be instantiated once (frequently as module globals) and reused throughout your application, similar to a small [FastAPI][fastapi.FastAPI] app or an [APIRouter][fastapi.APIRouter].
There are three ways to run an agent:
- [
agent.run()
][pydantic_ai.Agent.run] — a coroutine which returns a [RunResult
][pydantic_ai.result.RunResult] containing a completed response - [
agent.run_sync()
][pydantic_ai.Agent.run_sync] — a plain, synchronous function which returns a [RunResult
][pydantic_ai.result.RunResult] containing a completed response (internally, this just callsloop.run_until_complete(self.run())
) - [
agent.run_stream()
][pydantic_ai.Agent.run_stream] — a coroutine which returns a [StreamedRunResult
][pydantic_ai.result.StreamedRunResult], which contains methods to stream a response as an async iterable
Here's a simple example demonstrating all three:
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o')
result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.data)
#> Rome
async def main():
result = await agent.run('What is the capital of France?')
print(result.data)
#> Paris
async with agent.run_stream('What is the capital of the UK?') as response:
print(await response.get_data())
#> London
(This example is complete, it can be run "as is")
You can also pass messages from previous runs to continue a conversation or provide context, as described in Messages and Chat History.
PydanticAI offers a [UsageLimits
][pydantic_ai.usage.UsageLimits] structure to help you limit your
usage (tokens and/or requests) on model runs.
You can apply these settings by passing the usage_limits
argument to the run{_sync,_stream}
functions.
Consider the following example, where we limit the number of response tokens:
from pydantic_ai import Agent
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits
agent = Agent('claude-3-5-sonnet-latest')
result_sync = agent.run_sync(
'What is the capital of Italy? Answer with just the city.',
usage_limits=UsageLimits(response_tokens_limit=10),
)
print(result_sync.data)
#> Rome
print(result_sync.usage())
"""
Usage(requests=1, request_tokens=62, response_tokens=1, total_tokens=63, details=None)
"""
try:
result_sync = agent.run_sync(
'What is the capital of Italy? Answer with a paragraph.',
usage_limits=UsageLimits(response_tokens_limit=10),
)
except UsageLimitExceeded as e:
print(e)
#> Exceeded the response_tokens_limit of 10 (response_tokens=32)
Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling:
from typing_extensions import TypedDict
from pydantic_ai import Agent, ModelRetry
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits
class NeverResultType(TypedDict):
"""
Never ever coerce data to this type.
"""
never_use_this: str
agent = Agent(
'claude-3-5-sonnet-latest',
result_type=NeverResultType,
system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.',
)
@agent.tool_plain(retries=5) # (1)!
def infinite_retry_tool() -> int:
raise ModelRetry('Please try again.')
try:
result_sync = agent.run_sync(
'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3) # (2)!
)
except UsageLimitExceeded as e:
print(e)
#> The next request would exceed the request_limit of 3
- This tool has the ability to retry 5 times before erroring, simulating a tool that might get stuck in a loop.
- This run will error after 3 requests, preventing the infinite tool calling.
!!! note
This is especially relevant if you've registered a lot of tools; request_limit
can be used to prevent the model from choosing to make too many of these calls.
PydanticAI offers a [settings.ModelSettings
][pydantic_ai.settings.ModelSettings] structure to help you fine tune your requests.
This structure allows you to configure common parameters that influence the model's behavior, such as temperature
, max_tokens
,
timeout
, and more.
There are two ways to apply these settings:
- Passing to
run{_sync,_stream}
functions via themodel_settings
argument. This allows for fine-tuning on a per-request basis. - Setting during [
Agent
][pydantic_ai.agent.Agent] initialization via themodel_settings
argument. These settings will be applied by default to all subsequent run calls using said agent. However,model_settings
provided during a specific run call will override the agent's default settings.
For example, if you'd like to set the temperature
setting to 0.0
to ensure less random behavior,
you can do the following:
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o')
result_sync = agent.run_sync(
'What is the capital of Italy?', model_settings={'temperature': 0.0}
)
print(result_sync.data)
#> Rome
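Similarly, a minimal sketch of the second approach, setting the default at agent initialization (model_settings passed to a specific run call would still take precedence):
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o', model_settings={'temperature': 0.0})
result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.data)
#> Rome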
An agent run might represent an entire conversation — there's no limit to how many messages can be exchanged in a single run. However, a conversation might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls.
Here's an example of a conversation comprised of multiple runs:
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o')
# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.data)
#> Albert Einstein was a German-born theoretical physicist.
# Second run, passing previous messages
result2 = agent.run_sync(
'What was his most famous equation?',
message_history=result1.new_messages(), # (1)!
)
print(result2.data)
#> Albert Einstein's most famous equation is (E = mc^2).
- Continue the conversation; without
message_history
the model would not know who "his" was referring to.
(This example is complete, it can be run "as is")
PydanticAI is designed to work well with static type checkers, like mypy and pyright.
!!! tip "Typing is (somewhat) optional" PydanticAI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time.
That said, because PydanticAI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the `result_type` arguments to [`Agent`][pydantic_ai.Agent]) are used at runtime.
We (the library developers) have messed up if type hints are confusing you more than helping you, if you find this, please create an [issue](https://github.com/pydantic/pydantic-ai/issues) explaining what's annoying you!
In particular, agents are generic in both the type of their dependencies and the type of results they return, so you can use the type hints to ensure you're using the right types.
Consider the following script with type mistakes:
from dataclasses import dataclass
from pydantic_ai import Agent, RunContext
@dataclass
class User:
name: str
agent = Agent(
'test',
deps_type=User, # (1)!
result_type=bool,
)
@agent.system_prompt
def add_user_name(ctx: RunContext[str]) -> str: # (2)!
return f"The user's name is {ctx.deps}."
def foobar(x: bytes) -> None:
pass
result = agent.run_sync('Does their name start with "A"?', deps=User('Anne'))
foobar(result.data) # (3)!
- The agent is defined as expecting an instance of
User
asdeps
. - But here
add_user_name
is defined as taking astr
as the dependency, not aUser
. - Since the agent is defined as returning a
bool
, this will raise a type error sincefoobar
expectsbytes
.
Running mypy
on this will give the following output:
➤ uv run mypy type_mistakes.py
type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]" [arg-type]
type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes" [arg-type]
Found 2 errors in 1 file (checked 1 source file)
Running pyright
would identify the same issues.
System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want.
Generally, system prompts fall into two categories:
- Static system prompts: These are known when writing the code and can be defined via the
system_prompt
parameter of the [Agent
constructor][pydantic_ai.Agent.init]. - Dynamic system prompts: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with [
@agent.system_prompt
][pydantic_ai.Agent.system_prompt].
You can add both to a single agent; they're appended in the order they're defined at runtime.
Here's an example using both types of system prompts:
from datetime import date
from pydantic_ai import Agent, RunContext
agent = Agent(
'openai:gpt-4o',
deps_type=str, # (1)!
system_prompt="Use the customer's name while replying to them.", # (2)!
)
@agent.system_prompt # (3)!
def add_the_users_name(ctx: RunContext[str]) -> str:
return f"The user's name is {ctx.deps}."
@agent.system_prompt
def add_the_date() -> str: # (4)!
return f'The date is {date.today()}.'
result = agent.run_sync('What is the date?', deps='Frank')
print(result.data)
#> Hello Frank, the date today is 2032-01-02.
- The agent expects a string dependency.
- Static system prompt defined at agent creation time.
- Dynamic system prompt defined via a decorator with [
RunContext
][pydantic_ai.tools.RunContext], this is called just afterrun_sync
, not when the agent is created, so can benefit from runtime information like the dependencies used on that run. - Another dynamic system prompt, system prompts don't have to have the
RunContext
parameter.
(This example is complete, it can be run "as is")
Validation errors from both function tool parameter validation and structured result validation can be passed back to the model with a request to retry.
You can also raise [ModelRetry
][pydantic_ai.exceptions.ModelRetry] from within a tool or result validator function to tell the model it should retry generating a response.
- The default retry count is 1 but can be altered for the [entire agent][pydantic_ai.Agent.init], a [specific tool][pydantic_ai.Agent.tool], or a [result validator][pydantic_ai.Agent.init].
- You can access the current retry count from within a tool or result validator via [
ctx.retry
][pydantic_ai.tools.RunContext].
Here's an example:
from pydantic import BaseModel
from pydantic_ai import Agent, RunContext, ModelRetry
from fake_database import DatabaseConn
class ChatResult(BaseModel):
user_id: int
message: str
agent = Agent(
'openai:gpt-4o',
deps_type=DatabaseConn,
result_type=ChatResult,
)
@agent.tool(retries=2)
def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int:
"""Get a user's ID from their full name."""
print(name)
#> John
#> John Doe
user_id = ctx.deps.users.get(name=name)
if user_id is None:
raise ModelRetry(
f'No user found with name {name!r}, remember to provide their full name'
)
return user_id
result = agent.run_sync(
'Send a message to John Doe asking for coffee next week', deps=DatabaseConn()
)
print(result.data)
"""
user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!'
"""
If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns 503
), agent runs will raise [UnexpectedModelBehavior
][pydantic_ai.exceptions.UnexpectedModelBehavior].
In these cases, [capture_run_messages
][pydantic_ai.capture_run_messages] can be used to access the messages exchanged during the run to help diagnose the issue.
from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages
agent = Agent('openai:gpt-4o')
@agent.tool_plain
def calc_volume(size: int) -> int: # (1)!
if size == 42:
return size**3
else:
raise ModelRetry('Please try again.')
with capture_run_messages() as messages: # (2)!
try:
result = agent.run_sync('Please get me the volume of a box with size 6.')
except UnexpectedModelBehavior as e:
print('An error occurred:', e)
#> An error occurred: Tool exceeded max retries count of 1
print('cause:', repr(e.__cause__))
#> cause: ModelRetry('Please try again.')
print('messages:', messages)
"""
messages:
[
ModelRequest(
parts=[
UserPromptPart(
content='Please get me the volume of a box with size 6.',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
)
],
kind='request',
),
ModelResponse(
parts=[
ToolCallPart(
tool_name='calc_volume',
args=ArgsDict(args_dict={'size': 6}),
tool_call_id=None,
part_kind='tool-call',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
RetryPromptPart(
content='Please try again.',
tool_name='calc_volume',
tool_call_id=None,
timestamp=datetime.datetime(...),
part_kind='retry-prompt',
)
],
kind='request',
),
ModelResponse(
parts=[
ToolCallPart(
tool_name='calc_volume',
args=ArgsDict(args_dict={'size': 6}),
tool_call_id=None,
part_kind='tool-call',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
else:
print(result.data)
- Define a tool that will raise `ModelRetry` repeatedly in this case.
- [`capture_run_messages`][pydantic_ai.capture_run_messages] is used to capture the messages exchanged during the run.
(This example is complete, it can be run "as is")
!!! note
    If you call [`run`][pydantic_ai.Agent.run], [`run_sync`][pydantic_ai.Agent.run_sync], or [`run_stream`][pydantic_ai.Agent.run_stream] more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only.
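As an illustration only, here's a minimal sketch of that behaviour (assuming an OpenAI API key is configured); only the first call's messages end up in `messages`:

```python
from pydantic_ai import Agent, capture_run_messages

agent = Agent('openai:gpt-4o')

with capture_run_messages() as messages:
    agent.run_sync('What is 2 + 2?')
    agent.run_sync('And 3 + 3?')  # messages from this second run are not captured

# `messages` only reflects the first run_sync call made inside the context
print(messages)
```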
Below are suggestions on how to fix some common errors you might encounter while using PydanticAI. If the issue you're experiencing is not listed below or addressed in the documentation, please feel free to ask in the Pydantic Slack or create an issue on GitHub.
This error is caused by conflicts between the event loops in Jupyter notebook and PydanticAI's. One way to manage these conflicts is by using `nest-asyncio`. Namely, before you execute any agent runs, do the following:
import nest_asyncio
nest_asyncio.apply()
Note: This fix also applies to Google Colab.
If you're running into issues with setting the API key for your model, visit the Models page to learn more about how to set an environment variable and/or pass in an `api_key` argument.
PydanticAI provides access to messages exchanged during an agent run. These messages can be used both to continue a coherent conversation, and to understand how an agent performed.
After running an agent, you can access the messages exchanged during that run from the `result` object.
Both [`RunResult`][pydantic_ai.result.RunResult] (returned by [`Agent.run`][pydantic_ai.Agent.run], [`Agent.run_sync`][pydantic_ai.Agent.run_sync]) and [`StreamedRunResult`][pydantic_ai.result.StreamedRunResult] (returned by [`Agent.run_stream`][pydantic_ai.Agent.run_stream]) have the following methods:
- [`all_messages()`][pydantic_ai.result.RunResult.all_messages]: returns all messages, including messages from prior runs. There's also a variant that returns JSON bytes, [`all_messages_json()`][pydantic_ai.result.RunResult.all_messages_json].
- [`new_messages()`][pydantic_ai.result.RunResult.new_messages]: returns only the messages from the current run. There's also a variant that returns JSON bytes, [`new_messages_json()`][pydantic_ai.result.RunResult.new_messages_json].
!!! info "StreamedRunResult and complete messages"
    On [`StreamedRunResult`][pydantic_ai.result.StreamedRunResult], the messages returned from these methods will only include the final result message once the stream has finished.
    E.g. you've awaited one of the following coroutines:
    * [`StreamedRunResult.stream()`][pydantic_ai.result.StreamedRunResult.stream]
    * [`StreamedRunResult.stream_text()`][pydantic_ai.result.StreamedRunResult.stream_text]
    * [`StreamedRunResult.stream_structured()`][pydantic_ai.result.StreamedRunResult.stream_structured]
    * [`StreamedRunResult.get_data()`][pydantic_ai.result.StreamedRunResult.get_data]
    **Note:** The final result message will NOT be added to result messages if you use [`.stream_text(delta=True)`][pydantic_ai.result.StreamedRunResult.stream_text] since in this case the result content is never built as one string.
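For illustration only, a minimal sketch of delta streaming (assuming an OpenAI API key is configured); because `delta=True` is used, the final text never appears in the result messages:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    async with agent.run_stream('Tell me a joke.') as result:
        # each item is only the newly streamed text, not the accumulated string
        async for delta in result.stream_text(delta=True):
            print(delta, end='')
        # since delta=True was used, the final text is not added to result.all_messages()
```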
Example of accessing methods on a [`RunResult`][pydantic_ai.result.RunResult]:
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')
result = agent.run_sync('Tell me a joke.')
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
# all messages from the run
print(result.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content='Be a helpful assistant.', part_kind='system-prompt'
),
UserPromptPart(
content='Tell me a joke.',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
content='Did you hear about the toothpaste scandal? They called it Colgate.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
(This example is complete, it can be run "as is")
Example of accessing methods on a [`StreamedRunResult`][pydantic_ai.result.StreamedRunResult]:
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')
async def main():
async with agent.run_stream('Tell me a joke.') as result:
# incomplete messages before the stream finishes
print(result.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content='Be a helpful assistant.', part_kind='system-prompt'
),
UserPromptPart(
content='Tell me a joke.',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
)
]
"""
async for text in result.stream():
print(text)
#> Did you hear
#> Did you hear about the toothpaste
#> Did you hear about the toothpaste scandal? They called
#> Did you hear about the toothpaste scandal? They called it Colgate.
# complete messages once the stream finishes
print(result.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content='Be a helpful assistant.', part_kind='system-prompt'
),
UserPromptPart(
content='Tell me a joke.',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
content='Did you hear about the toothpaste scandal? They called it Colgate.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
(This example is complete, it can be run "as is")
The primary use of message histories in PydanticAI is to maintain context across multiple agent runs.
To use existing messages in a run, pass them to the `message_history` parameter of [`Agent.run`][pydantic_ai.Agent.run], [`Agent.run_sync`][pydantic_ai.Agent.run_sync] or [`Agent.run_stream`][pydantic_ai.Agent.run_stream].
If `message_history` is set and not empty, a new system prompt is not generated, since we assume the existing message history includes a system prompt.
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')
result1 = agent.run_sync('Tell me a joke.')
print(result1.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
result2 = agent.run_sync('Explain?', message_history=result1.new_messages())
print(result2.data)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
print(result2.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content='Be a helpful assistant.', part_kind='system-prompt'
),
UserPromptPart(
content='Tell me a joke.',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
content='Did you hear about the toothpaste scandal? They called it Colgate.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
UserPromptPart(
content='Explain?',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
)
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
(This example is complete, it can be run "as is")
Since messages are defined by simple dataclasses, you can manually create and manipulate them, e.g. for testing (a sketch of this follows the example below).
The message format is independent of the model used, so you can use messages in different agents, or the same agent with different models.
from pydantic_ai import Agent
agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')
result1 = agent.run_sync('Tell me a joke.')
print(result1.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
result2 = agent.run_sync(
'Explain?', model='gemini-1.5-pro', message_history=result1.new_messages()
)
print(result2.data)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
print(result2.all_messages())
"""
[
ModelRequest(
parts=[
SystemPromptPart(
content='Be a helpful assistant.', part_kind='system-prompt'
),
UserPromptPart(
content='Tell me a joke.',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
),
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
content='Did you hear about the toothpaste scandal? They called it Colgate.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
ModelRequest(
parts=[
UserPromptPart(
content='Explain?',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
)
],
kind='request',
),
ModelResponse(
parts=[
TextPart(
content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.',
part_kind='text',
)
],
timestamp=datetime.datetime(...),
kind='response',
),
]
"""
For a more complete example of using messages in conversations, see the chat app example.
Applications that use LLMs have some challenges that are well known and understood: LLMs are slow, unreliable and expensive.
These applications also have some challenges that most developers have encountered much less often: LLMs are fickle and non-deterministic. Subtle changes in a prompt can completely change a model's performance, and there's no `EXPLAIN` query you can run to understand why.
!!! danger "Warning"
    From a software engineer's point of view, you can think of LLMs as the worst database you've ever heard of, but worse.
    If LLMs weren't so bloody useful, we'd never touch them.
To build successful applications with LLMs, we need new tools to understand both model performance, and the behavior of applications that rely on them.
LLM Observability tools that just let you understand how your model is performing are useless: making API calls to an LLM is easy; it's building that into an application that's hard.
Pydantic Logfire is an observability platform developed by the team who created and maintain Pydantic and PydanticAI. Logfire aims to let you understand your entire application: Gen AI, classic predictive AI, HTTP traffic, database queries and everything else a modern application needs.
!!! tip "Pydantic Logfire is a commercial product"
    Logfire is a commercially supported, hosted platform with an extremely generous and perpetual free tier. You can sign up and start using Logfire in a couple of minutes.
PydanticAI has built-in (but optional) support for Logfire via the `logfire-api` no-op package.
That means if the `logfire` package is installed and configured, detailed information about agent runs is sent to Logfire. But if the `logfire` package is not installed, there's virtually no overhead and nothing is sent.
Here's an example showing details of running the Weather Agent in Logfire:
To use logfire, you'll need a logfire account, and logfire installed:
pip/uv-add 'pydantic-ai[logfire]'
Then authenticate your local environment with logfire:
py-cli logfire auth
And configure a project to send data to:
py-cli logfire projects new
(Or use an existing project with `logfire projects use`)
The last step is to add logfire to your code:
import logfire
logfire.configure()
The logfire documentation has more details on how to use logfire, including how to instrument other libraries like Pydantic, HTTPX and FastAPI.
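As a rough sketch of what such instrumentation can look like (the `instrument_fastapi` helper is assumed here; check the Logfire docs for the exact integration calls):

```python
import logfire
from fastapi import FastAPI

logfire.configure()

app = FastAPI()
# assumed instrumentation helper: record FastAPI request spans alongside agent runs in Logfire
logfire.instrument_fastapi(app)
```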
Since Logfire is built on OpenTelemetry, you can use the Logfire Python SDK to send data to any OpenTelemetry collector.
Once you have logfire set up, there are two primary ways it can help you understand your application:
- Debugging — Using the live view to see what's happening in your application in real-time.
- Monitoring — Using SQL and dashboards to observe the behavior of your application; Logfire is effectively a SQL database that stores information about how your application is running.
To demonstrate how Logfire can let you visualise the flow of a PydanticAI run, here's the view you get from Logfire while running the chat app examples:
{{ video('a764aff5840534dc77eba7d028707bfa', 25) }}
We can also query data with SQL in Logfire to monitor the performance of an application. Here's a real world example of using Logfire to monitor PydanticAI runs inside Logfire itself:
::: pydantic_ai.tools
::: pydantic_ai.settings
    options:
      inherited_members: true
      members:
        - ModelSettings
        - UsageLimits
::: pydantic_ai.format_as_xml
::: pydantic_ai.exceptions
The structure of [`ModelMessage`][pydantic_ai.messages.ModelMessage] can be shown as a graph:
graph RL
SystemPromptPart(SystemPromptPart) --- ModelRequestPart
UserPromptPart(UserPromptPart) --- ModelRequestPart
ToolReturnPart(ToolReturnPart) --- ModelRequestPart
RetryPromptPart(RetryPromptPart) --- ModelRequestPart
TextPart(TextPart) --- ModelResponsePart
ToolCallPart(ToolCallPart) --- ModelResponsePart
ModelRequestPart("ModelRequestPart<br>(Union)") --- ModelRequest
ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage
ModelResponsePart("ModelResponsePart<br>(Union)") --- ModelResponse
ModelResponse("ModelResponse(parts=list[...])") --- ModelMessage("ModelMessage<br>(Union)")
::: pydantic_ai.messages
::: pydantic_ai.agent
::: pydantic_ai.result
    options:
      inherited_members: true
::: pydantic_ai.usage
For details on how to set up authentication with this model, see model configuration for OpenAI.
::: pydantic_ai.models.openai
Custom interface to the `*-aiplatform.googleapis.com` API for Gemini models.
This model uses [`GeminiAgentModel`][pydantic_ai.models.gemini.GeminiAgentModel] with just the URL and auth method changed from [`GeminiModel`][pydantic_ai.models.gemini.GeminiModel]; it relies on the VertexAI `generateContent` and `streamGenerateContent` function endpoints having the same schemas as the equivalent [Gemini endpoints][pydantic_ai.models.gemini.GeminiModel].
For details on how to set up authentication with this model, as well as a comparison with the `generativelanguage.googleapis.com` API used by [`GeminiModel`][pydantic_ai.models.gemini.GeminiModel], see model configuration for Gemini via VertexAI.
With the default google project already configured in your environment using "application default credentials":
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel('gemini-1.5-flash')
agent = Agent(model)
result = agent.run_sync('Tell me a joke.')
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
Or using a service account JSON file:
from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel
model = VertexAIModel(
'gemini-1.5-flash',
service_account_file='path/to/service-account.json',
)
agent = Agent(model)
result = agent.run_sync('Tell me a joke.')
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
::: pydantic_ai.models.vertexai
Custom interface to the `generativelanguage.googleapis.com` API using HTTPX and Pydantic.
The Google SDK for interacting with the `generativelanguage.googleapis.com` API, `google-generativeai`, reads like it was written by a Java developer who thought they knew everything about OOP, spent 30 minutes trying to learn Python, gave up and decided to build the library to prove how horrible Python is. It also doesn't use httpx for HTTP requests, and tries to implement tool calling itself, but doesn't use Pydantic or equivalent for validation.
We therefore implement support for the API directly.
Despite these shortcomings, the Gemini model is actually quite powerful and very fast.
For details on how to set up authentication with this model, see model configuration for Gemini.
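For illustration, a minimal sketch of using `GeminiModel` directly, mirroring the VertexAI example above (this assumes `GEMINI_API_KEY` is set in your environment):

```python
from pydantic_ai import Agent
from pydantic_ai.models.gemini import GeminiModel

# assumes GEMINI_API_KEY is set in the environment
model = GeminiModel('gemini-1.5-flash')
agent = Agent(model)
result = agent.run_sync('Tell me a joke.')
print(result.data)
```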
::: pydantic_ai.models.gemini
For details on how to set up authentication with this model, see model configuration for Mistral.
::: pydantic_ai.models.mistral
For details on how to set up authentication with this model, see model configuration for Anthropic.
::: pydantic_ai.models.anthropic
For details on how to set up authentication with this model, see model configuration for Groq.
::: pydantic_ai.models.groq
A model controlled by a local function.
[`FunctionModel`][pydantic_ai.models.function.FunctionModel] is similar to `TestModel`, but allows greater control over the model's behavior.
Its primary use case is for more advanced unit testing than is possible with `TestModel`.
Here's a minimal example:
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse
from pydantic_ai.models.function import FunctionModel, AgentInfo
my_agent = Agent('openai:gpt-4o')
async def model_function(
messages: list[ModelMessage], info: AgentInfo
) -> ModelResponse:
print(messages)
"""
[
ModelRequest(
parts=[
UserPromptPart(
content='Testing my agent...',
timestamp=datetime.datetime(...),
part_kind='user-prompt',
)
],
kind='request',
)
]
"""
print(info)
"""
AgentInfo(
function_tools=[], allow_text_result=True, result_tools=[], model_settings=None
)
"""
return ModelResponse.from_text('hello world')
async def test_my_agent():
"""Unit test for my_agent, to be run by pytest."""
with my_agent.override(model=FunctionModel(model_function)):
result = await my_agent.run('Testing my agent...')
assert result.data == 'hello world'
See Unit testing with `FunctionModel` for detailed documentation.
::: pydantic_ai.models.function
Utility model for quickly testing apps built with PydanticAI.
Here's a minimal example:
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel
my_agent = Agent('openai:gpt-4o', system_prompt='...')
async def test_my_agent():
"""Unit test for my_agent, to be run by pytest."""
m = TestModel()
with my_agent.override(model=m):
result = await my_agent.run('Testing my agent...')
assert result.data == 'success (no tool calls)'
assert m.agent_model_function_tools == []
See Unit testing with `TestModel` for detailed documentation.
::: pydantic_ai.models.test
::: pydantic_ai.models
    options:
      members:
        - KnownModelName
        - Model
        - AgentModel
        - AbstractToolDefinition
        - StreamTextResponse
        - StreamStructuredResponse
        - ALLOW_MODEL_REQUESTS
        - check_allow_model_requests
        - override_allow_model_requests
For details on how to set up authentication with this model, see model configuration for Ollama.
With `ollama` installed, you can run the server with the model you want to use:
ollama run llama3.2
(this will pull the `llama3.2` model if you don't already have it downloaded)
Then run your code; here's a minimal example:
from pydantic import BaseModel
from pydantic_ai import Agent
class CityLocation(BaseModel):
city: str
country: str
agent = Agent('ollama:llama3.2', result_type=CityLocation)
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.usage())
"""
Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65, details=None)
"""
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel
ollama_model = OllamaModel(
model_name='qwen2.5-coder:7b', # (1)!
base_url='http://192.168.1.74:11434/v1', # (2)!
)
class CityLocation(BaseModel):
city: str
country: str
agent = Agent(model=ollama_model, result_type=CityLocation)
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.usage())
"""
Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65, details=None)
"""
- The name of the model running on the remote server
- The URL of the remote server
See [`OllamaModel`][pydantic_ai.models.ollama.OllamaModel] for more information.
::: pydantic_ai.models.ollama
Simple chat app example built with FastAPI.
Demonstrates:
This demonstrates storing chat history between requests and using it to give the model context for new responses.
Most of the complex logic here is between `chat_app.py`, which streams the response to the browser, and `chat_app.ts`, which renders messages in the browser.
With dependencies installed and environment variables set, run:
python/uv-run -m pydantic_ai_examples.chat_app
Then open the app at localhost:8000.
TODO screenshot.
Python code that runs the chat app:
#! examples/pydantic_ai_examples/chat_app.py
Simple HTML page to render the app:
#! examples/pydantic_ai_examples/chat_app.html
TypeScript to handle rendering the messages; to keep this simple (and at the risk of offending frontend developers), the TypeScript code is passed to the browser as plain text and transpiled in the browser.
#! examples/pydantic_ai_examples/chat_app.ts
Small but complete example of using PydanticAI to build a support agent for a bank.
Demonstrates:
With dependencies installed and environment variables set, run:
python/uv-run -m pydantic_ai_examples.bank_support
(or `PYDANTIC_AI_MODEL=gemini-1.5-flash ...`)
#! examples/pydantic_ai_examples/bank_support.py
Example of PydanticAI with multiple tools which the LLM needs to call in turn to answer a question.
Demonstrates:
In this case the idea is a "weather" agent — the user can ask for the weather in multiple locations; the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather for those locations.
To run this example properly, you might want to add two extra API keys (note: if either key is missing, the code falls back to dummy data, so they're not required):
- A weather API key from tomorrow.io, set via `WEATHER_API_KEY`
- A geocoding API key from geocode.maps.co, set via `GEO_API_KEY`
With dependencies installed and environment variables set, run:
python/uv-run -m pydantic_ai_examples.weather_agent
#! examples/pydantic_ai_examples/weather_agent.py
Example demonstrating how to use PydanticAI to generate SQL queries based on user input.
Demonstrates:
The resulting SQL is validated by running it as an `EXPLAIN` query on PostgreSQL. To run the example, you first need to run PostgreSQL, e.g. via Docker:
docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres
(we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running)
With dependencies installed and environment variables set, run:
python/uv-run -m pydantic_ai_examples.sql_gen
or to use a custom prompt:
python/uv-run -m pydantic_ai_examples.sql_gen "find me errors"
This example uses `gemini-1.5-flash` by default, since Gemini is good at single-shot queries of this kind.
#! examples/pydantic_ai_examples/sql_gen.py
Example of a multi-agent flow where one agent delegates work to another, then hands off control to a third agent.
Demonstrates:
In this scenario, a group of agents work together to find the best flight for a user.
The control flow for this example can be summarised as follows:
graph TD
START --> search_agent("search agent")
search_agent --> extraction_agent("extraction agent")
extraction_agent --> search_agent
search_agent --> human_confirm("human confirm")
human_confirm --> search_agent
search_agent --> FAILED
human_confirm --> find_seat_function("find seat function")
find_seat_function --> human_seat_choice("human seat choice")
human_seat_choice --> find_seat_agent("find seat agent")
find_seat_agent --> find_seat_function
find_seat_function --> buy_flights("buy flights")
buy_flights --> SUCCESS
With dependencies installed and environment variables set, run:
python/uv-run -m pydantic_ai_examples.flight_booking
#! examples/pydantic_ai_examples/flight_booking.py
Information about whales — an example of streamed structured response validation.
Demonstrates:
This script streams structured responses from GPT-4 about whales, validates the data, and displays it as a dynamic table using `rich` as the data is received.
With dependencies installed and environment variables set, run:
python/uv-run -m pydantic_ai_examples.stream_whales
Should give an output like this:
{{ video('53dd5e7664c20ae90ed90ae42f606bf3', 25) }}
#! examples/pydantic_ai_examples/stream_whales.py
Examples of how to use PydanticAI and what it can do.
These examples are distributed with `pydantic-ai`, so you can run them either by cloning the pydantic-ai repo or by simply installing `pydantic-ai` from PyPI with `pip` or `uv`.
Either way you'll need to install extra dependencies to run some examples; you just need to install the `examples` optional dependency group.
If you've installed `pydantic-ai` via pip/uv, you can install the extra dependencies with:
pip/uv-add 'pydantic-ai[examples]'
If you clone the repo, you should instead use `uv sync --extra examples` to install extra dependencies.
These examples will need you to set up authentication with one or more of the LLMs, see the model configuration docs for details on how to do this.
TL;DR: in most cases you'll need to set one of the following environment variables:
=== "OpenAI"
```bash
export OPENAI_API_KEY=your-api-key
```
=== "Google Gemini"
```bash
export GEMINI_API_KEY=your-api-key
```
To run the examples (this will work whether you installed `pydantic_ai`, or cloned the repo), run:
python/uv-run -m pydantic_ai_examples.<example_module_name>
For example, to run the very simple `pydantic_model` example:
python/uv-run -m pydantic_ai_examples.pydantic_model
If you like one-liners and you're using uv, you can run a pydantic-ai example with zero setup:
OPENAI_API_KEY='your-api-key' \
uv run --with 'pydantic-ai[examples]' \
-m pydantic_ai_examples.pydantic_model
You'll probably want to edit examples in addition to just running them. You can copy the examples to a new directory with:
python/uv-run -m pydantic_ai_examples --copy-to examples/
Simple example of using PydanticAI to construct a Pydantic model from a text input.
Demonstrates:
With dependencies installed and environment variables set, run:
python/uv-run -m pydantic_ai_examples.pydantic_model
This example uses `openai:gpt-4o` by default, but it works well with other models; e.g. you can run it with Gemini using:
PYDANTIC_AI_MODEL=gemini-1.5-pro python/uv-run -m pydantic_ai_examples.pydantic_model
(or `PYDANTIC_AI_MODEL=gemini-1.5-flash ...`)
#! examples/pydantic_ai_examples/pydantic_model.py
RAG search example. This demo allows you to ask questions of the logfire documentation.
Demonstrates:
- tools
- agent dependencies
- RAG search
This is done by creating a database containing each section of the markdown documentation, then registering the search tool with the PydanticAI agent.
Logic for extracting sections from markdown files and a JSON file with that data is available in this gist.
PostgreSQL with pgvector is used as the search database; the easiest way to download and run pgvector is using Docker:
mkdir postgres-data
docker run --rm \
-e POSTGRES_PASSWORD=postgres \
-p 54320:5432 \
-v `pwd`/postgres-data:/var/lib/postgresql/data \
pgvector/pgvector:pg17
As with the SQL gen example, we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running.
We also mount the PostgreSQL `data` directory locally to persist the data if you need to stop and restart the container.
With that running, dependencies installed, and environment variables set, we can build the search database with (WARNING: this requires the `OPENAI_API_KEY` env variable and will call the OpenAI embedding API around 300 times to generate embeddings for each section of the documentation):
python/uv-run -m pydantic_ai_examples.rag build
(Note: building the database doesn't use PydanticAI right now; instead it uses the OpenAI SDK directly.)
You can then ask the agent a question with:
python/uv-run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?"
#! examples/pydantic_ai_examples/rag.py
This example shows how to stream markdown from an agent, using the `rich` library to highlight the output in the terminal.
It'll run the example with both OpenAI and Google Gemini models if the required environment variables are set.
Demonstrates:
With dependencies installed and environment variables set, run:
python/uv-run -m pydantic_ai_examples.stream_markdown
#! examples/pydantic_ai_examples/stream_markdown.py