Documentation for pydantic-ai

Repository Overview

Directory Structure

pydantic-ai
├── LICENSE
├── Makefile
├── README.md
├── docs
│   ├── _worker.js
│   ├── agents.md
│   ├── api
│   │   ├── agent.md
│   │   ├── exceptions.md
│   │   ├── format_as_xml.md
│   │   ├── messages.md
│   │   ├── models
│   │   │   ├── anthropic.md
│   │   │   ├── base.md
│   │   │   ├── function.md
│   │   │   ├── gemini.md
│   │   │   ├── groq.md
│   │   │   ├── mistral.md
│   │   │   ├── ollama.md
│   │   │   ├── openai.md
│   │   │   ├── test.md
│   │   │   └── vertexai.md
│   │   ├── result.md
│   │   ├── settings.md
│   │   ├── tools.md
│   │   └── usage.md
│   ├── contributing.md
│   ├── dependencies.md
│   ├── examples
│   │   ├── bank-support.md
│   │   ├── chat-app.md
│   │   ├── flight-booking.md
│   │   ├── index.md
│   │   ├── pydantic-model.md
│   │   ├── rag.md
│   │   ├── sql-gen.md
│   │   ├── stream-markdown.md
│   │   ├── stream-whales.md
│   │   └── weather-agent.md
│   ├── extra
│   │   └── tweaks.css
│   ├── favicon.ico
│   ├── help.md
│   ├── img
│   │   ├── logfire-monitoring-pydanticai.png
│   │   ├── logfire-weather-agent.png
│   │   ├── logo-white.svg
│   │   ├── pydantic-ai-dark.svg
│   │   └── pydantic-ai-light.svg
│   ├── index.md
│   ├── install.md
│   ├── logfire.md
│   ├── message-history.md
│   ├── models.md
│   ├── multi-agent-applications.md
│   ├── results.md
│   ├── testing-evals.md
│   ├── tools.md
│   └── troubleshooting.md
├── examples
│   ├── README.md
│   ├── pydantic_ai_examples
│   │   ├── __main__.py
│   │   ├── bank_support.py
│   │   ├── chat_app.html
│   │   ├── chat_app.py
│   │   ├── chat_app.ts
│   │   ├── flight_booking.py
│   │   ├── pydantic_model.py
│   │   ├── rag.py
│   │   ├── roulette_wheel.py
│   │   ├── sql_gen.py
│   │   ├── stream_markdown.py
│   │   ├── stream_whales.py
│   │   └── weather_agent.py
│   └── pyproject.toml
├── mkdocs.insiders.yml
├── mkdocs.yml
├── pydantic_ai_slim
│   ├── README.md
│   ├── pydantic_ai
│   │   ├── __init__.py
│   │   ├── _griffe.py
│   │   ├── _pydantic.py
│   │   ├── _result.py
│   │   ├── _system_prompt.py
│   │   ├── _utils.py
│   │   ├── agent.py
│   │   ├── exceptions.py
│   │   ├── format_as_xml.py
│   │   ├── messages.py
│   │   ├── models
│   │   │   ├── __init__.py
│   │   │   ├── anthropic.py
│   │   │   ├── function.py
│   │   │   ├── gemini.py
│   │   │   ├── groq.py
│   │   │   ├── mistral.py
│   │   │   ├── ollama.py
│   │   │   ├── openai.py
│   │   │   ├── test.py
│   │   │   └── vertexai.py
│   │   ├── py.typed
│   │   ├── result.py
│   │   ├── settings.py
│   │   ├── tools.py
│   │   └── usage.py
│   └── pyproject.toml
├── pyproject.toml
├── requirements.txt
├── tests
│   ├── __init__.py
│   ├── conftest.py
│   ├── example_modules
│   │   ├── README.md
│   │   ├── bank_database.py
│   │   ├── fake_database.py
│   │   └── weather_service.py
│   ├── import_examples.py
│   ├── models
│   │   ├── __init__.py
│   │   ├── test_anthropic.py
│   │   ├── test_gemini.py
│   │   ├── test_groq.py
│   │   ├── test_mistral.py
│   │   ├── test_model.py
│   │   ├── test_model_function.py
│   │   ├── test_model_test.py
│   │   ├── test_ollama.py
│   │   ├── test_openai.py
│   │   └── test_vertexai.py
│   ├── test_agent.py
│   ├── test_deps.py
│   ├── test_examples.py
│   ├── test_format_as_xml.py
│   ├── test_live.py
│   ├── test_logfire.py
│   ├── test_streaming.py
│   ├── test_tools.py
│   ├── test_usage_limits.py
│   ├── test_utils.py
│   └── typed_agent.py
├── uprev.py
└── uv.lock

README Files

output/repo_parser/github_repos/pydantic/pydantic-ai/README.md

Agent Framework / shim to use Pydantic with LLMs

Documentation: ai.pydantic.dev


PydanticAI is a Python agent framework designed to make it less painful to build production-grade applications with Generative AI.

FastAPI revolutionized web development by offering an innovative and ergonomic design, built on the foundation of Pydantic.

Similarly, virtually every agent framework and LLM library in Python uses Pydantic, yet when we began to use LLMs in Pydantic Logfire, we couldn't find anything that gave us the same feeling.

We built PydanticAI with one simple aim: to bring that FastAPI feeling to GenAI app development.

Why use PydanticAI

  • Built by the Pydantic Team Built by the team behind Pydantic (the validation layer of the OpenAI SDK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more).

  • Model-agnostic Supports OpenAI, Anthropic, Gemini, Ollama, Groq, and Mistral, and there is a simple interface to implement support for other models.

  • Pydantic Logfire Integration Seamlessly integrates with Pydantic Logfire for real-time debugging, performance monitoring, and behavior tracking of your LLM-powered applications.

  • Type-safe Designed to make type checking as useful as possible for you, so it integrates well with static type checkers, like mypy and pyright.

  • Python-centric Design Leverages Python’s familiar control flow and agent composition to build your AI-driven projects, making it easy to apply standard Python best practices you'd use in any other (non-AI) project.

  • Structured Responses Harnesses the power of Pydantic to validate and structure model outputs, ensuring responses are consistent across runs.

  • Dependency Injection System Offers an optional dependency injection system to provide data and services to your agent's system prompts, tools and result validators. This is useful for testing and eval-driven iterative development.

  • Streamed Responses Provides the ability to stream LLM outputs continuously, with immediate validation, ensuring rapid and accurate results.
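
As a rough illustration of that last point, here is a minimal streaming sketch (not part of the original README; it assumes the run_stream / stream_text API described in the results docs and a configured OpenAI key):

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be concise, reply with one sentence.')


async def main():
    # Stream the response as it is generated; the text is validated as it arrives.
    async with agent.run_stream('What is streaming?') as response:
        async for text in response.stream_text():
            print(text)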

In Beta!

PydanticAI is in early beta: the API is still subject to change and there's a lot more to do. Feedback is very welcome!

Hello World Example

Here's a minimal example of PydanticAI:

from pydantic_ai import Agent

# Define a very simple agent, including the model to use; you can also set the model when running the agent.
agent = Agent(
    'gemini-1.5-flash',
    # Register a static system prompt using a keyword argument to the agent.
    # For more complex dynamically-generated system prompts, see the example below.
    system_prompt='Be concise, reply with one sentence.',
)

# Run the agent synchronously, conducting a conversation with the LLM.
# Here the exchange should be very short: PydanticAI will send the system prompt and the user query to the LLM,
# and the model will return a text response. See below for a more complex run.
result = agent.run_sync('Where does "hello world" come from?')
print(result.data)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""

(This example is complete, it can be run "as is")

Not very interesting yet, but we can easily add "tools", dynamic system prompts, and structured responses to build more powerful agents.

Tools & Dependency Injection Example

Here is a concise example using PydanticAI to build a support agent for a bank:

(Better documented example in the docs)

from dataclasses import dataclass

from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext

from bank_database import DatabaseConn


# SupportDependencies is used to pass in the data, connections, and logic that will be needed when running
# system prompt and tool functions. Dependency injection provides a type-safe way to customise the behavior of your agents.
@dataclass
class SupportDependencies:
    customer_id: int
    db: DatabaseConn


# This pydantic model defines the structure of the result returned by the agent.
class SupportResult(BaseModel):
    support_advice: str = Field(description='Advice returned to the customer')
    block_card: bool = Field(description="Whether to block the customer's card")
    risk: int = Field(description='Risk level of query', ge=0, le=10)


# This agent will act as first-tier support in a bank.
# Agents are generic in the type of dependencies they accept and the type of result they return.
# In this case, the support agent has type `Agent[SupportDependencies, SupportResult]`.
support_agent = Agent(
    'openai:gpt-4o',
    deps_type=SupportDependencies,
    # The response from the agent will be guaranteed to be a SupportResult,
    # if validation fails the agent is prompted to try again.
    result_type=SupportResult,
    system_prompt=(
        'You are a support agent in our bank, give the '
        'customer support and judge the risk level of their query.'
    ),
)


# Dynamic system prompts can make use of dependency injection.
# Dependencies are carried via the `RunContext` argument, which is parameterized with the `deps_type` from above.
# If the type annotation here is wrong, static type checkers will catch it.
@support_agent.system_prompt
async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str:
    customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id)
    return f"The customer's name is {customer_name!r}"


# `tool` lets you register functions which the LLM may call while responding to a user.
# Again, dependencies are carried via `RunContext`; any other arguments become the tool schema passed to the LLM.
# Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry.
@support_agent.tool
async def customer_balance(
    ctx: RunContext[SupportDependencies], include_pending: bool
) -> float:
    """Returns the customer's current account balance."""
    # The docstring of a tool is also passed to the LLM as the description of the tool.
    # Parameter descriptions are extracted from the docstring and added to the parameter schema sent to the LLM.
    balance = await ctx.deps.db.customer_balance(
        id=ctx.deps.customer_id,
        include_pending=include_pending,
    )
    return balance


...  # In a real use case, you'd add more tools and a longer system prompt


async def main():
    deps = SupportDependencies(customer_id=123, db=DatabaseConn())
    # Run the agent asynchronously, conducting a conversation with the LLM until a final response is reached.
    # Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve a result.
    result = await support_agent.run('What is my balance?', deps=deps)
    # The result will be validated with Pydantic to guarantee it is a `SupportResult`, since the agent is generic,
    # it'll also be typed as a `SupportResult` to aid with static type checking.
    print(result.data)
    """
    support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1
    """

    result = await support_agent.run('I just lost my card!', deps=deps)
    print(result.data)
    """
    support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8
    """

Next Steps

To try PydanticAI yourself, follow the instructions in the examples.

Read the docs to learn more about building applications with PydanticAI.

Read the API Reference to understand PydanticAI's interface.

output/repo_parser/github_repos/pydantic/pydantic-ai/pydantic_ai_slim/README.md

PydanticAI Slim


PydanticAI core logic with minimal required dependencies.

For more information on how to use this package see ai.pydantic.dev/install.

output/repo_parser/github_repos/pydantic/pydantic-ai/tests/example_modules/README.md

docs examples imports

This directory is added to sys.path in tests/test_examples.py::test_docs_examples to augment some of the examples.

output/repo_parser/github_repos/pydantic/pydantic-ai/examples/README.md

PydanticAI Examples


Examples of how to use PydanticAI and what it can do.

For full documentation of these examples and how to run them, see ai.pydantic.dev/examples/.

Documentation

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/install.md

Installation

PydanticAI is available on PyPI as pydantic-ai so installation is as simple as:

pip install pydantic-ai  # or: uv add pydantic-ai

(Requires Python 3.9+)

This installs the pydantic_ai package, core dependencies, and libraries required to use all the models included in PydanticAI. If you want to use a specific model, you can install the "slim" version of PydanticAI.

Use with Pydantic Logfire

PydanticAI has an excellent (but completely optional) integration with Pydantic Logfire to help you view and understand agent runs.

To use Logfire with PydanticAI, install pydantic-ai or pydantic-ai-slim with the logfire optional group:

pip install 'pydantic-ai[logfire]'  # or: uv add 'pydantic-ai[logfire]'

From there, follow the Logfire setup docs to configure Logfire.
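
As a minimal sketch (assuming you have already created a Logfire project and authenticated locally, per the Logfire setup docs), configuration typically amounts to:

import logfire

# Assumes local Logfire credentials and a project already exist.
logfire.configure()

Agent runs should then show up in your Logfire project; check the Logfire documentation for the exact instrumentation steps for your version.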

Running Examples

We distribute the pydantic_ai_examples directory as a separate PyPI package (pydantic-ai-examples) to make examples extremely easy to customize and run.

To install examples, use the examples optional group:

pip install 'pydantic-ai[examples]'  # or: uv add 'pydantic-ai[examples]'

To run the examples, follow instructions in the examples docs.
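
For instance (assuming the examples package is installed as above), each example is a module under pydantic_ai_examples and can typically be run with python -m:

python -m pydantic_ai_examples.pydantic_model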

Slim Install

If you know which model you're going to use and want to avoid installing superfluous packages, you can use the pydantic-ai-slim package. For example, if you're using just [OpenAIModel][pydantic_ai.models.openai.OpenAIModel], you would run:

pip install 'pydantic-ai-slim[openai]'  # or: uv add 'pydantic-ai-slim[openai]'

See the models documentation for information on which optional dependencies are required for each model.

You can also install dependencies for multiple models and use cases, for example:

pip install 'pydantic-ai-slim[openai,vertexai,logfire]'  # or: uv add 'pydantic-ai-slim[openai,vertexai,logfire]'

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/tools.md

Function Tools

Function tools provide a mechanism for models to retrieve extra information to help them generate a response.

They're useful when it is impractical or impossible to put all the context an agent might need into the system prompt, or when you want to make agents' behavior more deterministic or reliable by deferring some of the logic required to generate a response to another (not necessarily AI-powered) tool.

!!! info "Function tools vs. RAG" Function tools are basically the "R" of RAG (Retrieval-Augmented Generation) — they augment what the model can do by letting it request extra information.

The main semantic difference between PydanticAI Tools and RAG is RAG is synonymous with vector search, while PydanticAI tools are more general-purpose. (Note: we may add support for vector search functionality in the future, particularly an API for generating embeddings. See [#58](https://github.com/pydantic/pydantic-ai/issues/58))

There are a number of ways to register tools with an agent:

  • via the [@agent.tool][pydantic_ai.Agent.tool] decorator — for tools that need access to the agent [context][pydantic_ai.tools.RunContext]
  • via the [@agent.tool_plain][pydantic_ai.Agent.tool_plain] decorator — for tools that do not need access to the agent [context][pydantic_ai.tools.RunContext]
  • via the [tools][pydantic_ai.Agent.__init__] keyword argument to Agent, which can take either plain functions or instances of [Tool][pydantic_ai.tools.Tool]

@agent.tool is considered the default decorator since in the majority of cases tools will need access to the agent context.

Here's an example using both:

import random

from pydantic_ai import Agent, RunContext

agent = Agent(
    'gemini-1.5-flash',  # (1)!
    deps_type=str,  # (2)!
    system_prompt=(
        "You're a dice game, you should roll the die and see if the number "
        "you get back matches the user's guess. If so, tell them they're a winner. "
        "Use the player's name in the response."
    ),
)


@agent.tool_plain  # (3)!
def roll_die() -> str:
    """Roll a six-sided die and return the result."""
    return str(random.randint(1, 6))


@agent.tool  # (4)!
def get_player_name(ctx: RunContext[str]) -> str:
    """Get the player's name."""
    return ctx.deps


dice_result = agent.run_sync('My guess is 4', deps='Anne')  # (5)!
print(dice_result.data)
#> Congratulations Anne, you guessed correctly! You're a winner!
  1. This is a pretty simple task, so we can use the fast and cheap Gemini flash model.
  2. We pass the user's name as the dependency; to keep things simple we use just the name as a string.
  3. This tool doesn't need any context; it just returns a random number. You could probably use a dynamic system prompt in this case.
  4. This tool needs the player's name, so it uses RunContext to access dependencies, which are just the player's name in this case.
  5. Run the agent, passing the player's name as the dependency.

(This example is complete, it can be run "as is")

Let's print the messages from that game to see what happened:

from dice_game import dice_result

print(dice_result.all_messages())
"""
[
    ModelRequest(
        parts=[
            SystemPromptPart(
                content="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.",
                part_kind='system-prompt',
            ),
            UserPromptPart(
                content='My guess is 4',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            ),
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='roll_die',
                args=ArgsDict(args_dict={}),
                tool_call_id=None,
                part_kind='tool-call',
            )
        ],
        timestamp=datetime.datetime(...),
        kind='response',
    ),
    ModelRequest(
        parts=[
            ToolReturnPart(
                tool_name='roll_die',
                content='4',
                tool_call_id=None,
                timestamp=datetime.datetime(...),
                part_kind='tool-return',
            )
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='get_player_name',
                args=ArgsDict(args_dict={}),
                tool_call_id=None,
                part_kind='tool-call',
            )
        ],
        timestamp=datetime.datetime(...),
        kind='response',
    ),
    ModelRequest(
        parts=[
            ToolReturnPart(
                tool_name='get_player_name',
                content='Anne',
                tool_call_id=None,
                timestamp=datetime.datetime(...),
                part_kind='tool-return',
            )
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content="Congratulations Anne, you guessed correctly! You're a winner!",
                part_kind='text',
            )
        ],
        timestamp=datetime.datetime(...),
        kind='response',
    ),
]
"""

We can represent this with a diagram:

sequenceDiagram
    participant Agent
    participant LLM

    Note over Agent: Send prompts
    Agent ->> LLM: System: "You're a dice game..."<br>User: "My guess is 4"
    activate LLM
    Note over LLM: LLM decides to use<br>a tool

    LLM ->> Agent: Call tool<br>roll_die()
    deactivate LLM
    activate Agent
    Note over Agent: Rolls a six-sided die

    Agent -->> LLM: ToolReturn<br>"4"
    deactivate Agent
    activate LLM
    Note over LLM: LLM decides to use<br>another tool

    LLM ->> Agent: Call tool<br>get_player_name()
    deactivate LLM
    activate Agent
    Note over Agent: Retrieves player name
    Agent -->> LLM: ToolReturn<br>"Anne"
    deactivate Agent
    activate LLM
    Note over LLM: LLM constructs final response

    LLM ->> Agent: ModelResponse<br>"Congratulations Anne, ..."
    deactivate LLM
    Note over Agent: Game session complete

Registering Function Tools via kwarg

As well as using the decorators, we can register tools via the tools argument to the [Agent constructor][pydantic_ai.Agent.__init__]. This is useful when you want to re-use tools, and can also give more fine-grained control over the tools.

import random

from pydantic_ai import Agent, RunContext, Tool


def roll_die() -> str:
    """Roll a six-sided die and return the result."""
    return str(random.randint(1, 6))


def get_player_name(ctx: RunContext[str]) -> str:
    """Get the player's name."""
    return ctx.deps


agent_a = Agent(
    'gemini-1.5-flash',
    deps_type=str,
    tools=[roll_die, get_player_name],  # (1)!
)
agent_b = Agent(
    'gemini-1.5-flash',
    deps_type=str,
    tools=[  # (2)!
        Tool(roll_die, takes_ctx=False),
        Tool(get_player_name, takes_ctx=True),
    ],
)
dice_result = agent_b.run_sync('My guess is 4', deps='Anne')
print(dice_result.data)
#> Congratulations Anne, you guessed correctly! You're a winner!
  1. The simplest way to register tools via the Agent constructor is to pass a list of functions; the function signature is inspected to determine if the tool takes [RunContext][pydantic_ai.tools.RunContext].
  2. agent_a and agent_b are identical — but we can use [Tool][pydantic_ai.tools.Tool] to reuse tool definitions and give more fine-grained control over how tools are defined, e.g. setting their name or description, or using a custom prepare method.

(This example is complete, it can be run "as is")

Function Tools vs. Structured Results

As the name suggests, function tools use the model's "tools" or "functions" API to let the model know what is available to call. Tools or functions are also used to define the schema(s) for structured responses, thus a model might have access to many tools, some of which call function tools while others end the run and return a result.
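
To make this concrete, here's a minimal sketch (the Answer model and the lookup_population stub are hypothetical, not from the docs) where the model is offered two tools: the registered function tool and the result tool generated from result_type:

from pydantic import BaseModel

from pydantic_ai import Agent


class Answer(BaseModel):
    city: str
    population: int


# The model sees both `lookup_population` (a function tool) and the result tool
# generated from `Answer`; calling the latter ends the run and returns the result.
agent = Agent('openai:gpt-4o', result_type=Answer)


@agent.tool_plain
def lookup_population(city: str) -> int:
    """Look up the population of a city (hypothetical stub)."""
    return 8_866_000


result = agent.run_sync('How many people live in London?')
print(result.data)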

Function tools and schema

Function parameters are extracted from the function signature, and all parameters except RunContext are used to build the schema for that tool call.

Even better, PydanticAI extracts the docstring from functions and (thanks to griffe) extracts parameter descriptions from the docstring and adds them to the schema.

Griffe supports extracting parameter descriptions from google, numpy and sphinx style docstrings, and PydanticAI will infer the format to use based on the docstring. We plan to add support in the future to explicitly set the style to use, and warn/error if not all parameters are documented; see #59.

To demonstrate a tool's schema, here we use [FunctionModel][pydantic_ai.models.function.FunctionModel] to print the schema a model would receive:

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse
from pydantic_ai.models.function import AgentInfo, FunctionModel

agent = Agent()


@agent.tool_plain
def foobar(a: int, b: str, c: dict[str, list[float]]) -> str:
    """Get me foobar.

    Args:
        a: apple pie
        b: banana cake
        c: carrot smoothie
    """
    return f'{a} {b} {c}'


def print_schema(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    tool = info.function_tools[0]
    print(tool.description)
    #> Get me foobar.
    print(tool.parameters_json_schema)
    """
    {
        'properties': {
            'a': {'description': 'apple pie', 'title': 'A', 'type': 'integer'},
            'b': {'description': 'banana cake', 'title': 'B', 'type': 'string'},
            'c': {
                'additionalProperties': {'items': {'type': 'number'}, 'type': 'array'},
                'description': 'carrot smoothie',
                'title': 'C',
                'type': 'object',
            },
        },
        'required': ['a', 'b', 'c'],
        'type': 'object',
        'additionalProperties': False,
    }
    """
    return ModelResponse.from_text(content='foobar')


agent.run_sync('hello', model=FunctionModel(print_schema))

(This example is complete, it can be run "as is")

The return type of a tool can be anything which Pydantic can serialize to JSON. Some models (e.g. Gemini) support semi-structured return values, while others expect text (OpenAI) but seem to be just as good at extracting meaning from the data. If a Python object is returned and the model expects a string, the value will be serialized to JSON.

If a tool has a single parameter that can be represented as an object in JSON schema (e.g. dataclass, TypedDict, pydantic model), the schema for the tool is simplified to be just that object.

Here's an example where we use [TestModel.agent_model_function_tools][pydantic_ai.models.test.TestModel.agent_model_function_tools] to inspect the tool schema that would be passed to the model.

from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent()


class Foobar(BaseModel):
    """This is a Foobar"""

    x: int
    y: str
    z: float = 3.14


@agent.tool_plain
def foobar(f: Foobar) -> str:
    return str(f)


test_model = TestModel()
result = agent.run_sync('hello', model=test_model)
print(result.data)
#> {"foobar":"x=0 y='a' z=3.14"}
print(test_model.agent_model_function_tools)
"""
[
    ToolDefinition(
        name='foobar',
        description='This is a Foobar',
        parameters_json_schema={
            'properties': {
                'x': {'title': 'X', 'type': 'integer'},
                'y': {'title': 'Y', 'type': 'string'},
                'z': {'default': 3.14, 'title': 'Z', 'type': 'number'},
            },
            'required': ['x', 'y'],
            'title': 'Foobar',
            'type': 'object',
        },
        outer_typed_dict_key=None,
    )
]
"""

(This example is complete, it can be run "as is")

Dynamic Function tools {#tool-prepare}

Tools can optionally be defined with another function: prepare, which is called at each step of a run to customize the definition of the tool passed to the model, or omit the tool completely from that step.

A prepare method can be registered via the prepare kwarg to any of the tool registration mechanisms:

  • [@agent.tool][pydantic_ai.Agent.tool] decorator
  • [@agent.tool_plain][pydantic_ai.Agent.tool_plain] decorator
  • [Tool][pydantic_ai.tools.Tool] dataclass

The prepare method should be of type [ToolPrepareFunc][pydantic_ai.tools.ToolPrepareFunc], a function which takes [RunContext][pydantic_ai.tools.RunContext] and a pre-built [ToolDefinition][pydantic_ai.tools.ToolDefinition], and should either return that ToolDefinition with or without modifying it, return a new ToolDefinition, or return None to indicate this tool should not be registered for that step.

Here's a simple prepare method that only includes the tool if the value of the dependency is 42.

As with the previous example, we use [TestModel][pydantic_ai.models.test.TestModel] to demonstrate the behavior without calling a real model.

from typing import Union

from pydantic_ai import Agent, RunContext
from pydantic_ai.tools import ToolDefinition

agent = Agent('test')


async def only_if_42(
    ctx: RunContext[int], tool_def: ToolDefinition
) -> Union[ToolDefinition, None]:
    if ctx.deps == 42:
        return tool_def


@agent.tool(prepare=only_if_42)
def hitchhiker(ctx: RunContext[int], answer: str) -> str:
    return f'{ctx.deps} {answer}'


result = agent.run_sync('testing...', deps=41)
print(result.data)
#> success (no tool calls)
result = agent.run_sync('testing...', deps=42)
print(result.data)
#> {"hitchhiker":"42 a"}

(This example is complete, it can be run "as is")

Here's a more complex example where we change the description of the name parameter based on the value of deps.

For the sake of variation, we create this tool using the [Tool][pydantic_ai.tools.Tool] dataclass.

from __future__ import annotations

from typing import Literal

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.test import TestModel
from pydantic_ai.tools import Tool, ToolDefinition


def greet(name: str) -> str:
    return f'hello {name}'


async def prepare_greet(
    ctx: RunContext[Literal['human', 'machine']], tool_def: ToolDefinition
) -> ToolDefinition | None:
    d = f'Name of the {ctx.deps} to greet.'
    tool_def.parameters_json_schema['properties']['name']['description'] = d
    return tool_def


greet_tool = Tool(greet, prepare=prepare_greet)
test_model = TestModel()
agent = Agent(test_model, tools=[greet_tool], deps_type=Literal['human', 'machine'])

result = agent.run_sync('testing...', deps='human')
print(result.data)
#> {"greet":"hello a"}
print(test_model.agent_model_function_tools)
"""
[
    ToolDefinition(
        name='greet',
        description='',
        parameters_json_schema={
            'properties': {
                'name': {
                    'title': 'Name',
                    'type': 'string',
                    'description': 'Name of the human to greet.',
                }
            },
            'required': ['name'],
            'type': 'object',
            'additionalProperties': False,
        },
        outer_typed_dict_key=None,
    )
]
"""

(This example is complete, it can be run "as is")

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/dependencies.md

Dependencies

PydanticAI uses a dependency injection system to provide data and services to your agent's system prompts, tools and result validators.

Matching PydanticAI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic"; this should make dependencies type-safe, understandable, easier to test, and ultimately easier to deploy in production.

Defining Dependencies

Dependencies can be any Python type. While in simple cases you might be able to pass a single object as a dependency (e.g. an HTTP connection), [dataclasses][] are generally a convenient container when your dependencies include multiple objects.

Here's an example of defining an agent that requires dependencies.

(Note: dependencies aren't actually used in this example, see Accessing Dependencies below)

from dataclasses import dataclass

import httpx

from pydantic_ai import Agent


@dataclass
class MyDeps:  # (1)!
    api_key: str
    http_client: httpx.AsyncClient


agent = Agent(
    'openai:gpt-4o',
    deps_type=MyDeps,  # (2)!
)


async def main():
    async with httpx.AsyncClient() as client:
        deps = MyDeps('foobar', client)
        result = await agent.run(
            'Tell me a joke.',
            deps=deps,  # (3)!
        )
        print(result.data)
        #> Did you hear about the toothpaste scandal? They called it Colgate.
  1. Define a dataclass to hold dependencies.
  2. Pass the dataclass type to the deps_type argument of the [Agent constructor][pydantic_ai.Agent.__init__]. Note: we're passing the type here, NOT an instance; this parameter is not actually used at runtime, it's here so we can get full type checking of the agent.
  3. When running the agent, pass an instance of the dataclass to the deps parameter.

(This example is complete, it can be run "as is")

Accessing Dependencies

Dependencies are accessed through the [RunContext][pydantic_ai.tools.RunContext] type; this should be the first parameter of system prompt functions etc.

from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient


agent = Agent(
    'openai:gpt-4o',
    deps_type=MyDeps,
)


@agent.system_prompt  # (1)!
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:  # (2)!
    response = await ctx.deps.http_client.get(  # (3)!
        'https://example.com',
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},  # (4)!
    )
    response.raise_for_status()
    return f'Prompt: {response.text}'


async def main():
    async with httpx.AsyncClient() as client:
        deps = MyDeps('foobar', client)
        result = await agent.run('Tell me a joke.', deps=deps)
        print(result.data)
        #> Did you hear about the toothpaste scandal? They called it Colgate.
  1. [RunContext][pydantic_ai.tools.RunContext] may optionally be passed to a [system_prompt][pydantic_ai.Agent.system_prompt] function as the only argument.
  2. [RunContext][pydantic_ai.tools.RunContext] is parameterized with the type of the dependencies, if this type is incorrect, static type checkers will raise an error.
  3. Access dependencies through the [.deps][pydantic_ai.tools.RunContext.deps] attribute.
  4. Access dependencies through the [.deps][pydantic_ai.tools.RunContext.deps] attribute.

(This example is complete, it can be run "as is")

Asynchronous vs. Synchronous dependencies

System prompt functions, function tools and result validators are all run in the async context of an agent run.

If these functions are not coroutines (i.e. not defined with async def), they are called with [run_in_executor][asyncio.loop.run_in_executor] in a thread pool. It's therefore marginally preferable to use async methods where dependencies perform IO, although synchronous dependencies should work fine too.

!!! note "run vs. run_sync and Asynchronous vs. Synchronous dependencies" Whether you use synchronous or asynchronous dependencies, is completely independent of whether you use run or run_syncrun_sync is just a wrapper around run and agents are always run in an async context.

Here's the same example as above, but with a synchronous dependency:

from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.Client  # (1)!


agent = Agent(
    'openai:gpt-4o',
    deps_type=MyDeps,
)


@agent.system_prompt
def get_system_prompt(ctx: RunContext[MyDeps]) -> str:  # (2)!
    response = ctx.deps.http_client.get(
        'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}
    )
    response.raise_for_status()
    return f'Prompt: {response.text}'


async def main():
    deps = MyDeps('foobar', httpx.Client())
    result = await agent.run(
        'Tell me a joke.',
        deps=deps,
    )
    print(result.data)
    #> Did you hear about the toothpaste scandal? They called it Colgate.
  1. Here we use a synchronous httpx.Client instead of an asynchronous httpx.AsyncClient.
  2. To match the synchronous dependency, the system prompt function is now a plain function, not a coroutine.

(This example is complete, it can be run "as is")

Full Example

As well as system prompts, dependencies can be used in tools and result validators.

from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, ModelRetry, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient


agent = Agent(
    'openai:gpt-4o',
    deps_type=MyDeps,
)


@agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
    response = await ctx.deps.http_client.get('https://example.com')
    response.raise_for_status()
    return f'Prompt: {response.text}'


@agent.tool  # (1)!
async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str:
    response = await ctx.deps.http_client.get(
        'https://example.com#jokes',
        params={'subject': subject},
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
    )
    response.raise_for_status()
    return response.text


@agent.result_validator  # (2)!
async def validate_result(ctx: RunContext[MyDeps], final_response: str) -> str:
    response = await ctx.deps.http_client.post(
        'https://example.com#validate',
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
        params={'query': final_response},
    )
    if response.status_code == 400:
        raise ModelRetry(f'invalid response: {response.text}')
    response.raise_for_status()
    return final_response


async def main():
    async with httpx.AsyncClient() as client:
        deps = MyDeps('foobar', client)
        result = await agent.run('Tell me a joke.', deps=deps)
        print(result.data)
        #> Did you hear about the toothpaste scandal? They called it Colgate.
  1. To pass RunContext to a tool, use the [tool][pydantic_ai.Agent.tool] decorator.
  2. RunContext may optionally be passed to a [result_validator][pydantic_ai.Agent.result_validator] function as the first argument.

(This example is complete, it can be run "as is")

Overriding Dependencies

When testing agents, it's useful to be able to customise dependencies.

While this can sometimes be done by calling the agent directly within unit tests, we can also override dependencies while calling application code which in turn calls the agent.

This is done via the [override][pydantic_ai.Agent.override] method on the agent.

from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient

    async def system_prompt_factory(self) -> str:  # (1)!
        response = await self.http_client.get('https://example.com')
        response.raise_for_status()
        return f'Prompt: {response.text}'


joke_agent = Agent('openai:gpt-4o', deps_type=MyDeps)


@joke_agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
    return await ctx.deps.system_prompt_factory()  # (2)!


async def application_code(prompt: str) -> str:  # (3)!
    ...
    ...
    # now deep within application code we call our agent
    async with httpx.AsyncClient() as client:
        app_deps = MyDeps('foobar', client)
        result = await joke_agent.run(prompt, deps=app_deps)  # (4)!
    return result.data
  1. Define a method on the dependency to make the system prompt easier to customise.
  2. Call the system prompt factory from within the system prompt function.
  3. Application code that calls the agent; in a real application this might be an API endpoint.
  4. Call the agent from within the application code; in a real application this call might be deep within a call stack. Note app_deps here will NOT be used when deps are overridden.

(This example is complete, it can be run "as is")

from joke_app import MyDeps, application_code, joke_agent


class TestMyDeps(MyDeps):  # (1)!
    async def system_prompt_factory(self) -> str:
        return 'test prompt'


async def test_application_code():
    test_deps = TestMyDeps('test_key', None)  # (2)!
    with joke_agent.override(deps=test_deps):  # (3)!
        joke = await application_code('Tell me a joke.')  # (4)!
    assert joke.startswith('Did you hear about the toothpaste scandal?')
  1. Define a subclass of MyDeps in tests to customise the system prompt factory.
  2. Create an instance of the test dependency, we don't need to pass an http_client here as it's not used.
  3. Override the dependencies of the agent for the duration of the with block, test_deps will be used when the agent is run.
  4. Now we can safely call our application code, the agent will use the overridden dependencies.

Examples

The following examples demonstrate how to use dependencies in PydanticAI:

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/multi-agent-applications.md

Multi-agent Applications

There are roughly four levels of complexity when building applications with PydanticAI:

  1. Single agent workflows — what most of the pydantic_ai documentation covers
  2. Agent delegation — agents using another agent via tools
  3. Programmatic agent hand-off — one agent runs, then application code calls another agent
  4. Graph based control flow — for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents

Of course, you can combine multiple strategies in a single application.

Agent delegation

"Agent delegation" refers to the scenario where an agent delegates work to another agent, then takes back control when the delegate agent (the agent called from within a tool) finishes.

Since agents are stateless and designed to be global, you do not need to include the agent itself in agent dependencies.

You'll generally want to pass [ctx.usage][pydantic_ai.RunContext.usage] to the [usage][pydantic_ai.Agent.run] keyword argument of the delegate agent run so usage within that run counts towards the total usage of the parent agent run.

!!! note "Multiple models" Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final [result.usage()][pydantic_ai.result.RunResult.usage] of the run will not be possible, but you can still use [UsageLimits][pydantic_ai.usage.UsageLimits] to avoid unexpected costs.

from pydantic_ai import Agent, RunContext
from pydantic_ai.usage import UsageLimits

joke_selection_agent = Agent(  # (1)!
    'openai:gpt-4o',
    system_prompt=(
        'Use the `joke_factory` to generate some jokes, then choose the best. '
        'You must return just a single joke.'
    ),
)
joke_generation_agent = Agent('gemini-1.5-flash', result_type=list[str])  # (2)!


@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[None], count: int) -> list[str]:
    r = await joke_generation_agent.run(  # (3)!
        f'Please generate {count} jokes.',
        usage=ctx.usage,  # (4)!
    )
    return r.data  # (5)!


result = joke_selection_agent.run_sync(
    'Tell me a joke.',
    usage_limits=UsageLimits(request_limit=5, total_tokens_limit=300),
)
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.
print(result.usage())
"""
Usage(
    requests=3, request_tokens=204, response_tokens=24, total_tokens=228, details=None
)
"""
  1. The "parent" or controlling agent.
  2. The "delegate" agent, which is called from within a tool of the parent agent.
  3. Call the delegate agent from within a tool of the parent agent.
  4. Pass the usage from the parent agent to the delegate agent so the final [result.usage()][pydantic_ai.result.RunResult.usage] includes the usage from both agents.
  5. Since the function returns list[str], and the result_type of joke_generation_agent is also list[str], we can simply return r.data from the tool.

(This example is complete, it can be run "as is")

The control flow for this example is pretty simple and can be summarised as follows:

graph TD
  START --> joke_selection_agent
  joke_selection_agent --> joke_factory["joke_factory (tool)"]
  joke_factory --> joke_generation_agent
  joke_generation_agent --> joke_factory
  joke_factory --> joke_selection_agent
  joke_selection_agent --> END

Agent delegation and dependencies

Generally the delegate agent needs to either have the same dependencies as the calling agent, or dependencies which are a subset of the calling agent's dependencies.

!!! info "Initializing dependencies" We say "generally" above since there's nothing to stop you initializing dependencies within a tool call and therefore using interdependencies in a delegate agent that are not available on the parent, this should often be avoided since it can be significantly slower than reusing connections etc. from the parent agent.

from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class ClientAndKey:  # (1)!
    http_client: httpx.AsyncClient
    api_key: str


joke_selection_agent = Agent(
    'openai:gpt-4o',
    deps_type=ClientAndKey,  # (2)!
    system_prompt=(
        'Use the `joke_factory` tool to generate some jokes on the given subject, '
        'then choose the best. You must return just a single joke.'
    ),
)
joke_generation_agent = Agent(
    'gemini-1.5-flash',
    deps_type=ClientAndKey,  # (4)!
    result_type=list[str],
    system_prompt=(
        'Use the "get_jokes" tool to get some jokes on the given subject, '
        'then extract each joke into a list.'
    ),
)


@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]:
    r = await joke_generation_agent.run(
        f'Please generate {count} jokes.',
        deps=ctx.deps,  # (3)!
        usage=ctx.usage,
    )
    return r.data


@joke_generation_agent.tool  # (5)!
async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str:
    response = await ctx.deps.http_client.get(
        'https://example.com',
        params={'count': count},
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
    )
    response.raise_for_status()
    return response.text


async def main():
    async with httpx.AsyncClient() as client:
        deps = ClientAndKey(client, 'foobar')
        result = await joke_selection_agent.run('Tell me a joke.', deps=deps)
        print(result.data)
        #> Did you hear about the toothpaste scandal? They called it Colgate.
        print(result.usage())  # (6)!
        """
        Usage(
            requests=4,
            request_tokens=310,
            response_tokens=32,
            total_tokens=342,
            details=None,
        )
        """
  1. Define a dataclass to hold the client and API key dependencies.
  2. Set the deps_type of the calling agent — joke_selection_agent here.
  3. Pass the dependencies to the delegate agent's run method within the tool call.
  4. Also set the deps_type of the delegate agent — joke_generation_agent here.
  5. Define a tool on the delegate agent that uses the dependencies to make an HTTP request.
  6. Usage now includes 4 requests — 2 from the calling agent and 2 from the delegate agent.

(This example is complete, it can be run "as is")

This example shows how even a fairly simple agent delegation can lead to a complex control flow:

graph TD
  START --> joke_selection_agent
  joke_selection_agent --> joke_factory["joke_factory (tool)"]
  joke_factory --> joke_generation_agent
  joke_generation_agent --> get_jokes["get_jokes (tool)"]
  get_jokes --> http_request["HTTP request"]
  http_request --> get_jokes
  get_jokes --> joke_generation_agent
  joke_generation_agent --> joke_factory
  joke_factory --> joke_selection_agent
  joke_selection_agent --> END

Programmatic agent hand-off

"Programmatic agent hand-off" refers to the scenario where multiple agents are called in succession, with application code and/or a human in the loop responsible for deciding which agent to call next.

Here agents don't need to use the same deps.

Here we show two agents used in succession, the first to find a flight and the second to extract the user's seat preference.

from typing import Literal, Union

from pydantic import BaseModel, Field
from rich.prompt import Prompt

from pydantic_ai import Agent, RunContext
from pydantic_ai.messages import ModelMessage
from pydantic_ai.usage import Usage, UsageLimits


class FlightDetails(BaseModel):
    flight_number: str


class Failed(BaseModel):
    """Unable to find a satisfactory choice."""


flight_search_agent = Agent[None, Union[FlightDetails, Failed]](  # (1)!
    'openai:gpt-4o',
    result_type=Union[FlightDetails, Failed],  # type: ignore
    system_prompt=(
        'Use the "flight_search" tool to find a flight '
        'from the given origin to the given destination.'
    ),
)


@flight_search_agent.tool  # (2)!
async def flight_search(
    ctx: RunContext[None], origin: str, destination: str
) -> Union[FlightDetails, None]:
    # in reality, this would call a flight search API or
    # use a browser to scrape a flight search website
    return FlightDetails(flight_number='AK456')


usage_limits = UsageLimits(request_limit=15)  # (3)!


async def find_flight(usage: Usage) -> Union[FlightDetails, None]:  # (4)!
    message_history: Union[list[ModelMessage], None] = None
    for _ in range(3):
        prompt = Prompt.ask(
            'Where would you like to fly from and to?',
        )
        result = await flight_search_agent.run(
            prompt,
            message_history=message_history,
            usage=usage,
            usage_limits=usage_limits,
        )
        if isinstance(result.data, FlightDetails):
            return result.data
        else:
            message_history = result.all_messages(
                result_tool_return_content='Please try again.'
            )


class SeatPreference(BaseModel):
    row: int = Field(ge=1, le=30)
    seat: Literal['A', 'B', 'C', 'D', 'E', 'F']


# This agent is responsible for extracting the user's seat selection
seat_preference_agent = Agent[None, Union[SeatPreference, Failed]](  # (5)!
    'openai:gpt-4o',
    result_type=Union[SeatPreference, Failed],  # type: ignore
    system_prompt=(
        "Extract the user's seat preference. "
        'Seats A and F are window seats. '
        'Row 1 is the front row and has extra leg room. '
        'Rows 14, and 20 also have extra leg room. '
    ),
)


async def find_seat(usage: Usage) -> SeatPreference:  # (6)!
    message_history: Union[list[ModelMessage], None] = None
    while True:
        answer = Prompt.ask('What seat would you like?')

        result = await seat_preference_agent.run(
            answer,
            message_history=message_history,
            usage=usage,
            usage_limits=usage_limits,
        )
        if isinstance(result.data, SeatPreference):
            return result.data
        else:
            print('Could not understand seat preference. Please try again.')
            message_history = result.all_messages()


async def main():  # (7)!
    usage: Usage = Usage()

    opt_flight_details = await find_flight(usage)
    if opt_flight_details is not None:
        print(f'Flight found: {opt_flight_details.flight_number}')
        #> Flight found: AK456
        seat_preference = await find_seat(usage)
        print(f'Seat preference: {seat_preference}')
        #> Seat preference: row=1 seat='A'
  1. Define the first agent, which finds a flight. We use an explicit type annotation until PEP-747 lands, see structured results. We use a union as the result type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool.
  2. Define a tool on the agent to find a flight. In this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary.
  3. Define usage limits for the entire app.
  4. Define a function to find a flight, which asks the user for their preferences and then calls the agent to find a flight.
  5. As with flight_search_agent above, we use an explicit type annotation to define the agent.
  6. Define a function to find the user's seat preference, which asks the user for their seat preference and then calls the agent to extract the seat preference.
  7. Now that we've put our logic for running each agent into separate functions, our main app becomes very simple.

(This example is complete, it can be run "as is")

The control flow for this example can be summarised as follows:

graph TB
  START --> ask_user_flight["ask user for flight"]

  subgraph find_flight
    flight_search_agent --> ask_user_flight
    ask_user_flight --> flight_search_agent
  end

  flight_search_agent --> ask_user_seat["ask user for seat"]
  flight_search_agent --> END

  subgraph find_seat
    seat_preference_agent --> ask_user_seat
    ask_user_seat --> seat_preference_agent
  end

  seat_preference_agent --> END

PydanticAI Graphs

!!! example "Work in progress" This is a work in progress and not yet documented, see #528 and #539

Examples

The following examples demonstrate how to use dependencies in PydanticAI:

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/help.md

Getting Help

If you need help getting started with PydanticAI or with advanced usage, the following sources may be useful.

:simple-slack: Slack

Join the #pydantic-ai channel in the Pydantic Slack to ask questions, get help, and chat about PydanticAI. There are also channels for Pydantic, Logfire, and FastUI.

If you're on a Logfire Pro plan, you can also get a dedicated private Slack collab channel with us.

:simple-github: GitHub Issues

The PydanticAI GitHub Issues are a great place to ask questions and give us feedback.

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/contributing.md

We'd love you to contribute to PydanticAI!

Installation and Setup

Clone your fork and cd into the repo directory

git clone git@github.com:<your username>/pydantic-ai.git
cd pydantic-ai

Install uv (version 0.4.30 or later) and pre-commit

We use pipx here; for other installation options, see the uv and pre-commit docs.

To install pipx itself, see the pipx docs.

pipx install uv pre-commit

Install pydantic-ai, all dependencies and pre-commit hooks

make install

Running Tests etc.

We use make to manage most commands you'll need to run.

For details on available commands, run:

make help

To run code formatting, linting, static type checks, and tests with coverage report generation, run:

make

Documentation Changes

To run the documentation page locally, run:

uv run mkdocs serve

Rules for adding new models to PydanticAI {#new-model-rules}

To avoid an excessive workload for the maintainers of PydanticAI, we can't accept all model contributions, so we're setting the following rules for when we'll accept new models and when we won't. This should hopefully reduce the chances of disappointment and wasted work.

  • To add a new model with an extra dependency, that dependency needs > 500k monthly downloads from PyPI consistently over 3 months or more
  • To add a new model which uses another model's logic internally and has no extra dependencies, that model's GitHub org needs > 20k stars in total
  • For any other model that's just a custom URL and API key, we're happy to add a one-paragraph description with a link and instructions on the URL to use
  • For any other model that requires more logic, we recommend you release your own Python package pydantic-ai-xxx, which depends on pydantic-ai-slim and implements a model that inherits from our [Model][pydantic_ai.models.Model] ABC

If you're unsure about adding a model, please create an issue.

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/models.md

PydanticAI is model-agnostic and has built-in support for the following model providers:

You can also add support for other models.

PydanticAI also comes with TestModel and FunctionModel for testing and development.
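
For example, here's a minimal sketch (the get_capital tool is a hypothetical stub) using TestModel so an agent can be exercised without any API key or network access, in the same way as the tool-schema examples above:

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent()  # the model can be supplied (or overridden) at run time


@agent.tool_plain
def get_capital(country: str) -> str:
    """Return the capital of a country (hypothetical stub)."""
    return 'Paris'


# TestModel calls the registered tools and returns synthetic data,
# so no provider configuration is needed.
result = agent.run_sync('What is the capital of France?', model=TestModel())
print(result.data)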

To use each model provider, you need to configure your local environment and make sure you have the right packages installed.

OpenAI

Install

To use OpenAI models, you need to either install pydantic-ai, or install pydantic-ai-slim with the openai optional group:

pip install 'pydantic-ai-slim[openai]'  # or: uv add 'pydantic-ai-slim[openai]'

Configuration

To use [OpenAIModel][pydantic_ai.models.openai.OpenAIModel] through their main API, go to platform.openai.com and follow your nose until you find the place to generate an API key.

Environment variable

Once you have the API key, you can set it as an environment variable:

export OPENAI_API_KEY='your-api-key'

You can then use [OpenAIModel][pydantic_ai.models.openai.OpenAIModel] by name:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')
...

Or initialise the model directly with just the model name:

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel('gpt-4o')
agent = Agent(model)
...

api_key argument

If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key argument][pydantic_ai.models.openai.OpenAIModel.__init__]:

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel('gpt-4o', api_key='your-api-key')
agent = Agent(model)
...

base_url argument

To use another OpenAI-compatible API, such as OpenRouter, you can make use of the [base_url argument][pydantic_ai.models.openai.OpenAIModel.__init__]:

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    'anthropic/claude-3.5-sonnet',
    base_url='https://openrouter.ai/api/v1',
    api_key='your-api-key',
)
agent = Agent(model)
...

Custom OpenAI Client

OpenAIModel also accepts a custom AsyncOpenAI client via the [openai_client parameter][pydantic_ai.models.openai.OpenAIModel.__init__], so you can customise the organization, project, base_url etc. as defined in the OpenAI API docs.

You could also use the AsyncAzureOpenAI client to use the Azure OpenAI API.

from openai import AsyncAzureOpenAI

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

client = AsyncAzureOpenAI(
    azure_endpoint='...',
    api_version='2024-07-01-preview',
    api_key='your-api-key',
)

model = OpenAIModel('gpt-4o', openai_client=client)
agent = Agent(model)
...

Anthropic

Install

To use [AnthropicModel][pydantic_ai.models.anthropic.AnthropicModel] models, you need to either install pydantic-ai, or install pydantic-ai-slim with the anthropic optional group:

pip/uv-add 'pydantic-ai-slim[anthropic]'

Configuration

To use Anthropic through their API, go to console.anthropic.com/settings/keys to generate an API key.

[AnthropicModelName][pydantic_ai.models.anthropic.AnthropicModelName] contains a list of available Anthropic models.

Environment variable

Once you have the API key, you can set it as an environment variable:

export ANTHROPIC_API_KEY='your-api-key'

You can then use [AnthropicModel][pydantic_ai.models.anthropic.AnthropicModel] by name:

from pydantic_ai import Agent

agent = Agent('claude-3-5-sonnet-latest')
...

Or initialise the model directly with just the model name:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel

model = AnthropicModel('claude-3-5-sonnet-latest')
agent = Agent(model)
...

api_key argument

If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key argument][pydantic_ai.models.anthropic.AnthropicModel.init]:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel

model = AnthropicModel('claude-3-5-sonnet-latest', api_key='your-api-key')
agent = Agent(model)
...

Gemini

!!! warning "For prototyping only" Google themselves refer to this API as the "hobby" API, I've received 503 responses from it a number of times. The API is easy to use and useful for prototyping and simple demos, but I would not rely on it in production.

If you want to run Gemini models in production, you should use the [VertexAI API](#gemini-via-vertexai) described below.

Install

To use [GeminiModel][pydantic_ai.models.gemini.GeminiModel] models, you just need to install pydantic-ai or pydantic-ai-slim, no extra dependencies are required.

Configuration

[GeminiModel][pydantic_ai.models.gemini.GeminiModel] lets you use Google's Gemini models through their Generative Language API, generativelanguage.googleapis.com.

[GeminiModelName][pydantic_ai.models.gemini.GeminiModelName] contains a list of available Gemini models that can be used through this interface.

To use GeminiModel, go to aistudio.google.com and follow your nose until you find the place to generate an API key.

Environment variable

Once you have the API key, you can set it as an environment variable:

export GEMINI_API_KEY=your-api-key

You can then use [GeminiModel][pydantic_ai.models.gemini.GeminiModel] by name:

from pydantic_ai import Agent

agent = Agent('gemini-1.5-flash')
...

Or initialise the model directly with just the model name:

from pydantic_ai import Agent
from pydantic_ai.models.gemini import GeminiModel

model = GeminiModel('gemini-1.5-flash')
agent = Agent(model)
...

api_key argument

If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key argument][pydantic_ai.models.gemini.GeminiModel.init]:

from pydantic_ai import Agent
from pydantic_ai.models.gemini import GeminiModel

model = GeminiModel('gemini-1.5-flash', api_key='your-api-key')
agent = Agent(model)
...

Gemini via VertexAI

To run Google's Gemini models in production, you should use [VertexAIModel][pydantic_ai.models.vertexai.VertexAIModel] which uses the *-aiplatform.googleapis.com API.

[GeminiModelName][pydantic_ai.models.gemini.GeminiModelName] contains a list of available Gemini models that can be used through this interface.

Install

To use [VertexAIModel][pydantic_ai.models.vertexai.VertexAIModel], you need to either install pydantic-ai, or install pydantic-ai-slim with the vertexai optional group:

pip/uv-add 'pydantic-ai-slim[vertexai]'

Configuration

This interface has a number of advantages over generativelanguage.googleapis.com documented above:

  1. The VertexAI API is more reliable and has marginally lower latency in our experience.
  2. You can purchase provisioned throughput with VertexAI to guarantee capacity.
  3. If you're running PydanticAI inside GCP, you don't need to set up authentication, it should "just work".
  4. You can decide which region to use, which might be important from a regulatory perspective, and might improve latency.

The big disadvantage is that for local development you may need to create and configure a "service account", which I've found extremely painful to get right in the past.

Whichever way you authenticate, you'll need to have VertexAI enabled in your GCP account.

Application default credentials

Luckily if you're running PydanticAI inside GCP, or you have the gcloud CLI installed and configured, you should be able to use VertexAIModel without any additional setup.

To use VertexAIModel, with application default credentials configured (e.g. with gcloud), you can simply use:

from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel

model = VertexAIModel('gemini-1.5-flash')
agent = Agent(model)
...

Internally this uses google.auth.default() from the google-auth package to obtain credentials.

!!! note "Won't fail until agent.run()" Because google.auth.default() requires network requests and can be slow, it's not run until you call agent.run(). Meaning any configuration or permissions error will only be raised when you try to use the model. To for this check to be run, call [await model.ainit()][pydantic_ai.models.vertexai.VertexAIModel.ainit].

You may also need to pass the [project_id argument to VertexAIModel][pydantic_ai.models.vertexai.VertexAIModel.init] if application default credentials don't set a project. If you pass project_id and it conflicts with the project set by application default credentials, an error is raised.
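
For example (the project ID here is a placeholder):

from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel

model = VertexAIModel('gemini-1.5-flash', project_id='my-gcp-project')
agent = Agent(model)
...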

Service account

If instead of application default credentials, you want to authenticate with a service account, you'll need to create a service account, add it to your GCP project (note: AFAIK this step is necessary even if you created the service account within the project), give that service account the "Vertex AI Service Agent" role, and download the service account JSON file.

Once you have the JSON file, you can use it thus:

from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel

model = VertexAIModel(
    'gemini-1.5-flash',
    service_account_file='path/to/service-account.json',
)
agent = Agent(model)
...

Customising region

Whichever way you authenticate, you can specify which region requests will be sent to via the [region argument][pydantic_ai.models.vertexai.VertexAIModel.init].

Using a region close to your application can improve latency and might be important from a regulatory perspective.

from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel

model = VertexAIModel('gemini-1.5-flash', region='asia-east1')
agent = Agent(model)
...

[VertexAiRegion][pydantic_ai.models.vertexai.VertexAiRegion] contains a list of available regions.

Ollama

Install

To use [OllamaModel][pydantic_ai.models.ollama.OllamaModel], you need to either install pydantic-ai, or install pydantic-ai-slim with the openai optional group:

pip/uv-add 'pydantic-ai-slim[openai]'

This is because internally, OllamaModel uses the OpenAI API.

Configuration

To use Ollama, you must first download the Ollama client, and then download a model using the Ollama model library.

You must also ensure the Ollama server is running when trying to make requests to it. For more information, please see the Ollama documentation.

For detailed setup and example, please see the Ollama setup documentation.
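
Once the server is running, usage mirrors the other providers. Here's a minimal sketch, assuming you've already pulled a model (e.g. ollama pull llama3.2) and the server is listening on its default local address:

from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel

model = OllamaModel('llama3.2')  # must match a model you've pulled locally
agent = Agent(model)
...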

Groq

Install

To use [GroqModel][pydantic_ai.models.groq.GroqModel], you need to either install pydantic-ai, or install pydantic-ai-slim with the groq optional group:

pip/uv-add 'pydantic-ai-slim[groq]'

Configuration

To use Groq through their API, go to console.groq.com/keys and follow your nose until you find the place to generate an API key.

[GroqModelName][pydantic_ai.models.groq.GroqModelName] contains a list of available Groq models.

Environment variable

Once you have the API key, you can set it as an environment variable:

export GROQ_API_KEY='your-api-key'

You can then use [GroqModel][pydantic_ai.models.groq.GroqModel] by name:

from pydantic_ai import Agent

agent = Agent('groq:llama-3.1-70b-versatile')
...

Or initialise the model directly with just the model name:

from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel

model = GroqModel('llama-3.1-70b-versatile')
agent = Agent(model)
...

api_key argument

If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key argument][pydantic_ai.models.groq.GroqModel.init]:

from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel

model = GroqModel('llama-3.1-70b-versatile', api_key='your-api-key')
agent = Agent(model)
...

Mistral

Install

To use [MistralModel][pydantic_ai.models.mistral.MistralModel], you need to either install pydantic-ai, or install pydantic-ai-slim with the mistral optional group:

pip/uv-add 'pydantic-ai-slim[mistral]'

Configuration

To use Mistral through their API, go to console.mistral.ai/api-keys/ and follow your nose until you find the place to generate an API key.

[NamedMistralModels][pydantic_ai.models.mistral.NamedMistralModels] contains a list of the most popular Mistral models.

Environment variable

Once you have the API key, you can set it as an environment variable:

export MISTRAL_API_KEY='your-api-key'

You can then use [MistralModel][pydantic_ai.models.mistral.MistralModel] by name:

from pydantic_ai import Agent

agent = Agent('mistral:mistral-large-latest')
...

Or initialise the model directly with just the model name:

from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel

model = MistralModel('mistral-small-latest')
agent = Agent(model)
...

api_key argument

If you don't want to or can't set the environment variable, you can pass it at runtime via the [api_key argument][pydantic_ai.models.mistral.MistralModel.init]:

from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel

model = MistralModel('mistral-small-latest', api_key='your-api-key')
agent = Agent(model)
...

Implementing Custom Models

To implement support for models not already supported, you will need to subclass the [Model][pydantic_ai.models.Model] abstract base class.

This in turn will require you to implement the following other abstract base classes:

  • [AgentModel][pydantic_ai.models.AgentModel]
  • [StreamTextResponse][pydantic_ai.models.StreamTextResponse]
  • [StreamStructuredResponse][pydantic_ai.models.StreamStructuredResponse]

The best place to start is to review the source code for existing implementations, e.g. OpenAIModel.

For details on when we'll accept contributions adding new models to PydanticAI, see the contributing guidelines.

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/index.md

Introduction {.hide}

--8<-- "docs/.partials/index-header.html"

PydanticAI is a Python Agent Framework designed to make it less painful to build production grade applications with Generative AI.

FastAPI revolutionized web development by offering an innovative and ergonomic design, built on the foundation of Pydantic.

Similarly, virtually every agent framework and LLM library in Python uses Pydantic, yet when we began to use LLMs in Pydantic Logfire, we couldn't find anything that gave us the same feeling.

We built PydanticAI with one simple aim: to bring that FastAPI feeling to GenAI app development.

Why use PydanticAI

:material-account-group:{ .md .middle .team-blue } Built by the Pydantic Team
Built by the team behind Pydantic (the validation layer of the OpenAI SDK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more).

:fontawesome-solid-shapes:{ .md .middle .shapes-orange } Model-agnostic
Supports OpenAI, Anthropic, Gemini, Ollama, Groq, and Mistral, and there is a simple interface to implement support for other models.

:logfire-logo:{ .md .middle } Pydantic Logfire Integration
Seamlessly integrates with Pydantic Logfire for real-time debugging, performance monitoring, and behavior tracking of your LLM-powered applications.

:material-shield-check:{ .md .middle .secure-green } Type-safe
Designed to make type checking as useful as possible for you, so it integrates well with static type checkers, like mypy and pyright.

🐍{ .md .middle } Python-centric Design
Leverages Python’s familiar control flow and agent composition to build your AI-driven projects, making it easy to apply standard Python best practices you'd use in any other (non-AI) project.

:simple-pydantic:{ .md .middle .pydantic-pink } Structured Responses
Harnesses the power of Pydantic to validate and structure model outputs, ensuring responses are consistent across runs.

:material-puzzle-plus:{ .md .middle .puzzle-purple } Dependency Injection System
Offers an optional dependency injection system to provide data and services to your agent's system prompts, tools and result validators. This is useful for testing and eval-driven iterative development.

:material-sine-wave:{ .md .middle } Streamed Responses
Provides the ability to stream LLM outputs continuously, with immediate validation, ensuring rapid and accurate results.

!!! example "In Beta" PydanticAI is in early beta, the API is still subject to change and there's a lot more to do. Feedback is very welcome!

Hello World Example

Here's a minimal example of PydanticAI:

from pydantic_ai import Agent

agent = Agent(  # (1)!
    'gemini-1.5-flash',
    system_prompt='Be concise, reply with one sentence.',  # (2)!
)

result = agent.run_sync('Where does "hello world" come from?')  # (3)!
print(result.data)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
  1. We configure the agent to use Gemini 1.5's Flash model, but you can also set the model when running the agent.
  2. Register a static system prompt using a keyword argument to the agent.
  3. Run the agent synchronously, conducting a conversation with the LLM.

(This example is complete, it can be run "as is")

The exchange should be very short: PydanticAI will send the system prompt and the user query to the LLM, and the model will return a text response.

Not very interesting yet, but we can easily add "tools", dynamic system prompts, and structured responses to build more powerful agents.

Tools & Dependency Injection Example

Here is a concise example using PydanticAI to build a support agent for a bank:

from dataclasses import dataclass

from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext

from bank_database import DatabaseConn


@dataclass
class SupportDependencies:  # (3)!
    customer_id: int
    db: DatabaseConn  # (12)!


class SupportResult(BaseModel):  # (13)!
    support_advice: str = Field(description='Advice returned to the customer')
    block_card: bool = Field(description="Whether to block the customer's card")
    risk: int = Field(description='Risk level of query', ge=0, le=10)


support_agent = Agent(  # (1)!
    'openai:gpt-4o',  # (2)!
    deps_type=SupportDependencies,
    result_type=SupportResult,  # (9)!
    system_prompt=(  # (4)!
        'You are a support agent in our bank, give the '
        'customer support and judge the risk level of their query.'
    ),
)


@support_agent.system_prompt  # (5)!
async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str:
    customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id)
    return f"The customer's name is {customer_name!r}"


@support_agent.tool  # (6)!
async def customer_balance(
    ctx: RunContext[SupportDependencies], include_pending: bool
) -> float:
    """Returns the customer's current account balance."""  # (7)!
    return await ctx.deps.db.customer_balance(
        id=ctx.deps.customer_id,
        include_pending=include_pending,
    )


...  # (11)!


async def main():
    deps = SupportDependencies(customer_id=123, db=DatabaseConn())
    result = await support_agent.run('What is my balance?', deps=deps)  # (8)!
    print(result.data)  # (10)!
    """
    support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1
    """

    result = await support_agent.run('I just lost my card!', deps=deps)
    print(result.data)
    """
    support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8
    """
  1. This agent will act as first-tier support in a bank. Agents are generic in the type of dependencies they accept and the type of result they return. In this case, the support agent has type #!python Agent[SupportDependencies, SupportResult].
  2. Here we configure the agent to use OpenAI's GPT-4o model, you can also set the model when running the agent.
  3. The SupportDependencies dataclass is used to pass data, connections, and logic into the model that will be needed when running system prompt and tool functions. PydanticAI's system of dependency injection provides a type-safe way to customise the behavior of your agents, and can be especially useful when running unit tests and evals.
  4. Static system prompts can be registered with the [system_prompt keyword argument][pydantic_ai.Agent.init] to the agent.
  5. Dynamic system prompts can be registered with the [@agent.system_prompt][pydantic_ai.Agent.system_prompt] decorator, and can make use of dependency injection. Dependencies are carried via the [RunContext][pydantic_ai.tools.RunContext] argument, which is parameterized with the deps_type from above. If the type annotation here is wrong, static type checkers will catch it.
  6. tool lets you register functions which the LLM may call while responding to a user. Again, dependencies are carried via [RunContext][pydantic_ai.tools.RunContext]; any other arguments become the tool schema passed to the LLM. Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry.
  7. The docstring of a tool is also passed to the LLM as the description of the tool. Parameter descriptions are extracted from the docstring and added to the parameter schema sent to the LLM.
  8. Run the agent asynchronously, conducting a conversation with the LLM until a final response is reached. Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve a result.
  9. The response from the agent is guaranteed to be a SupportResult; if validation fails, reflection means the agent is prompted to try again.
  10. The result will be validated with Pydantic to guarantee it is a SupportResult; since the agent is generic, it'll also be typed as a SupportResult to aid with static type checking.
  11. In a real use case, you'd add more tools and a longer system prompt to the agent to extend the context it's equipped with and support it can provide.
  12. This is a simple sketch of a database connection, used to keep the example short and readable. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers.
  13. This Pydantic model is used to constrain the structured data returned by the agent. From this simple definition, Pydantic builds the JSON Schema that tells the LLM how to return the data, and performs validation to guarantee the data is correct at the end of the run.

!!! tip "Complete bank_support.py example" The code included here is incomplete for the sake of brevity (the definition of DatabaseConn is missing); you can find the complete bank_support.py example here.

Instrumentation with Pydantic Logfire

To understand the flow of the above runs, we can watch the agent in action using Pydantic Logfire.

To do this, we need to set up logfire, and add the following to our code:

...
from bank_database import DatabaseConn

import logfire
logfire.configure()  # (1)!
logfire.instrument_asyncpg()  # (2)!
...
  1. Configure logfire; this will fail if no project is set up.
  2. In our demo, DatabaseConn uses asyncpg to connect to a PostgreSQL database, so logfire.instrument_asyncpg() is used to log the database queries.

That's enough to get the following view of your agent in action:

{{ video('9078b98c4f75d01f912a0368bbbdb97a', 25, 55) }}

See Monitoring and Performance to learn more.

Next Steps

To try PydanticAI yourself, follow the instructions in the examples.

Read the docs to learn more about building applications with PydanticAI.

Read the API Reference to understand PydanticAI's interface.

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/results.md

Results are the final values returned from running an agent. The result values are wrapped in [RunResult][pydantic_ai.result.RunResult] and [StreamedRunResult][pydantic_ai.result.StreamedRunResult] so you can access other data, like the [usage][pydantic_ai.result.Usage] of the run and the message history.

Both RunResult and StreamedRunResult are generic in the data they wrap, so typing information about the data returned by the agent is preserved.

from pydantic import BaseModel

from pydantic_ai import Agent


class CityLocation(BaseModel):
    city: str
    country: str


agent = Agent('gemini-1.5-flash', result_type=CityLocation)
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.usage())
"""
Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65, details=None)
"""

(This example is complete, it can be run "as is")

Runs end when either a plain text response is received or the model calls a tool associated with one of the structured result types. We will add limits to make sure a run doesn't go on indefinitely, see #70.

Result data {#structured-result-validation}

When the result type is str, or a union including str, plain text responses are enabled on the model, and the raw text response from the model is used as the response data.

If the result type is a union with multiple members (after removing str from the members), each member is registered as a separate tool with the model in order to reduce the complexity of the tool schemas and maximise the chances a model will respond correctly.

If the result type schema is not of type "object", the result type is wrapped in a single element object, so the schemas of all tools registered with the model are object schemas.

Structured results (like tools) use Pydantic to build the JSON schema used for the tool, and to validate the data returned by the model.

!!! note "Bring on PEP-747" Until PEP-747 "Annotating Type Forms" lands, unions are not valid as types in Python.

When creating the agent we need to `# type: ignore` the `result_type` argument, and add a type hint to tell type checkers about the type of the agent.

Here's an example of returning either text or a structured value:

from typing import Union

from pydantic import BaseModel

from pydantic_ai import Agent


class Box(BaseModel):
    width: int
    height: int
    depth: int
    units: str


agent: Agent[None, Union[Box, str]] = Agent(
    'openai:gpt-4o-mini',
    result_type=Union[Box, str],  # type: ignore
    system_prompt=(
        "Extract me the dimensions of a box, "
        "if you can't extract all data, ask the user to try again."
    ),
)

result = agent.run_sync('The box is 10x20x30')
print(result.data)
#> Please provide the units for the dimensions (e.g., cm, in, m).

result = agent.run_sync('The box is 10x20x30 cm')
print(result.data)
#> width=10 height=20 depth=30 units='cm'

(This example is complete, it can be run "as is")

Here's an example of using a union return type which registers multiple tools, and wraps non-object schemas in an object:

from typing import Union

from pydantic_ai import Agent

agent: Agent[None, Union[list[str], list[int]]] = Agent(
    'openai:gpt-4o-mini',
    result_type=Union[list[str], list[int]],  # type: ignore
    system_prompt='Extract either colors or sizes from the shapes provided.',
)

result = agent.run_sync('red square, blue circle, green triangle')
print(result.data)
#> ['red', 'blue', 'green']

result = agent.run_sync('square size 10, circle size 20, triangle size 30')
print(result.data)
#> [10, 20, 30]

(This example is complete, it can be run "as is")

Result validators functions

Some validation is inconvenient or impossible to do in Pydantic validators, in particular when the validation requires IO and is asynchronous. PydanticAI provides a way to add validation functions via the [agent.result_validator][pydantic_ai.Agent.result_validator] decorator.

Here's a simplified variant of the SQL Generation example:

from typing import Union

from fake_database import DatabaseConn, QueryError
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ModelRetry


class Success(BaseModel):
    sql_query: str


class InvalidRequest(BaseModel):
    error_message: str


Response = Union[Success, InvalidRequest]
agent: Agent[DatabaseConn, Response] = Agent(
    'gemini-1.5-flash',
    result_type=Response,  # type: ignore
    deps_type=DatabaseConn,
    system_prompt='Generate PostgreSQL flavored SQL queries based on user input.',
)


@agent.result_validator
async def validate_result(ctx: RunContext[DatabaseConn], result: Response) -> Response:
    if isinstance(result, InvalidRequest):
        return result
    try:
        await ctx.deps.execute(f'EXPLAIN {result.sql_query}')
    except QueryError as e:
        raise ModelRetry(f'Invalid query: {e}') from e
    else:
        return result


result = agent.run_sync(
    'get me users who were last active yesterday.', deps=DatabaseConn()
)
print(result.data)
#> sql_query='SELECT * FROM users WHERE last_active::date = today() - interval 1 day'

(This example is complete, it can be run "as is")

Streamed Results

There are two main challenges with streamed results:

  1. Validating structured responses before they're complete; this is achieved by "partial validation", which was recently added to Pydantic in pydantic/pydantic#10748.
  2. When receiving a response, we don't know if it's the final response without starting to stream it and peeking at the content. PydanticAI streams just enough of the response to sniff out if it's a tool call or a result, then streams the whole thing and calls tools, or returns the stream as a [StreamedRunResult][pydantic_ai.result.StreamedRunResult].

Streaming Text

Example of streamed text result:

from pydantic_ai import Agent

agent = Agent('gemini-1.5-flash')  # (1)!


async def main():
    async with agent.run_stream('Where does "hello world" come from?') as result:  # (2)!
        async for message in result.stream():  # (3)!
            print(message)
            #> The first known
            #> The first known use of "hello,
            #> The first known use of "hello, world" was in
            #> The first known use of "hello, world" was in a 1974 textbook
            #> The first known use of "hello, world" was in a 1974 textbook about the C
            #> The first known use of "hello, world" was in a 1974 textbook about the C programming language.
  1. Streaming works with the standard [Agent][pydantic_ai.Agent] class, and doesn't require any special setup, just a model that supports streaming (currently all models support streaming).
  2. The [Agent.run_stream()][pydantic_ai.Agent.run_stream] method is used to start a streamed run, this method returns a context manager so the connection can be closed when the stream completes.
  3. Each item yielded by [StreamedRunResult.stream()][pydantic_ai.result.StreamedRunResult.stream] is the complete text response, extended as new data is received.

(This example is complete, it can be run "as is")

We can also stream text as deltas rather than the entire text in each item:

from pydantic_ai import Agent

agent = Agent('gemini-1.5-flash')


async def main():
    async with agent.run_stream('Where does "hello world" come from?') as result:
        async for message in result.stream_text(delta=True):  # (1)!
            print(message)
            #> The first known
            #> use of "hello,
            #> world" was in
            #> a 1974 textbook
            #> about the C
            #> programming language.
  1. [stream_text][pydantic_ai.result.StreamedRunResult.stream_text] will error if the response is not text

(This example is complete, it can be run "as is")

!!! warning "Result message not included in messages" The final result message will NOT be added to result messages if you use .stream_text(delta=True), see Messages and chat history for more information.

Streaming Structured Responses

Not all types are supported with partial validation in Pydantic, see pydantic/pydantic#10748; generally for model-like structures it's currently best to use TypedDict.

Here's an example of streaming a user profile as it's built:

from datetime import date

from typing_extensions import TypedDict

from pydantic_ai import Agent


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


agent = Agent(
    'openai:gpt-4o',
    result_type=UserProfile,
    system_prompt='Extract a user profile from the input',
)


async def main():
    user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'
    async with agent.run_stream(user_input) as result:
        async for profile in result.stream():
            print(profile)
            #> {'name': 'Ben'}
            #> {'name': 'Ben'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}

(This example is complete, it can be run "as is")

If you want fine-grained control of validation, particularly catching validation errors, you can use the following pattern:

from datetime import date

from pydantic import ValidationError
from typing_extensions import TypedDict

from pydantic_ai import Agent


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


agent = Agent('openai:gpt-4o', result_type=UserProfile)


async def main():
    user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'
    async with agent.run_stream(user_input) as result:
        async for message, last in result.stream_structured(debounce_by=0.01):  # (1)!
            try:
                profile = await result.validate_structured_result(  # (2)!
                    message,
                    allow_partial=not last,
                )
            except ValidationError:
                continue
            print(profile)
            #> {'name': 'Ben'}
            #> {'name': 'Ben'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
  1. [stream_structured][pydantic_ai.result.StreamedRunResult.stream_structured] streams the data as [ModelResponse][pydantic_ai.messages.ModelResponse] objects, thus iteration can't fail with a ValidationError.
  2. [validate_structured_result][pydantic_ai.result.StreamedRunResult.validate_structured_result] validates the data, allow_partial=True enables pydantic's [experimental_allow_partial flag on TypeAdapter][pydantic.type_adapter.TypeAdapter.validate_json].

(This example is complete, it can be run "as is")

Examples

The following examples demonstrate how to use streamed responses in PydanticAI:

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/testing-evals.md

Testing and Evals

With PydanticAI and LLM integrations in general, there are two distinct kinds of test:

  1. Unit tests — tests of your application code, and whether it's behaving correctly
  2. Evals — tests of the LLM, and how good or bad its responses are

For the most part, these two kinds of tests have pretty separate goals and considerations.

Unit tests

Unit tests for PydanticAI code are just like unit tests for any other Python code.

Because for the most part they're nothing new, we have pretty well established tools and patterns for writing and running these kinds of tests.

Unless you're really sure you know better, you'll probably want to follow roughly this strategy:

  • Use pytest as your test harness
  • If you find yourself typing out long assertions, use inline-snapshot
  • Similarly, dirty-equals can be useful for comparing large data structures
  • Use [TestModel][pydantic_ai.models.test.TestModel] or [FunctionModel][pydantic_ai.models.function.FunctionModel] in place of your actual model to avoid the usage, latency and variability of real LLM calls
  • Use [Agent.override][pydantic_ai.agent.Agent.override] to replace your model inside your application logic
  • Set [ALLOW_MODEL_REQUESTS=False][pydantic_ai.models.ALLOW_MODEL_REQUESTS] globally to block any requests from being made to non-test models accidentally

Unit testing with TestModel

The simplest and fastest way to exercise most of your application code is to use [TestModel][pydantic_ai.models.test.TestModel]; by default, this will call all tools in the agent, then return either plain text or a structured response depending on the return type of the agent.

!!! note "TestModel is not magic" The "clever" (but not too clever) part of TestModel is that it will attempt to generate valid structured data for function tools and result types based on the schema of the registered tools.

There's no ML or AI in `TestModel`, it's just plain old procedural Python code that tries to generate data that satisfies the JSON schema of a tool.

The resulting data won't look pretty or relevant, but it should pass Pydantic's validation in most cases.
If you want something more sophisticated, use [`FunctionModel`][pydantic_ai.models.function.FunctionModel] and write your own data generation logic.

Let's write unit tests for the following application code:

import asyncio
from datetime import date

from pydantic_ai import Agent, RunContext

from fake_database import DatabaseConn  # (1)!
from weather_service import WeatherService  # (2)!

weather_agent = Agent(
    'openai:gpt-4o',
    deps_type=WeatherService,
    system_prompt='Providing a weather forecast at the locations the user provides.',
)


@weather_agent.tool
def weather_forecast(
    ctx: RunContext[WeatherService], location: str, forecast_date: date
) -> str:
    if forecast_date < date.today():  # (3)!
        return ctx.deps.get_historic_weather(location, forecast_date)
    else:
        return ctx.deps.get_forecast(location, forecast_date)


async def run_weather_forecast(  # (4)!
    user_prompts: list[tuple[str, int]], conn: DatabaseConn
):
    """Run weather forecast for a list of user prompts and save."""
    async with WeatherService() as weather_service:

        async def run_forecast(prompt: str, user_id: int):
            result = await weather_agent.run(prompt, deps=weather_service)
            await conn.store_forecast(user_id, result.data)

        # run all prompts in parallel
        await asyncio.gather(
            *(run_forecast(prompt, user_id) for (prompt, user_id) in user_prompts)
        )
  1. DatabaseConn is a class that holds a database connection
  2. WeatherService has methods to get weather forecasts and historic data about the weather
  3. We need to call a different endpoint depending on whether the date is in the past or the future; you'll see why this nuance is important below.
  4. This function is the code we want to test, together with the agent it uses

Here we have a function that takes a list of #!python (user_prompt, user_id) tuples, gets a weather forecast for each prompt, and stores the result in the database.

We want to test this code without having to mock certain objects or modify our code so we can pass test objects in.

Here's how we would write tests using [TestModel][pydantic_ai.models.test.TestModel]:

from datetime import timezone
import pytest

from dirty_equals import IsNow

from pydantic_ai import models, capture_run_messages
from pydantic_ai.models.test import TestModel
from pydantic_ai.messages import (
    ArgsDict,
    ModelResponse,
    SystemPromptPart,
    TextPart,
    ToolCallPart,
    ToolReturnPart,
    UserPromptPart,
    ModelRequest,
)

from fake_database import DatabaseConn
from weather_app import run_weather_forecast, weather_agent

pytestmark = pytest.mark.anyio  # (1)!
models.ALLOW_MODEL_REQUESTS = False  # (2)!


async def test_forecast():
    conn = DatabaseConn()
    user_id = 1
    with capture_run_messages() as messages:
        with weather_agent.override(model=TestModel()):  # (3)!
            prompt = 'What will the weather be like in London on 2024-11-28?'
            await run_weather_forecast([(prompt, user_id)], conn)  # (4)!

    forecast = await conn.get_forecast(user_id)
    assert forecast == '{"weather_forecast":"Sunny with a chance of rain"}'  # (5)!

    assert messages == [  # (6)!
        ModelRequest(
            parts=[
                SystemPromptPart(
                    content='Providing a weather forecast at the locations the user provides.',
                ),
                UserPromptPart(
                    content='What will the weather be like in London on 2024-11-28?',
                    timestamp=IsNow(tz=timezone.utc),  # (7)!
                ),
            ]
        ),
        ModelResponse(
            parts=[
                ToolCallPart(
                    tool_name='weather_forecast',
                    args=ArgsDict(
                        args_dict={
                            'location': 'a',
                            'forecast_date': '2024-01-01',  # (8)!
                        }
                    ),
                    tool_call_id=None,
                )
            ],
            timestamp=IsNow(tz=timezone.utc),
        ),
        ModelRequest(
            parts=[
                ToolReturnPart(
                    tool_name='weather_forecast',
                    content='Sunny with a chance of rain',
                    tool_call_id=None,
                    timestamp=IsNow(tz=timezone.utc),
                ),
            ],
        ),
        ModelResponse(
            parts=[
                TextPart(
                    content='{"weather_forecast":"Sunny with a chance of rain"}',
                )
            ],
            timestamp=IsNow(tz=timezone.utc),
        ),
    ]
  1. We're using anyio to run async tests.
  2. This is a safety measure to make sure we don't accidentally make real requests to the LLM while testing, see [ALLOW_MODEL_REQUESTS][pydantic_ai.models.ALLOW_MODEL_REQUESTS] for more details.
  3. We're using [Agent.override][pydantic_ai.agent.Agent.override] to replace the agent's model with [TestModel][pydantic_ai.models.test.TestModel]; the nice thing about override is that we can replace the model inside the agent without needing access to the call sites of the agent's run* methods.
  4. Now we call the function we want to test inside the override context manager.
  5. By default, TestModel will return a JSON string summarising the tool calls made and what was returned. If you wanted to customise the response to something more closely aligned with the domain, you could add [custom_result_text='Sunny'][pydantic_ai.models.test.TestModel.custom_result_text] when defining TestModel.
  6. So far we don't actually know which tools were called and with which values; we can use [capture_run_messages][pydantic_ai.capture_run_messages] to inspect messages from the most recent run and assert that the exchange between the agent and the model occurred as expected.
  7. The [IsNow][dirty_equals.IsNow] helper allows us to use declarative asserts even with data which will contain timestamps that change over time.
  8. TestModel isn't doing anything clever to extract values from the prompt, so these values are hardcoded.

Unit testing with FunctionModel

The above tests are a great start, but careful readers will notice that the WeatherService.get_forecast is never called since TestModel calls weather_forecast with a date in the past.

To fully exercise weather_forecast, we need to use [FunctionModel][pydantic_ai.models.function.FunctionModel] to customise how the tool is called.

Here's an example of using FunctionModel to test the weather_forecast tool with custom inputs:

import re

import pytest

from pydantic_ai import models
from pydantic_ai.messages import (
    ModelMessage,
    ModelResponse,
    ToolCallPart,
)
from pydantic_ai.models.function import AgentInfo, FunctionModel

from fake_database import DatabaseConn
from weather_app import run_weather_forecast, weather_agent

pytestmark = pytest.mark.anyio
models.ALLOW_MODEL_REQUESTS = False


def call_weather_forecast(  # (1)!
    messages: list[ModelMessage], info: AgentInfo
) -> ModelResponse:
    if len(messages) == 1:
        # first call, call the weather forecast tool
        user_prompt = messages[0].parts[-1]
        m = re.search(r'\d{4}-\d{2}-\d{2}', user_prompt.content)
        assert m is not None
        args = {'location': 'London', 'forecast_date': m.group()}  # (2)!
        return ModelResponse(
            parts=[ToolCallPart.from_raw_args('weather_forecast', args)]
        )
    else:
        # second call, return the forecast
        msg = messages[-1].parts[0]
        assert msg.part_kind == 'tool-return'
        return ModelResponse.from_text(f'The forecast is: {msg.content}')


async def test_forecast_future():
    conn = DatabaseConn()
    user_id = 1
    with weather_agent.override(model=FunctionModel(call_weather_forecast)):  # (3)!
        prompt = 'What will the weather be like in London on 2032-01-01?'
        await run_weather_forecast([(prompt, user_id)], conn)

    forecast = await conn.get_forecast(user_id)
    assert forecast == 'The forecast is: Rainy with a chance of sun'
  1. We define a function call_weather_forecast that will be called by FunctionModel in place of the LLM, this function has access to the list of [ModelMessage][pydantic_ai.messages.ModelMessage]s that make up the run, and [AgentInfo][pydantic_ai.models.function.AgentInfo] which contains information about the agent and the function tools and return tools.
  2. Our function is slightly intelligent in that it tries to extract a date from the prompt, but just hard codes the location.
  3. We use [FunctionModel][pydantic_ai.models.function.FunctionModel] to replace the agent's model with our custom function.

Overriding model via pytest fixtures

If you're writing lots of tests that all require the model to be overridden, you can use pytest fixtures to override the model with [TestModel][pydantic_ai.models.test.TestModel] or [FunctionModel][pydantic_ai.models.function.FunctionModel] in a reusable way.

Here's an example of a fixture that overrides the model with TestModel:

import pytest
from weather_app import weather_agent

from pydantic_ai.models.test import TestModel


@pytest.fixture
def override_weather_agent():
    with weather_agent.override(model=TestModel()):
        yield


async def test_forecast(override_weather_agent: None):
    ...
    # test code here
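
A similar fixture can swap in FunctionModel when you need custom tool-call behaviour. Here's a sketch reusing the call_weather_forecast function from the earlier example (the test_weather_app module name is hypothetical; it's just wherever that function lives):

import pytest
from weather_app import weather_agent

from pydantic_ai.models.function import FunctionModel

from test_weather_app import call_weather_forecast  # hypothetical module containing the function above


@pytest.fixture
def override_weather_agent_function_model():
    with weather_agent.override(model=FunctionModel(call_weather_forecast)):
        yield


async def test_forecast_future(override_weather_agent_function_model: None):
    ...
    # test code here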

Evals

"Evals" refers to evaluating a models performance for a specific application.

!!! danger "Warning" Unlike unit tests, evals are an emerging art/science; anyone who claims to know for sure exactly how your evals should be defined can safely be ignored.

Evals are generally more like benchmarks than unit tests: they never "pass", although they do "fail"; you care mostly about how they change over time.

Since evals need to be run against the real model, they can be slow and expensive to run, so you generally won't want to run them in CI for every commit.

Measuring performance

The hardest part of evals is measuring how well the model has performed.

In some cases (e.g. an agent to generate SQL) there are simple, easy to run tests that can be used to measure performance (e.g. is the SQL valid? Does it return the right results? Does it return just the right results?).

In other cases (e.g. an agent that gives advice on quitting smoking) it can be very hard or impossible to make quantitative measures of performance — in the smoking case you'd really need to run a double-blind trial over months, then wait 40 years and observe health outcomes to know if changes to your prompt were an improvement.

There are a few different strategies you can use to measure performance:

  • End to end, self-contained tests — like the SQL example, we can test the final result of the agent near-instantly
  • Synthetic self-contained tests — writing unit test style checks that the output is as expected, checks like #!python 'chewing gum' in response (see the sketch after this list); while these checks might seem simplistic, they can be helpful, and one nice characteristic is that it's easy to tell what's wrong when they fail
  • LLMs evaluating LLMs — using another model, or even the same model with a different prompt, to evaluate the performance of the agent (like when the class marks each other's homework because the teacher has a hangover); while the downsides and complexities of this approach are obvious, some think it can be a useful tool in the right circumstances
  • Evals in prod — measuring the end results of the agent in production, then creating a quantitative measure of performance, so you can easily measure changes over time as you change the prompt or model used, logfire can be extremely useful in this case since you can write a custom query to measure the performance of your agent
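
As an illustration of the synthetic self-contained approach, a check of this kind might look like the following sketch (quit_smoking_agent and the quit_smoking_app module are hypothetical stand-ins for an agent defined in your own application):

import pytest

from quit_smoking_app import quit_smoking_agent  # hypothetical agent from your application

pytestmark = pytest.mark.anyio


async def test_mentions_nicotine_gum():
    result = await quit_smoking_agent.run('What can I do when I crave a cigarette?')  # runs against the real model
    # simplistic keyword check; crude, but when it fails it's easy to see why
    assert 'chewing gum' in result.data.lower()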

System prompt customization

The system prompt is the developer's primary tool in controlling an agent's behavior, so it's often useful to be able to customise the system prompt and see how performance changes. This is particularly relevant when the system prompt contains a list of examples and you want to understand how changing that list affects the model's performance.

Let's assume we have the following app for running SQL generated from a user prompt (this example omits a lot of details for brevity; see the SQL gen example for more complete code):

import json
from pathlib import Path
from typing import Union

from pydantic_ai import Agent, RunContext

from fake_database import DatabaseConn


class SqlSystemPrompt:  # (1)!
    def __init__(
        self, examples: Union[list[dict[str, str]], None] = None, db: str = 'PostgreSQL'
    ):
        if examples is None:
            # if examples aren't provided, load them from file, this is the default
            with Path('examples.json').open('rb') as f:
                self.examples = json.load(f)['examples']
        else:
            self.examples = examples

        self.db = db

    def build_prompt(self) -> str:  # (2)!
        return f"""\
Given the following {self.db} table of records, your job is to
write a SQL query that suits the user's request.

Database schema:
CREATE TABLE records (
  ...
);

{''.join(self.format_example(example) for example in self.examples)}
"""

    @staticmethod
    def format_example(example: dict[str, str]) -> str:  # (3)!
        return f"""\
<example>
  <request>{example['request']}</request>
  <sql>{example['sql']}</sql>
</example>
"""


sql_agent = Agent(
    'gemini-1.5-flash',
    deps_type=SqlSystemPrompt,
)


@sql_agent.system_prompt
async def system_prompt(ctx: RunContext[SqlSystemPrompt]) -> str:
    return ctx.deps.build_prompt()


async def user_search(user_prompt: str) -> list[dict[str, str]]:
    """Search the database based on the user's prompts."""
    ...  # (4)!
    result = await sql_agent.run(user_prompt, deps=SqlSystemPrompt())
    conn = DatabaseConn()
    return await conn.execute(result.data)
  1. The SqlSystemPrompt class is used to build the system prompt, it can be customised with a list of examples and a database type. We implement this as a separate class passed as a dep to the agent so we can override both the inputs and the logic during evals via dependency injection.
  2. The build_prompt method constructs the system prompt from the examples and the database type.
  3. Some people think that LLMs are more likely to generate good responses if examples are formatted as XML as it's easy to identify the end of a string, see #93.
  4. In reality, you would have more logic here, making it impractical to run the agent independently of the wider application.

examples.json looks something like this:

request: show me error records with the tag "foobar"
response: SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags)
{
  "examples": [
    {
      "request": "Show me all records",
      "sql": "SELECT * FROM records;"
    },
    {
      "request": "Show me all records from 2021",
      "sql": "SELECT * FROM records WHERE date_trunc('year', date) = '2021-01-01';"
    },
    {
      "request": "show me error records with the tag 'foobar'",
      "sql": "SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags);"
    },
    ...
  ]
}

Now we want a way to quantify the success of the SQL generation so we can judge how changes to the agent affect its performance.

We can use [Agent.override][pydantic_ai.agent.Agent.override] to replace the system prompt with a custom one that uses a subset of examples, and then run the application code (in this case user_search). We also run the actual SQL from the examples and compare the "correct" result from the example SQL to the SQL generated by the agent. (We compare the results of running the SQL rather than the SQL itself since the SQL might be semantically equivalent but written in a different way).

To get a quantitative measure of performance, we assign points to each run as follows:

  • -100 points if the generated SQL is invalid
  • -1 point for each row returned by the agent (so returning lots of results is discouraged)
  • +5 points for each row returned by the agent that matches the expected result

We use 5-fold cross-validation to judge the performance of the agent using our existing set of examples.

import json
import statistics
from pathlib import Path
from itertools import chain

from fake_database import DatabaseConn, QueryError
from sql_app import sql_agent, SqlSystemPrompt, user_search


async def main():
    with Path('examples.json').open('rb') as f:
        examples = json.load(f)['examples']

    # split examples into 5 folds
    fold_size = len(examples) // 5
    folds = [examples[i : i + fold_size] for i in range(0, len(examples), fold_size)]
    conn = DatabaseConn()
    scores = []

    for i, fold in enumerate(folds, start=1):
        fold_score = 0
        # build all other folds into a list of examples
        other_folds = list(chain(*(f for j, f in enumerate(folds, start=1) if j != i)))
        # create a new system prompt with the other fold examples
        system_prompt = SqlSystemPrompt(examples=other_folds)

        # override the system prompt with the new one
        with sql_agent.override(deps=system_prompt):
            for case in fold:
                try:
                    agent_results = await user_search(case['request'])
                except QueryError as e:
                    print(f'Fold {i} {case}: {e}')
                    fold_score -= 100
                else:
                    # get the expected results using the SQL from this case
                    expected_results = await conn.execute(case['sql'])

                    agent_ids = [r['id'] for r in agent_results]
                    # each returned value has a score of -1
                    fold_score -= len(agent_ids)
                    expected_ids = {r['id'] for r in expected_results}

                    # each returned value that matches the expected value has a score of +5
                    fold_score += 5 * len(set(agent_ids) & expected_ids)

        scores.append(fold_score)

    overall_score = statistics.mean(scores)
    print(f'Overall score: {overall_score:0.2f}')
    #> Overall score: 12.00

We can then change the prompt, the model, or the examples and see how the score changes over time.

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/agents.md

Introduction

Agents are PydanticAI's primary interface for interacting with LLMs.

In some use cases a single Agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows.

The [Agent][pydantic_ai.Agent] class has full API documentation, but conceptually you can think of an agent as a container for:

  • System prompt(s): A set of instructions for the LLM written by the developer.
  • Function tool(s): Functions that the LLM may call to get information while generating a response.
  • Structured result type: The structured datatype the LLM must return at the end of a run, if specified.
  • Dependency type constraint: System prompt functions, tools, and result validators may all use dependencies when they're run.
  • LLM model: Optional default LLM model associated with the agent. Can also be specified when running the agent.
  • Model settings: Optional default model settings to help fine tune requests. Can also be specified when running the agent.

In typing terms, agents are generic in their dependency and result types, e.g., an agent which required dependencies of type #!python Foobar and returned results of type #!python list[str] would have type Agent[Foobar, list[str]]. In practice, you shouldn't need to care about this, it should just mean your IDE can tell you when you have the right type, and if you choose to use static type checking it should work well with PydanticAI.

Here's a toy example of an agent that simulates a roulette wheel:

from pydantic_ai import Agent, RunContext

roulette_agent = Agent(  # (1)!
    'openai:gpt-4o',
    deps_type=int,
    result_type=bool,
    system_prompt=(
        'Use the `roulette_wheel` function to see if the '
        'customer has won based on the number they provide.'
    ),
)


@roulette_agent.tool
async def roulette_wheel(ctx: RunContext[int], square: int) -> str:  # (2)!
    """check if the square is a winner"""
    return 'winner' if square == ctx.deps else 'loser'


# Run the agent
success_number = 18  # (3)!
result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number)
print(result.data)  # (4)!
#> True

result = roulette_agent.run_sync('I bet five is the winner', deps=success_number)
print(result.data)
#> False
  1. Create an agent, which expects an integer dependency and returns a boolean result. This agent will have type #!python Agent[int, bool].
  2. Define a tool that checks if the square is a winner. Here [RunContext][pydantic_ai.tools.RunContext] is parameterized with the dependency type int; if you got the dependency type wrong you'd get a typing error.
  3. In reality, you might want to use a random number here e.g. random.randint(0, 36).
  4. result.data will be a boolean indicating if the square is a winner. Pydantic performs the result validation, it'll be typed as a bool since its type is derived from the result_type generic parameter of the agent.

!!! tip "Agents are designed for reuse, like FastAPI Apps" Agents are intended to be instantiated once (frequently as module globals) and reused throughout your application, similar to a small [FastAPI][fastapi.FastAPI] app or an [APIRouter][fastapi.APIRouter].

Running Agents

There are three ways to run an agent:

  1. [agent.run()][pydantic_ai.Agent.run] — a coroutine which returns a [RunResult][pydantic_ai.result.RunResult] containing a completed response
  2. [agent.run_sync()][pydantic_ai.Agent.run_sync] — a plain, synchronous function which returns a [RunResult][pydantic_ai.result.RunResult] containing a completed response (internally, this just calls loop.run_until_complete(self.run()))
  3. [agent.run_stream()][pydantic_ai.Agent.run_stream] — a coroutine which returns a [StreamedRunResult][pydantic_ai.result.StreamedRunResult], which contains methods to stream a response as an async iterable

Here's a simple example demonstrating all three:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.data)
#> Rome


async def main():
    result = await agent.run('What is the capital of France?')
    print(result.data)
    #> Paris

    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_data())
        #> London

(This example is complete, it can be run "as is")

You can also pass messages from previous runs to continue a conversation or provide context, as described in Messages and Chat History.

Additional Configuration

Usage Limits

PydanticAI offers a [UsageLimits][pydantic_ai.usage.UsageLimits] structure to help you limit your usage (tokens and/or requests) on model runs.

You can apply these settings by passing the usage_limits argument to the run{_sync,_stream} functions.

Consider the following example, where we limit the number of response tokens:

from pydantic_ai import Agent
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

agent = Agent('claude-3-5-sonnet-latest')

result_sync = agent.run_sync(
    'What is the capital of Italy? Answer with just the city.',
    usage_limits=UsageLimits(response_tokens_limit=10),
)
print(result_sync.data)
#> Rome
print(result_sync.usage())
"""
Usage(requests=1, request_tokens=62, response_tokens=1, total_tokens=63, details=None)
"""

try:
    result_sync = agent.run_sync(
        'What is the capital of Italy? Answer with a paragraph.',
        usage_limits=UsageLimits(response_tokens_limit=10),
    )
except UsageLimitExceeded as e:
    print(e)
    #> Exceeded the response_tokens_limit of 10 (response_tokens=32)

Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling:

from typing_extensions import TypedDict

from pydantic_ai import Agent, ModelRetry
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits


class NeverResultType(TypedDict):
    """
    Never ever coerce data to this type.
    """

    never_use_this: str


agent = Agent(
    'claude-3-5-sonnet-latest',
    result_type=NeverResultType,
    system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.',
)


@agent.tool_plain(retries=5)  # (1)!
def infinite_retry_tool() -> int:
    raise ModelRetry('Please try again.')


try:
    result_sync = agent.run_sync(
        'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3)  # (2)!
    )
except UsageLimitExceeded as e:
    print(e)
    #> The next request would exceed the request_limit of 3
  1. This tool has the ability to retry 5 times before erroring, simulating a tool that might get stuck in a loop.
  2. This run will error after 3 requests, preventing the infinite tool calling.

!!! note This is especially relevant if you've registered a lot of tools; request_limit can be used to prevent the model from choosing to make too many of these calls.

Model (Run) Settings

PydanticAI offers a [settings.ModelSettings][pydantic_ai.settings.ModelSettings] structure to help you fine tune your requests. This structure allows you to configure common parameters that influence the model's behavior, such as temperature, max_tokens, timeout, and more.

There are two ways to apply these settings:

  1. Passing to run{_sync,_stream} functions via the model_settings argument. This allows for fine-tuning on a per-request basis.
  2. Setting during [Agent][pydantic_ai.agent.Agent] initialization via the model_settings argument. These settings will be applied by default to all subsequent run calls using said agent. However, model_settings provided during a specific run call will override the agent's default settings.

For example, if you'd like to set the temperature setting to 0.0 to ensure less random behavior, you can do the following:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync(
    'What is the capital of Italy?', model_settings={'temperature': 0.0}
)
print(result_sync.data)
#> Rome
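
For completeness, here's a minimal sketch combining both approaches described above: a default set at agent initialization and a per-run override (the 0.5 default used here is purely illustrative):

from pydantic_ai import Agent

# Default settings applied to every run of this agent.
agent = Agent('openai:gpt-4o', model_settings={'temperature': 0.5})

# Inherits temperature=0.5 from the agent's defaults.
result_default = agent.run_sync('What is the capital of Italy?')

# Overrides the agent's default for this run only.
result_override = agent.run_sync(
    'What is the capital of Italy?', model_settings={'temperature': 0.0}
)
print(result_override.data)
#> Rome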

Runs vs. Conversations

An agent run might represent an entire conversation — there's no limit to how many messages can be exchanged in a single run. However, a conversation might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls.

Here's an example of a conversation comprised of multiple runs:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.data)
#> Albert Einstein was a German-born theoretical physicist.

# Second run, passing previous messages
result2 = agent.run_sync(
    'What was his most famous equation?',
    message_history=result1.new_messages(),  # (1)!
)
print(result2.data)
#> Albert Einstein's most famous equation is (E = mc^2).
  1. Continue the conversation; without message_history the model would not know who "his" was referring to.

(This example is complete, it can be run "as is")

Type safe by design {#static-type-checking}

PydanticAI is designed to work well with static type checkers, like mypy and pyright.

!!! tip "Typing is (somewhat) optional" PydanticAI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time.

That said, because PydanticAI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the `result_type` arguments to [`Agent`][pydantic_ai.Agent]) are used at runtime.

We (the library developers) have messed up if type hints are confusing you more than helping you. If you find this is the case, please create an [issue](https://github.com/pydantic/pydantic-ai/issues) explaining what's annoying you!

In particular, agents are generic in both the type of their dependencies and the type of results they return, so you can use the type hints to ensure you're using the right types.

Consider the following script with type mistakes:

from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class User:
    name: str


agent = Agent(
    'test',
    deps_type=User,  # (1)!
    result_type=bool,
)


@agent.system_prompt
def add_user_name(ctx: RunContext[str]) -> str:  # (2)!
    return f"The user's name is {ctx.deps}."


def foobar(x: bytes) -> None:
    pass


result = agent.run_sync('Does their name start with "A"?', deps=User('Anne'))
foobar(result.data)  # (3)!
  1. The agent is defined as expecting an instance of User as deps.
  2. But here add_user_name is defined as taking a str as the dependency, not a User.
  3. Since the agent is defined as returning a bool, this will raise a type error since foobar expects bytes.

Running mypy on this will give the following output:

➤ uv run mypy type_mistakes.py
type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]"  [arg-type]
type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes"  [arg-type]
Found 2 errors in 1 file (checked 1 source file)

Running pyright would identify the same issues.

System Prompts

System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want.

Generally, system prompts fall into two categories:

  1. Static system prompts: These are known when writing the code and can be defined via the system_prompt parameter of the [Agent constructor][pydantic_ai.Agent.__init__].
  2. Dynamic system prompts: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with [@agent.system_prompt][pydantic_ai.Agent.system_prompt].

You can add both to a single agent; they're appended in the order they're defined at runtime.

Here's an example using both types of system prompts:

from datetime import date

from pydantic_ai import Agent, RunContext

agent = Agent(
    'openai:gpt-4o',
    deps_type=str,  # (1)!
    system_prompt="Use the customer's name while replying to them.",  # (2)!
)


@agent.system_prompt  # (3)!
def add_the_users_name(ctx: RunContext[str]) -> str:
    return f"The user's name is {ctx.deps}."


@agent.system_prompt
def add_the_date() -> str:  # (4)!
    return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.data)
#> Hello Frank, the date today is 2032-01-02.
  1. The agent expects a string dependency.
  2. Static system prompt defined at agent creation time.
  3. Dynamic system prompt defined via a decorator with [RunContext][pydantic_ai.tools.RunContext]; this is called just after run_sync, not when the agent is created, so it can benefit from runtime information like the dependencies used on that run.
  4. Another dynamic system prompt; system prompts don't have to have the RunContext parameter.

(This example is complete, it can be run "as is")

Reflection and self-correction

Validation errors from both function tool parameter validation and structured result validation can be passed back to the model with a request to retry.

You can also raise [ModelRetry][pydantic_ai.exceptions.ModelRetry] from within a tool or result validator function to tell the model it should retry generating a response.

  • The default retry count is 1, but can be altered for the [entire agent][pydantic_ai.Agent.__init__], a [specific tool][pydantic_ai.Agent.tool], or a [result validator][pydantic_ai.Agent.__init__].
  • You can access the current retry count from within a tool or result validator via [ctx.retry][pydantic_ai.tools.RunContext].

Here's an example:

from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ModelRetry

from fake_database import DatabaseConn


class ChatResult(BaseModel):
    user_id: int
    message: str


agent = Agent(
    'openai:gpt-4o',
    deps_type=DatabaseConn,
    result_type=ChatResult,
)


@agent.tool(retries=2)
def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int:
    """Get a user's ID from their full name."""
    print(name)
    #> John
    #> John Doe
    user_id = ctx.deps.users.get(name=name)
    if user_id is None:
        raise ModelRetry(
            f'No user found with name {name!r}, remember to provide their full name'
        )
    return user_id


result = agent.run_sync(
    'Send a message to John Doe asking for coffee next week', deps=DatabaseConn()
)
print(result.data)
"""
user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!'
"""

Model errors

If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns 503), agent runs will raise [UnexpectedModelBehavior][pydantic_ai.exceptions.UnexpectedModelBehavior].

In these cases, [capture_run_messages][pydantic_ai.capture_run_messages] can be used to access the messages exchanged during the run to help diagnose the issue.

from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages

agent = Agent('openai:gpt-4o')


@agent.tool_plain
def calc_volume(size: int) -> int:  # (1)!
    if size == 42:
        return size**3
    else:
        raise ModelRetry('Please try again.')


with capture_run_messages() as messages:  # (2)!
    try:
        result = agent.run_sync('Please get me the volume of a box with size 6.')
    except UnexpectedModelBehavior as e:
        print('An error occurred:', e)
        #> An error occurred: Tool exceeded max retries count of 1
        print('cause:', repr(e.__cause__))
        #> cause: ModelRetry('Please try again.')
        print('messages:', messages)
        """
        messages:
        [
            ModelRequest(
                parts=[
                    UserPromptPart(
                        content='Please get me the volume of a box with size 6.',
                        timestamp=datetime.datetime(...),
                        part_kind='user-prompt',
                    )
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    ToolCallPart(
                        tool_name='calc_volume',
                        args=ArgsDict(args_dict={'size': 6}),
                        tool_call_id=None,
                        part_kind='tool-call',
                    )
                ],
                timestamp=datetime.datetime(...),
                kind='response',
            ),
            ModelRequest(
                parts=[
                    RetryPromptPart(
                        content='Please try again.',
                        tool_name='calc_volume',
                        tool_call_id=None,
                        timestamp=datetime.datetime(...),
                        part_kind='retry-prompt',
                    )
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    ToolCallPart(
                        tool_name='calc_volume',
                        args=ArgsDict(args_dict={'size': 6}),
                        tool_call_id=None,
                        part_kind='tool-call',
                    )
                ],
                timestamp=datetime.datetime(...),
                kind='response',
            ),
        ]
        """
    else:
        print(result.data)
  1. Define a tool that will raise ModelRetry repeatedly in this case.
  2. [capture_run_messages][pydantic_ai.capture_run_messages] is used to capture the messages exchanged during the run.

(This example is complete, it can be run "as is")

!!! note If you call [run][pydantic_ai.Agent.run], [run_sync][pydantic_ai.Agent.run_sync], or [run_stream][pydantic_ai.Agent.run_stream] more than once within a single capture_run_messages context, messages will represent the messages exchanged during the first call only.

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/troubleshooting.md

Troubleshooting

Below are suggestions on how to fix some common errors you might encounter while using PydanticAI. If the issue you're experiencing is not listed below or addressed in the documentation, please feel free to ask in the Pydantic Slack or create an issue on GitHub.

Jupyter Notebook Errors

RuntimeError: This event loop is already running

This error is caused by conflicts between the Jupyter notebook's event loop and PydanticAI's. One way to manage these conflicts is by using nest-asyncio. Namely, before you execute any agent runs, do the following:

import nest_asyncio

nest_asyncio.apply()

Note: This fix also applies to Google Colab.

API Key Configuration

UserError: API key must be provided or set in the [MODEL]_API_KEY environment variable

If you're running into issues with setting the API key for your model, visit the Models page to learn more about how to set an environment variable and/or pass in an api_key argument.
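
As a hedged illustration of the second option, model classes generally accept the api_key argument directly; for OpenAI, a minimal sketch looks like this (check the Models page for the exact parameter on other model classes):

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

# Pass the key explicitly instead of relying on the OPENAI_API_KEY environment variable.
model = OpenAIModel('gpt-4o', api_key='your-api-key')
agent = Agent(model)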

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/message-history.md

Messages and chat history

PydanticAI provides access to messages exchanged during an agent run. These messages can be used both to continue a coherent conversation, and to understand how an agent performed.

Accessing Messages from Results

After running an agent, you can access the messages exchanged during that run from the result object.

Both [RunResult][pydantic_ai.result.RunResult] (returned by [Agent.run][pydantic_ai.Agent.run], [Agent.run_sync][pydantic_ai.Agent.run_sync]) and [StreamedRunResult][pydantic_ai.result.StreamedRunResult] (returned by [Agent.run_stream][pydantic_ai.Agent.run_stream]) have the following methods:

  • [all_messages()][pydantic_ai.result.RunResult.all_messages]: returns all messages, including messages from prior runs. There's also a variant that returns JSON bytes, [all_messages_json()][pydantic_ai.result.RunResult.all_messages_json].
  • [new_messages()][pydantic_ai.result.RunResult.new_messages]: returns only the messages from the current run. There's also a variant that returns JSON bytes, [new_messages_json()][pydantic_ai.result.RunResult.new_messages_json].

!!! info "StreamedRunResult and complete messages" On [StreamedRunResult][pydantic_ai.result.StreamedRunResult], the messages returned from these methods will only include the final result message once the stream has finished.

E.g. you've awaited one of the following coroutines:

* [`StreamedRunResult.stream()`][pydantic_ai.result.StreamedRunResult.stream]
* [`StreamedRunResult.stream_text()`][pydantic_ai.result.StreamedRunResult.stream_text]
* [`StreamedRunResult.stream_structured()`][pydantic_ai.result.StreamedRunResult.stream_structured]
* [`StreamedRunResult.get_data()`][pydantic_ai.result.StreamedRunResult.get_data]

**Note:** The final result message will NOT be added to result messages if you use [`.stream_text(delta=True)`][pydantic_ai.result.StreamedRunResult.stream_text] since in this case the result content is never built as one string.

Example of accessing methods on a [RunResult][pydantic_ai.result.RunResult]:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result = agent.run_sync('Tell me a joke.')
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.

# all messages from the run
print(result.all_messages())
"""
[
    ModelRequest(
        parts=[
            SystemPromptPart(
                content='Be a helpful assistant.', part_kind='system-prompt'
            ),
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            ),
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.',
                part_kind='text',
            )
        ],
        timestamp=datetime.datetime(...),
        kind='response',
    ),
]
"""

(This example is complete, it can be run "as is")
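
If you need a serialized form, the JSON-bytes variants mentioned above work the same way; here's a minimal sketch:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result = agent.run_sync('Tell me a joke.')

# Same messages as all_messages(), serialized to JSON bytes.
messages_json = result.all_messages_json()
print(messages_json[:30])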

Example of accessing methods on a [StreamedRunResult][pydantic_ai.result.StreamedRunResult]:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')


async def main():
    async with agent.run_stream('Tell me a joke.') as result:
        # incomplete messages before the stream finishes
        print(result.all_messages())
        """
        [
            ModelRequest(
                parts=[
                    SystemPromptPart(
                        content='Be a helpful assistant.', part_kind='system-prompt'
                    ),
                    UserPromptPart(
                        content='Tell me a joke.',
                        timestamp=datetime.datetime(...),
                        part_kind='user-prompt',
                    ),
                ],
                kind='request',
            )
        ]
        """

        async for text in result.stream():
            print(text)
            #> Did you hear
            #> Did you hear about the toothpaste
            #> Did you hear about the toothpaste scandal? They called
            #> Did you hear about the toothpaste scandal? They called it Colgate.

        # complete messages once the stream finishes
        print(result.all_messages())
        """
        [
            ModelRequest(
                parts=[
                    SystemPromptPart(
                        content='Be a helpful assistant.', part_kind='system-prompt'
                    ),
                    UserPromptPart(
                        content='Tell me a joke.',
                        timestamp=datetime.datetime(...),
                        part_kind='user-prompt',
                    ),
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    TextPart(
                        content='Did you hear about the toothpaste scandal? They called it Colgate.',
                        part_kind='text',
                    )
                ],
                timestamp=datetime.datetime(...),
                kind='response',
            ),
        ]
        """

(This example is complete, it can be run "as is")
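
The example above uses stream(); if you only need incremental text, a minimal sketch using the stream_text(delta=True) method mentioned earlier might look like this (remember that in delta mode the final result message is not added to the message history):

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')


async def main():
    async with agent.run_stream('Tell me a joke.') as result:
        # Each item is only the new text generated since the previous item.
        async for delta in result.stream_text(delta=True):
            print(delta, end='')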

Using Messages as Input for Further Agent Runs

The primary use of message histories in PydanticAI is to maintain context across multiple agent runs.

To use existing messages in a run, pass them to the message_history parameter of [Agent.run][pydantic_ai.Agent.run], [Agent.run_sync][pydantic_ai.Agent.run_sync] or [Agent.run_stream][pydantic_ai.Agent.run_stream].

If message_history is set and not empty, a new system prompt is not generated — we assume the existing message history includes a system prompt.

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
print(result1.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.

result2 = agent.run_sync('Explain?', message_history=result1.new_messages())
print(result2.data)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.

print(result2.all_messages())
"""
[
    ModelRequest(
        parts=[
            SystemPromptPart(
                content='Be a helpful assistant.', part_kind='system-prompt'
            ),
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            ),
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.',
                part_kind='text',
            )
        ],
        timestamp=datetime.datetime(...),
        kind='response',
    ),
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Explain?',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            )
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.',
                part_kind='text',
            )
        ],
        timestamp=datetime.datetime(...),
        kind='response',
    ),
]
"""

(This example is complete, it can be run "as is")

Other ways of using messages

Since messages are defined by simple dataclasses, you can manually create and manipulate them, e.g. for testing.

The message format is independent of the model used, so you can use messages in different agents, or the same agent with different models.

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
print(result1.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.

result2 = agent.run_sync(
    'Explain?', model='gemini-1.5-pro', message_history=result1.new_messages()
)
print(result2.data)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.

print(result2.all_messages())
"""
[
    ModelRequest(
        parts=[
            SystemPromptPart(
                content='Be a helpful assistant.', part_kind='system-prompt'
            ),
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            ),
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.',
                part_kind='text',
            )
        ],
        timestamp=datetime.datetime(...),
        kind='response',
    ),
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Explain?',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            )
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.',
                part_kind='text',
            )
        ],
        timestamp=datetime.datetime(...),
        kind='response',
    ),
]
"""

Examples

For a more complete example of using messages in conversations, see the chat app example.

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/logfire.md

Debugging and Monitoring

Applications that use LLMs have some challenges that are well known and understood: LLMs are slow, unreliable and expensive.

These applications also have some challenges that most developers have encountered much less often: LLMs are fickle and non-deterministic. Subtle changes in a prompt can completely change a model's performance, and there's no EXPLAIN query you can run to understand why.

!!! danger "Warning" From a software engineer's point of view, you can think of LLMs as the worst database you've ever heard of, but worse.

If LLMs weren't so bloody useful, we'd never touch them.

To build successful applications with LLMs, we need new tools to understand both model performance, and the behavior of applications that rely on them.

LLM Observability tools that just let you understand how your model is performing are useless: making API calls to an LLM is easy, it's building that into an application that's hard.

Pydantic Logfire

Pydantic Logfire is an observability platform developed by the team who created and maintain Pydantic and PydanticAI. Logfire aims to let you understand your entire application: Gen AI, classic predictive AI, HTTP traffic, database queries and everything else a modern application needs.

!!! tip "Pydantic Logfire is a commercial product" Logfire is a commercially supported, hosted platform with an extremely generous and perpetual free tier. You can sign up and start using Logfire in a couple of minutes.

PydanticAI has built-in (but optional) support for Logfire via the logfire-api no-op package.

That means if the logfire package is installed and configured, detailed information about agent runs is sent to Logfire. But if the logfire package is not installed, there's virtually no overhead and nothing is sent.

Here's an example showing details of running the Weather Agent in Logfire:

Weather Agent Logfire

Using Logfire

To use logfire, you'll need a logfire account, and logfire installed:

pip/uv-add 'pydantic-ai[logfire]'

Then authenticate your local environment with logfire:

py-cli logfire auth

And configure a project to send data to:

py-cli logfire projects new

(Or use an existing project with logfire projects use)

The last step is to add logfire to your code:

import logfire

logfire.configure()
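
For example, here's a minimal sketch of a script that configures Logfire before running an agent; once the logfire package is installed and configured, the run below is captured automatically (the model and prompt are illustrative):

import logfire

from pydantic_ai import Agent

logfire.configure()  # configure Logfire before any agent runs so they're captured

agent = Agent('openai:gpt-4o')
result = agent.run_sync('What is the capital of France?')
print(result.data)
#> Paris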

The logfire documentation has more details on how to use logfire, including how to instrument other libraries like Pydantic, HTTPX and FastAPI.

Since Logfire is built on OpenTelemetry, you can use the Logfire Python SDK to send data to any OpenTelemetry collector.

Once you have logfire set up, there are two primary ways it can help you understand your application:

  • Debugging — Using the live view to see what's happening in your application in real-time.
  • Monitoring — Using SQL and dashboards to observe the behavior of your application; Logfire is effectively a SQL database that stores information about how your application is running.

Debugging

To demonstrate how Logfire can let you visualise the flow of a PydanticAI run, here's the view you get from Logfire while running the chat app examples:

{{ video('a764aff5840534dc77eba7d028707bfa', 25) }}

Monitoring Performance

We can also query data with SQL in Logfire to monitor the performance of an application. Here's a real world example of using Logfire to monitor PydanticAI runs inside Logfire itself:

Logfire monitoring PydanticAI

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/tools.md

pydantic_ai.tools

::: pydantic_ai.tools

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/settings.md

pydantic_ai.settings

::: pydantic_ai.settings
    options:
        inherited_members: true
        members:
            - ModelSettings
            - UsageLimits

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/format_as_xml.md

pydantic_ai.format_as_xml

::: pydantic_ai.format_as_xml

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/exceptions.md

pydantic_ai.exceptions

::: pydantic_ai.exceptions

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/messages.md

pydantic_ai.messages

The structure of [ModelMessage][pydantic_ai.messages.ModelMessage] can be shown as a graph:

graph RL
    SystemPromptPart(SystemPromptPart) --- ModelRequestPart
    UserPromptPart(UserPromptPart) --- ModelRequestPart
    ToolReturnPart(ToolReturnPart) --- ModelRequestPart
    RetryPromptPart(RetryPromptPart) --- ModelRequestPart
    TextPart(TextPart) --- ModelResponsePart
    ToolCallPart(ToolCallPart) --- ModelResponsePart
    ModelRequestPart("ModelRequestPart<br>(Union)") --- ModelRequest
    ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage
    ModelResponsePart("ModelResponsePart<br>(Union)") --- ModelResponse
    ModelResponse("ModelResponse(parts=list[...])") --- ModelMessage("ModelMessage<br>(Union)")

::: pydantic_ai.messages

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/agent.md

pydantic_ai.agent

::: pydantic_ai.agent

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/result.md

pydantic_ai.result

::: pydantic_ai.result
    options:
        inherited_members: true

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/usage.md

pydantic_ai.usage

::: pydantic_ai.usage

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/openai.md

pydantic_ai.models.openai

Setup

For details on how to set up authentication with this model, see model configuration for OpenAI.

::: pydantic_ai.models.openai

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/vertexai.md

pydantic_ai.models.vertexai

Custom interface to the *-aiplatform.googleapis.com API for Gemini models.

This model uses [GeminiAgentModel][pydantic_ai.models.gemini.GeminiAgentModel] with just the URL and auth method changed from [GeminiModel][pydantic_ai.models.gemini.GeminiModel]; it relies on the VertexAI generateContent and streamGenerateContent function endpoints having the same schemas as the equivalent [Gemini endpoints][pydantic_ai.models.gemini.GeminiModel].

Setup

For details on how to set up authentication with this model as well as a comparison with the generativelanguage.googleapis.com API used by [GeminiModel][pydantic_ai.models.gemini.GeminiModel], see model configuration for Gemini via VertexAI.

Example Usage

With the default google project already configured in your environment using "application default credentials":

from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel

model = VertexAIModel('gemini-1.5-flash')
agent = Agent(model)
result = agent.run_sync('Tell me a joke.')
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.

Or using a service account JSON file:

from pydantic_ai import Agent
from pydantic_ai.models.vertexai import VertexAIModel

model = VertexAIModel(
    'gemini-1.5-flash',
    service_account_file='path/to/service-account.json',
)
agent = Agent(model)
result = agent.run_sync('Tell me a joke.')
print(result.data)
#> Did you hear about the toothpaste scandal? They called it Colgate.

::: pydantic_ai.models.vertexai

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/gemini.md

pydantic_ai.models.gemini

Custom interface to the generativelanguage.googleapis.com API using HTTPX and Pydantic.

The Google SDK for interacting with the generativelanguage.googleapis.com API google-generativeai reads like it was written by a Java developer who thought they knew everything about OOP, spent 30 minutes trying to learn Python, gave up and decided to build the library to prove how horrible Python is. It also doesn't use httpx for HTTP requests, and tries to implement tool calling itself, but doesn't use Pydantic or equivalent for validation.

We therefore implement support for the API directly.

Despite these shortcomings, the Gemini model is actually quite powerful and very fast.

Setup

For details on how to set up authentication with this model, see model configuration for Gemini.

::: pydantic_ai.models.gemini

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/mistral.md

pydantic_ai.models.mistral

Setup

For details on how to set up authentication with this model, see model configuration for Mistral.

::: pydantic_ai.models.mistral

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/anthropic.md

pydantic_ai.models.anthropic

Setup

For details on how to set up authentication with this model, see model configuration for Anthropic.

::: pydantic_ai.models.anthropic

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/groq.md

pydantic_ai.models.groq

Setup

For details on how to set up authentication with this model, see model configuration for Groq.

::: pydantic_ai.models.groq

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/function.md

pydantic_ai.models.function

A model controlled by a local function.

[FunctionModel][pydantic_ai.models.function.FunctionModel] is similar to TestModel, but allows greater control over the model's behavior.

Its primary use case is for more advanced unit testing than is possible with TestModel.

Here's a minimal example:

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse
from pydantic_ai.models.function import FunctionModel, AgentInfo

my_agent = Agent('openai:gpt-4o')


async def model_function(
    messages: list[ModelMessage], info: AgentInfo
) -> ModelResponse:
    print(messages)
    """
    [
        ModelRequest(
            parts=[
                UserPromptPart(
                    content='Testing my agent...',
                    timestamp=datetime.datetime(...),
                    part_kind='user-prompt',
                )
            ],
            kind='request',
        )
    ]
    """
    print(info)
    """
    AgentInfo(
        function_tools=[], allow_text_result=True, result_tools=[], model_settings=None
    )
    """
    return ModelResponse.from_text('hello world')


async def test_my_agent():
    """Unit test for my_agent, to be run by pytest."""
    with my_agent.override(model=FunctionModel(model_function)):
        result = await my_agent.run('Testing my agent...')
        assert result.data == 'hello world'

See Unit testing with FunctionModel for detailed documentation.

::: pydantic_ai.models.function

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/test.md

pydantic_ai.models.test

Utility model for quickly testing apps built with PydanticAI.

Here's a minimal example:

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

my_agent = Agent('openai:gpt-4o', system_prompt='...')


async def test_my_agent():
    """Unit test for my_agent, to be run by pytest."""
    m = TestModel()
    with my_agent.override(model=m):
        result = await my_agent.run('Testing my agent...')
        assert result.data == 'success (no tool calls)'
    assert m.agent_model_function_tools == []

See Unit testing with TestModel for detailed documentation.

::: pydantic_ai.models.test

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/base.md

pydantic_ai.models

::: pydantic_ai.models
    options:
        members:
            - KnownModelName
            - Model
            - AgentModel
            - AbstractToolDefinition
            - StreamTextResponse
            - StreamStructuredResponse
            - ALLOW_MODEL_REQUESTS
            - check_allow_model_requests
            - override_allow_model_requests

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/api/models/ollama.md

pydantic_ai.models.ollama

Setup

For details on how to set up authentication with this model, see model configuration for Ollama.

Example local usage

With ollama installed, you can run the server with the model you want to use:

ollama run llama3.2

(this will pull the llama3.2 model if you don't already have it downloaded)

Then run your code, here's a minimal example:

from pydantic import BaseModel

from pydantic_ai import Agent


class CityLocation(BaseModel):
    city: str
    country: str


agent = Agent('ollama:llama3.2', result_type=CityLocation)

result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.usage())
"""
Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65, details=None)
"""

Example using a remote server

from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel

ollama_model = OllamaModel(
    model_name='qwen2.5-coder:7b',  # (1)!
    base_url='http://192.168.1.74:11434/v1',  # (2)!
)


class CityLocation(BaseModel):
    city: str
    country: str


agent = Agent(model=ollama_model, result_type=CityLocation)

result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
#> city='London' country='United Kingdom'
print(result.usage())
"""
Usage(requests=1, request_tokens=57, response_tokens=8, total_tokens=65, details=None)
"""
  1. The name of the model running on the remote server.
  2. The URL of the remote server.

See [OllamaModel][pydantic_ai.models.ollama.OllamaModel] for more information.

::: pydantic_ai.models.ollama

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/chat-app.md

Chat App with FastAPI

Simple chat app example built with FastAPI.

Demonstrates:

This demonstrates storing chat history between requests and using it to give the model context for new responses.

Most of the complex logic here is between chat_app.py which streams the response to the browser, and chat_app.ts which renders messages in the browser.

Running the Example

With dependencies installed and environment variables set, run:

python/uv-run -m pydantic_ai_examples.chat_app

Then open the app at localhost:8000.

TODO screenshot.

Example Code

Python code that runs the chat app:

#! examples/pydantic_ai_examples/chat_app.py

Simple HTML page to render the app:

#! examples/pydantic_ai_examples/chat_app.html

TypeScript to handle rendering the messages. To keep this simple (and at the risk of offending frontend developers), the TypeScript code is passed to the browser as plain text and transpiled in the browser.

#! examples/pydantic_ai_examples/chat_app.ts

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/bank-support.md

Small but complete example of using PydanticAI to build a support agent for a bank.

Demonstrates:

Running the Example

With dependencies installed and environment variables set, run:

python/uv-run -m pydantic_ai_examples.bank_support

(or PYDANTIC_AI_MODEL=gemini-1.5-flash ...)

Example Code

#! examples/pydantic_ai_examples/bank_support.py

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/weather-agent.md

Example of PydanticAI with multiple tools which the LLM needs to call in turn to answer a question.

Demonstrates:

In this case the idea is a "weather" agent — the user can ask for the weather in multiple locations, the agent will use the get_lat_lng tool to get the latitude and longitude of the locations, then use the get_weather tool to get the weather for those locations.

Running the Example

To run this example properly, you might want to add two extra API keys (note: if either key is missing, the code falls back to dummy data, so they're not required).

With dependencies installed and environment variables set, run:

python/uv-run -m pydantic_ai_examples.weather_agent

Example Code

#! examples/pydantic_ai_examples/weather_agent.py

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/sql-gen.md

SQL Generation

Example demonstrating how to use PydanticAI to generate SQL queries based on user input.

Demonstrates:

Running the Example

The resulting SQL is validated by running it as an EXPLAIN query on PostgreSQL. To run the example, you first need to run PostgreSQL, e.g. via Docker:

docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres

(we run postgres on port 54320 to avoid conflicts with any other postgres instances you may have running)

With dependencies installed and environment variables set, run:

python/uv-run -m pydantic_ai_examples.sql_gen

or to use a custom prompt:

python/uv-run -m pydantic_ai_examples.sql_gen "find me errors"

This model uses gemini-1.5-flash by default since Gemini is good at single shot queries of this kind.

Example Code

#! examples/pydantic_ai_examples/sql_gen.py

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/flight-booking.md

Example of a multi-agent flow where one agent delegates work to another, then hands off control to a third agent.

Demonstrates:

In this scenario, a group of agents work together to find the best flight for a user.

The control flow for this example can be summarised as follows:

graph TD
  START --> search_agent("search agent")
  search_agent --> extraction_agent("extraction agent")
  extraction_agent --> search_agent
  search_agent --> human_confirm("human confirm")
  human_confirm --> search_agent
  search_agent --> FAILED
  human_confirm --> find_seat_function("find seat function")
  find_seat_function --> human_seat_choice("human seat choice")
  human_seat_choice --> find_seat_agent("find seat agent")
  find_seat_agent --> find_seat_function
  find_seat_function --> buy_flights("buy flights")
  buy_flights --> SUCCESS

Running the Example

With dependencies installed and environment variables set, run:

python/uv-run -m pydantic_ai_examples.flight_booking

Example Code

#! examples/pydantic_ai_examples/flight_booking.py

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/stream-whales.md

Information about whales — an example of streamed structured response validation.

Demonstrates:

This script streams structured responses from GPT-4 about whales, validates the data and displays it as a dynamic table using rich as the data is received.

Running the Example

With dependencies installed and environment variables set, run:

python/uv-run -m pydantic_ai_examples.stream_whales

Should give an output like this:

{{ video('53dd5e7664c20ae90ed90ae42f606bf3', 25) }}

Example Code

#! examples/pydantic_ai_examples/stream_whales.py

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/index.md

Examples

Examples of how to use PydanticAI and what it can do.

Usage

These examples are distributed with pydantic-ai so you can run them either by cloning the pydantic-ai repo or by simply installing pydantic-ai from PyPI with pip or uv.

Installing required dependencies

Either way, you'll need to install extra dependencies to run some examples: just install the examples optional dependency group.

If you've installed pydantic-ai via pip/uv, you can install the extra dependencies with:

pip/uv-add 'pydantic-ai[examples]'

If you clone the repo, you should instead use uv sync --extra examples to install extra dependencies.

Setting model environment variables

These examples will need you to set up authentication with one or more of the LLMs, see the model configuration docs for details on how to do this.

TL;DR: in most cases you'll need to set one of the following environment variables:

=== "OpenAI"

```bash
export OPENAI_API_KEY=your-api-key
```

=== "Google Gemini"

```bash
export GEMINI_API_KEY=your-api-key
```

Running Examples

To run the examples (this will work whether you installed pydantic_ai, or cloned the repo), run:

python/uv-run -m pydantic_ai_examples.<example_module_name>

For example, to run the very simple pydantic_model example:

python/uv-run -m pydantic_ai_examples.pydantic_model

If you like one-liners and you're using uv, you can run a pydantic-ai example with zero setup:

OPENAI_API_KEY='your-api-key' \
  uv run --with 'pydantic-ai[examples]' \
  -m pydantic_ai_examples.pydantic_model

You'll probably want to edit examples in addition to just running them. You can copy the examples to a new directory with:

python/uv-run -m pydantic_ai_examples --copy-to examples/

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/pydantic-model.md

Pydantic Model

Simple example of using PydanticAI to construct a Pydantic model from a text input.

Demonstrates:

Running the Example

With dependencies installed and environment variables set, run:

python/uv-run -m pydantic_ai_examples.pydantic_model

This example uses openai:gpt-4o by default, but it works well with other models, e.g. you can run it with Gemini using:

PYDANTIC_AI_MODEL=gemini-1.5-pro python/uv-run -m pydantic_ai_examples.pydantic_model

(or PYDANTIC_AI_MODEL=gemini-1.5-flash ...)

Example Code

#! examples/pydantic_ai_examples/pydantic_model.py

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/rag.md

RAG

RAG search example. This demo allows you to ask questions of the logfire documentation.

Demonstrates:

This is done by creating a database containing each section of the markdown documentation, then registering the search tool with the PydanticAI agent.

Logic for extracting sections from markdown files and a JSON file with that data is available in this gist.

PostgreSQL with pgvector is used as the search database, the easiest way to download and run pgvector is using Docker:

mkdir postgres-data
docker run --rm \
  -e POSTGRES_PASSWORD=postgres \
  -p 54320:5432 \
  -v `pwd`/postgres-data:/var/lib/postgresql/data \
  pgvector/pgvector:pg17

As with the SQL gen example, we run postgres on port 54320 to avoid conflicts with any other postgres instances you may have running. We also mount the PostgreSQL data directory locally to persist the data if you need to stop and restart the container.

With that running and dependencies installed and environment variables set, we can build the search database with (WARNING: this requires the OPENAI_API_KEY env variable and will call the OpenAI embedding API around 300 times to generate embeddings for each section of the documentation):

python/uv-run -m pydantic_ai_examples.rag build

(Note building the database doesn't use PydanticAI right now, instead it uses the OpenAI SDK directly.)

You can then ask the agent a question with:

python/uv-run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?"

Example Code

#! examples/pydantic_ai_examples/rag.py

output/repo_parser/github_repos/pydantic/pydantic-ai/docs/examples/stream-markdown.md

This example shows how to stream markdown from an agent, using the rich library to highlight the output in the terminal.

It'll run the example with both OpenAI and Google Gemini models if the required environment variables are set.

Demonstrates:

Running the Example

With dependencies installed and environment variables set, run:

python/uv-run -m pydantic_ai_examples.stream_markdown

Example Code

#! examples/pydantic_ai_examples/stream_markdown.py