Pydantic AI

instrument_pydantic_ai() patches Agent.run_sync and Agent.run. Every run inside an active run emits one llm_call event with provider, model, total input/output tokens (cache hits split out), and latency. Pydantic AI's result.usage already aggregates across the whole agent loop, including tool retries, so one event per run is the right grain.

Provider and model are read automatically from agent.model.system and agent.model.model_name, so multi-provider apps (some agents on Claude, some on GPT-4o) are priced correctly with no extra wiring.

Install

pip install "agentping-sdk[pydantic-ai]"

Supports pydantic-ai >= 1.0.0.

Quickstart

import agentping
from pydantic_ai import Agent

agentping.init()
agentping.instrument_pydantic_ai()

agent = Agent(
    "openai:gpt-4o-mini",
    system_prompt="You triage customer support tickets.",
)

with agentping.run("support-triage", customer_id="acme-corp") as r:
    result = agent.run_sync("urgent: billing question on invoice 4821")
    r.score("confidence", 0.91)

instrument_pydantic_ai() is idempotent; call it once at module load. The async API is identical, use async with agentping.run(...) and await agent.run(...); both paths are patched.

Switching to another provider ("anthropic:claude-sonnet-4-5") needs no doc change: the provider and model are still read from the agent's model. Retries for structured output are already included in usage, so they surface as legitimate cost in Spend automatically.

What's captured

One llm_call event per Agent.run_sync() / Agent.run():

Field Source
provider agent.model.system
model agent.model.model_name
input_tokens / cached_input_tokens result.usage (cache reads split out)
output_tokens result.usage.output_tokens
cache_creation_input_tokens result.usage.cache_write_tokens
latency_ms wall-clock run duration

Advanced (optional)

The default emits one aggregated event per run. To pivot Spend by step (each model exchange and tool round-trip as its own event), walk the messages after the run and emit your own events alongside the automatic one:

with agentping.run("support-triage") as r:
    result = agent.run_sync("process this ticket")

    for message in result.all_messages():
        if message.kind == "response" and message.usage:
            r.event("llm_call", {
                "provider": "openai",
                "model": message.model_name,
                "input_tokens": message.usage.request_tokens,
                "output_tokens": message.usage.response_tokens,
            })

This is only useful for debugging specific tool round-trips inside a long agent loop; for accurate Spend and Pulse coverage the automatic event is enough.

Source