Anthropic

Auto-instrument the Anthropic SDK. Every messages.create call inside an active run emits an llm_call event with model, token usage, latency, and tool use. Cost is computed server-side from the rate card.

Install / enable

Python:

pip install agentping-sdk anthropic

import agentping
import anthropic

agentping.init()
agentping.instrument_anthropic()

client = anthropic.Anthropic()

with agentping.run("support-triage", customer_id="acme-corp") as r:
    reply = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "summarise this ticket"}],
    )
    r.score("confidence", 0.93)

instrument_anthropic() patches both anthropic.Anthropic and anthropic.AsyncAnthropic. The patch is global and idempotent: call it once at module load, and repeated calls are harmless.

TypeScript:

npm install @agentping/sdk @anthropic-ai/sdk

import Anthropic from "@anthropic-ai/sdk";
import * as agentping from "@agentping/sdk";

agentping.init({ apiKey: process.env.AGENTPING_API_KEY });

const run = agentping.run("answer-bot");
const anthropic = agentping.instrumentAnthropic(new Anthropic(), { run });

const reply = await anthropic.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "hi" }],
});

await run.finish({ status: "success" });

instrumentAnthropic returns a wrapped client whose return types match the original. No global patching; the wrapper holds the run reference for scoping.

What's captured

Model, input tokens, output tokens, and latency on every messages.create. Prompt caching is captured separately: cache_read_input_tokens is stored as cached_input_tokens, and cache_creation_input_tokens is recorded as well, so cached and uncached spend both reconcile against the rate card. A tool-use exchange is captured as a single llm_call carrying the final usage figures. To see tool selection in the trace, also record an event when your code picks a tool:

r.event("tool_call", {"tool": "fetch_orders", "args": {"customer_id": "acme-corp"}})
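
When the model itself selects the tool, the selection arrives as tool_use content blocks on the response. The sketch below turns those blocks into payloads with the same shape as the tool_call event above. tool_call_payloads is a hypothetical helper, not part of the AgentPing SDK, and the fake response stands in for a real messages.create result so the example runs offline.

```python
from types import SimpleNamespace


def tool_call_payloads(message):
    """Build one tool_call payload per tool_use block in an Anthropic response.

    Hypothetical helper: it relies only on the `type`, `name`, and `input`
    attributes that the Anthropic SDK exposes on tool_use content blocks.
    """
    return [
        {"tool": block.name, "args": block.input}
        for block in message.content
        if getattr(block, "type", None) == "tool_use"
    ]


# Stand-in for a real response so the sketch runs without an API call.
fake_reply = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Looking up orders."),
    SimpleNamespace(type="tool_use", id="toolu_01", name="fetch_orders",
                    input={"customer_id": "acme-corp"}),
])

payloads = tool_call_payloads(fake_reply)
print(payloads)
```

Inside a run, each payload could then be emitted with r.event("tool_call", payload).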

Streaming

TypeScript: automatic. Both messages.create({ stream: true }) and messages.stream() are wrapped; the llm_call event fires once the stream drains, with final token totals.

Python: the non-streaming messages.create is auto-captured, but streaming (messages.stream() and messages.create(stream=True)) is not. Emit one event yourself after the stream completes:

import time

# Reuses agentping.init() and the client created in the setup above.
with agentping.run("answer-bot") as r:
    started = time.perf_counter()
    with client.messages.stream(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "explain in one sentence"}],
    ) as stream:
        for chunk in stream.text_stream:
            print(chunk, end="", flush=True)
        final = stream.get_final_message()

    r.event("llm_call", {
        "provider": "anthropic",
        "model": final.model,
        "input_tokens": final.usage.input_tokens,
        "output_tokens": final.usage.output_tokens,
        "latency_ms": int((time.perf_counter() - started) * 1000),
    })
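
If the streamed request uses prompt caching, the manual event can carry the cache counters as well. The sketch below is one way to build the payload. It assumes the llm_call event accepts cached_input_tokens and cache_creation_input_tokens as keys, which follows the names under What's captured but is not confirmed for hand-emitted events. llm_call_payload is a hypothetical helper name.

```python
import time
from types import SimpleNamespace


def llm_call_payload(final, started):
    """Build an llm_call payload from a final Anthropic message.

    `started` is a time.perf_counter() reading taken before the request.
    The two cache keys are assumed event field names, not confirmed ones.
    """
    usage = final.usage
    return {
        "provider": "anthropic",
        "model": final.model,
        "input_tokens": usage.input_tokens,
        "output_tokens": usage.output_tokens,
        # Cache counters can be missing or None; treat both as zero.
        "cached_input_tokens": getattr(usage, "cache_read_input_tokens", None) or 0,
        "cache_creation_input_tokens": getattr(usage, "cache_creation_input_tokens", None) or 0,
        "latency_ms": int((time.perf_counter() - started) * 1000),
    }


# Stand-in for stream.get_final_message() so the sketch runs offline.
fake_final = SimpleNamespace(
    model="claude-sonnet-4-5",
    usage=SimpleNamespace(input_tokens=12, output_tokens=30,
                          cache_read_input_tokens=2048,
                          cache_creation_input_tokens=None),
)
payload = llm_call_payload(fake_final, time.perf_counter())
print(payload)
```

With a real stream, the call would be r.event("llm_call", llm_call_payload(final, started)).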

Source / notes

The rate card covers the current Claude model family. Calls to models it does not list are still recorded; their cost is null and they appear in the unpriced models report. For Claude on Bedrock, see Bedrock.
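
To illustrate why an unlisted model ends up with a null cost, here is a sketch of a rate-card lookup. The real computation runs server-side and its details are not documented here. The RATE_CARD prices are made-up placeholders, and estimate_cost is a hypothetical function.

```python
from typing import Optional

# Placeholder prices in USD per million tokens; NOT real rates.
RATE_CARD = {
    "example-model": {"input": 2.0, "output": 10.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return a cost in USD, or None when the model has no rate-card entry."""
    rates = RATE_CARD.get(model)
    if rates is None:
        return None  # unpriced model: the call is still recorded, cost stays null
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000


priced = estimate_cost("example-model", 1_000_000, 500_000)
unpriced = estimate_cost("not-on-the-card", 1_000, 1_000)
print(priced, unpriced)
```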