Anthropic
Auto-instrument the Anthropic SDK. Every messages.create call inside an active run emits an llm_call event with model, token usage, latency, and tool use. Cost is computed server-side from the rate card.
Install / enable
Python:
pip install agentping-sdk anthropic
import agentping
import anthropic
agentping.init()
agentping.instrument_anthropic()
client = anthropic.Anthropic()
with agentping.run("support-triage", customer_id="acme-corp") as r:
    reply = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "summarise this ticket"}],
    )
    r.score("confidence", 0.93)
instrument_anthropic() patches both anthropic.Anthropic and anthropic.AsyncAnthropic. The patch is global and idempotent: call it once at module load, and repeated calls are harmless.
TypeScript:
npm install @agentping/sdk @anthropic-ai/sdk
import Anthropic from "@anthropic-ai/sdk";
import * as agentping from "@agentping/sdk";
agentping.init({ apiKey: process.env.AGENTPING_API_KEY });
const run = agentping.run("answer-bot");
const anthropic = agentping.instrumentAnthropic(new Anthropic(), { run });
const reply = await anthropic.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "hi" }],
});
await run.finish({ status: "success" });
instrumentAnthropic returns a wrapped client whose return types match the original. No global patching; the wrapper holds the run reference for scoping.
What's captured
Model, input tokens, output tokens, and latency on every messages.create. Prompt caching is captured separately: cache_read_input_tokens lands as cached_input_tokens and cache_creation_input_tokens is recorded so cached and uncached spend reconcile against the rate card. Tool-use exchanges are captured as a single llm_call with the final usage figures; to see tool selection in the trace, also record an event when your code picks a tool:
r.event("tool_call", {"tool": "fetch_orders", "args": {"customer_id": "acme-corp"}})
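The field mapping described above can be sketched as a small pure function. This is an illustration only: to_llm_call_payload is a hypothetical helper, not part of the SDK, and the cache_creation_input_tokens key on the output side is an assumed name. The input-side field names are the ones the Anthropic API reports in its usage block.

```python
from typing import Any, Mapping


def to_llm_call_payload(model: str, usage: Mapping[str, Any], latency_ms: int) -> dict:
    """Map an Anthropic usage block to an llm_call-style payload (hypothetical helper)."""
    return {
        "provider": "anthropic",
        "model": model,
        "input_tokens": usage.get("input_tokens") or 0,
        "output_tokens": usage.get("output_tokens") or 0,
        # Cache reads are renamed to cached_input_tokens, as described above.
        "cached_input_tokens": usage.get("cache_read_input_tokens") or 0,
        # Assumed output key; the cache fields can be None, hence the `or 0`.
        "cache_creation_input_tokens": usage.get("cache_creation_input_tokens") or 0,
        "latency_ms": latency_ms,
    }


payload = to_llm_call_payload(
    "claude-sonnet-4-5",
    {
        "input_tokens": 12,
        "output_tokens": 40,
        "cache_read_input_tokens": 900,
        "cache_creation_input_tokens": None,
    },
    latency_ms=850,
)
print(payload)
```

Keeping cache reads, cache writes, and uncached input as separate counters is what allows each to be priced at its own rate later.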
Streaming
TypeScript: automatic. Both messages.create({ stream: true }) and messages.stream() are wrapped; the llm_call event fires once the stream drains, with final token totals.
Python: the non-streaming messages.create is auto-captured, but streaming (messages.stream() and messages.create(stream=True)) is not. Emit one event yourself after the stream completes:
import time

with agentping.run("answer-bot") as r:
    started = time.perf_counter()
    with client.messages.stream(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "explain in one sentence"}],
    ) as stream:
        for chunk in stream.text_stream:
            print(chunk, end="", flush=True)
        final = stream.get_final_message()
    # Emit the event once the stream has closed, so latency covers the full response.
    r.event("llm_call", {
        "provider": "anthropic",
        "model": final.model,
        "input_tokens": final.usage.input_tokens,
        "output_tokens": final.usage.output_tokens,
        "latency_ms": int((time.perf_counter() - started) * 1000),
    })
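The example above relies on get_final_message(), which belongs to the messages.stream() helper. With messages.create(stream=True) you iterate raw events instead, so the token counts have to be collected along the way. A minimal sketch, assuming the standard Anthropic event shapes (message_start carries the model and input tokens, message_delta carries the cumulative output tokens); track_usage is a hypothetical helper, not part of either SDK, and the events below are stand-ins so the block runs without network access.

```python
from types import SimpleNamespace as NS


def track_usage(events, totals):
    """Yield events unchanged while recording model and token counts in `totals`."""
    for event in events:
        if event.type == "message_start":
            totals["model"] = event.message.model
            totals["input_tokens"] = event.message.usage.input_tokens
        elif event.type == "message_delta":
            # Counts on message_delta are cumulative, so overwrite rather than add.
            totals["output_tokens"] = event.usage.output_tokens
        yield event


# Stand-in events for demonstration; a real run would pass the iterator
# returned by client.messages.create(..., stream=True) instead.
fake_events = [
    NS(type="message_start",
       message=NS(model="claude-sonnet-4-5", usage=NS(input_tokens=12, output_tokens=1))),
    NS(type="content_block_delta"),
    NS(type="message_delta", usage=NS(output_tokens=27)),
    NS(type="message_stop"),
]

totals = {}
for _ in track_usage(fake_events, totals):
    pass  # render text deltas here

print(totals)
```

Once the loop finishes, totals holds the model, input_tokens, and output_tokens values to pass to r.event("llm_call", ...), alongside a latency measured as in the example above.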
Source / notes
- Python: agentping.instrument_anthropic() in agentping-sdk-python
- TypeScript: instrumentAnthropic in agentping-sdk-typescript
The rate card carries the current Claude model family; unknown models still land, with cost shown as null in the unpriced models report. For Claude on Bedrock, see Bedrock.
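To make the null-cost behaviour concrete, here is a rough sketch of a rate-card lookup. Everything in it is an assumption for illustration: the server's actual formula is not documented here, RATE_CARD and estimate_cost are hypothetical names, and the rates are placeholders rather than real prices.

```python
from typing import Optional

# Placeholder rates in USD per million tokens; these are NOT real prices.
RATE_CARD = {"example-model": {"input": 1.0, "output": 2.0}}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return an estimated cost in USD, or None when the model is not on the card."""
    rates = RATE_CARD.get(model)
    if rates is None:
        # Unpriced model: the event is still kept, only the cost is missing.
        return None
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000


priced = estimate_cost("example-model", 1000, 500)
unpriced = estimate_cost("some-new-model", 1000, 500)
print(priced, unpriced)
```

Returning None instead of 0 keeps unpriced calls distinguishable from free ones, which is what lets a report list them separately.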