Quotas and rate limits
Two limits per team: a per-second request rate and a per-month event allowance. Short bursts above the rate are tolerated; sustained excess is rejected.
Per-tier limits
| Tier | Sustained rate | Burst rate | Events / month |
|---|---|---|---|
| Starter | 100 req/sec | 200 req/sec | 100k |
| Team | 500 req/sec | 1,000 req/sec | 1M |
| Business | 2,000 req/sec | 5,000 req/sec | 10M |
| Enterprise | custom | custom | custom |
What counts as a request
One row in events, or one ingest delivery:
- POST /v1/runs = 1 event
- POST /v1/runs/{id}/events = 1 per item in the batch
- POST /v1/runs/{id}/finish = 1 event
- POST /v1/heartbeats = 1 event
- GET /v1/ping = 1 event
Score writes are part of the finish payload and don't count separately. A typical run with one LLM call and a clean finish is three events: insert, llm_call, finish. Read endpoints (control API) have their own, more permissive limits.
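The counting rules above can be turned into a quick estimate of how far an allowance stretches. This is a minimal sketch, not SDK code: the function names are made up for illustration, and it assumes the llm_call event of a typical run is delivered as one item through the batch endpoint.

```python
# Events charged per call, following the list above.
EVENTS_PER_CALL = {
    "POST /v1/runs": 1,
    "POST /v1/runs/{id}/finish": 1,
    "POST /v1/heartbeats": 1,
    "GET /v1/ping": 1,
}
BATCH_ENDPOINT = "POST /v1/runs/{id}/events"


def events_for_call(endpoint: str, batch_size: int = 1) -> int:
    """Return how many events one call to `endpoint` consumes."""
    if endpoint == BATCH_ENDPOINT:
        return batch_size  # one event per item in the batch
    return EVENTS_PER_CALL[endpoint]


def events_for_run(batched_events: int) -> int:
    """Events for one run: insert + batched items + finish."""
    return (
        events_for_call("POST /v1/runs")
        + events_for_call(BATCH_ENDPOINT, batched_events)
        + events_for_call("POST /v1/runs/{id}/finish")
    )


typical = events_for_run(batched_events=1)  # insert, llm_call, finish
runs_on_starter = 100_000 // typical        # whole runs inside a 100k allowance
```

Under these assumptions a typical run costs three events, so a 100k allowance covers roughly 33,000 such runs before the soft cap.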
Event cap behaviour
Three phases per calendar month, UTC:
| Range | Behaviour |
|---|---|
| Below 100% | Normal ingest, no banner, no email. |
| 100% to 150% (soft cap) | Ingest continues. A banner appears and a warning email is sent. |
| Above 150% (hard cap) | 429 Too Many Requests with Retry-After = seconds to next UTC month boundary. |
A Starter team on 100k events/month can ingest up to 150k before the door closes. Past 150k, every subsequent ingest returns 429 until the first of the month.
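The three phases reduce to two threshold comparisons. The sketch below is illustrative only; `cap_phase` and its return labels are hypothetical names, and it treats exactly 150% as still inside the soft cap, matching "up to 150k" above.

```python
def cap_phase(events_used: int, monthly_allowance: int) -> str:
    """Classify monthly usage as 'normal', 'soft_cap' or 'hard_cap'."""
    if monthly_allowance <= 0:
        raise ValueError("monthly_allowance must be positive")
    if events_used < monthly_allowance:
        return "normal"
    # Integer form of events_used > 1.5 * allowance, avoiding float rounding.
    if events_used * 2 > monthly_allowance * 3:
        return "hard_cap"
    return "soft_cap"
```

For a Starter team, `cap_phase(150_000, 100_000)` is still `"soft_cap"`; the next event tips it into `"hard_cap"`.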
Retry-After semantics
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 1209600

{
  "error": "quota_exceeded",
  "message": "Monthly event cap exceeded. Resets at 2026-06-01T00:00:00Z."
}
```
Retry-After counts down to the reset. Hit the cap at noon UTC on the 15th of a 31-day month and see Retry-After: 1425600 (16.5 days); hit it at 23:59 UTC on the 31st and see Retry-After: 60. The reset is always the start of the next UTC month.
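To sanity-check a Retry-After value on the client side, the seconds to the next UTC month boundary can be computed directly. This is a sketch using only the Python standard library; the function name is an illustrative choice, not part of any SDK.

```python
import math
from datetime import datetime, timezone


def seconds_to_next_utc_month(now: datetime) -> int:
    """Seconds from `now` until 00:00 UTC on the first of the next month."""
    if now.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        reset = datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        reset = datetime(now.year, now.month + 1, 1, tzinfo=timezone.utc)
    # Round up so a client never retries a fraction of a second early.
    return math.ceil((reset - now).total_seconds())


# Matches the sample response: 14 days before the 2026-06-01 reset.
sample = seconds_to_next_utc_month(datetime(2026, 5, 18, tzinfo=timezone.utc))
# One minute before the boundary.
last_minute = seconds_to_next_utc_month(
    datetime(2026, 5, 31, 23, 59, tzinfo=timezone.utc)
)
```

Here `sample` is 1209600 and `last_minute` is 60.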
SDKs honour Retry-After automatically. After 5 retries, the affected envelopes drop and dropped_count increments. Agent code is never blocked or slowed.
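The retry-then-drop behaviour might look roughly like the loop below. It is a sketch under stated assumptions, not the SDK's implementation: `Delivery`, `SendFn` and the one-second fallback wait are hypothetical, and "5 retries" is read as one initial attempt plus five more.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical transport: returns (status_code, retry_after_seconds or None).
SendFn = Callable[[dict], Tuple[int, Optional[int]]]

MAX_RETRIES = 5  # from the text: envelopes drop after 5 retries


class Delivery:
    def __init__(self, send: SendFn, sleep: Callable[[float], None] = time.sleep):
        self.send = send
        self.sleep = sleep
        self.dropped_count = 0

    def deliver(self, envelope: dict) -> bool:
        """Send once, retry on 429 up to MAX_RETRIES times; True if accepted."""
        for attempt in range(MAX_RETRIES + 1):
            status, retry_after = self.send(envelope)
            if 200 <= status < 300:
                return True
            if status != 429:
                raise RuntimeError(f"unexpected status {status}")
            if attempt < MAX_RETRIES:
                # Honour Retry-After when present; otherwise wait 1s (assumed).
                self.sleep(retry_after if retry_after is not None else 1.0)
        self.dropped_count += 1  # give up on this envelope
        return False
```

Since agent code is never blocked, a loop like this would presumably run on a background worker rather than on the caller's thread.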
Sustained-rate excess
When you exceed the sustained rate for long enough to drain the burst bucket, requests get 429 immediately, with no Retry-After header. Slow down. SDKs apply exponential back-off.
Distinct from the monthly 429: rate-limit 429 clears as soon as incoming volume drops below sustained, typically within seconds.
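A client that does not use an SDK could tell the two 429s apart by the presence of Retry-After and back off on its own for the rate-limit case. The sketch below assumes the header is a reliable discriminator, as the text describes; the function names, base delay and cap are illustrative values, not documented defaults.

```python
import random
from typing import Mapping, Optional


def classify_429(headers: Mapping[str, str]) -> str:
    """Monthly-cap 429s carry Retry-After; rate-limit 429s do not."""
    has_retry_after = any(k.lower() == "retry-after" for k in headers)
    return "monthly_cap" if has_retry_after else "rate_limit"


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Exponential back-off: base * 2**attempt, capped at `cap` seconds.

    Pass `rng` to spread retries with full jitter (uniform in [0, delay]).
    """
    delay = min(cap, base * (2 ** attempt))
    return rng.uniform(0, delay) if rng is not None else delay
```

Without jitter, attempts 0 to 3 wait 0.5, 1, 2 and 4 seconds. Jitter is worth enabling when many workers share one team's limit, so they do not all retry at the same instant.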
Agent count and retention
| Quota | Behaviour |
|---|---|
| Agent cap | At ceiling, you can't create new agents. Existing agents keep ingesting. Upgrade to lift. |
| Retention | Events past the tier's retention window prune on partition drop, no user action. Upgrading widens the window for events still in range; pruned events are gone. |
Resets
Monthly allowance resets at the first of each calendar month, 00:00 UTC. Hard-cap 429s clear at that moment. Banners and warning emails clear too; cross the threshold again later in the month and they reappear.
Sustained-rate enforcement is continuous; the bucket refills constantly.
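The bucket behaviour described here resembles a classic token bucket, which can also be used client-side to stay under the limit. The following is a generic sketch, not the service's enforcement code; in particular, sizing the bucket to the tier's burst figure is an assumption, since the text does not say how the bucket is sized.

```python
class TokenBucket:
    """Generic token bucket: refills at `sustained_rate` tokens per second."""

    def __init__(self, sustained_rate: float, burst_capacity: float,
                 now: float = 0.0):
        self.rate = sustained_rate
        self.capacity = burst_capacity  # assumed: sized from the burst figure
        self.tokens = burst_capacity    # start full
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the caller would see 429 here
```

In real use, `now` would come from `time.monotonic()`. With Starter-like numbers (`TokenBucket(100, 200)`), 200 back-to-back requests pass, the next one is refused, and half a second later 50 more are allowed.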
Monitoring
The billing page surfaces live usage: events this month, percentage of cap, sustained-rate utilisation in the last hour, historical usage. Soft cap triggers an email and banner; hard cap triggers 429.
When you're close to the cap
- Upgrade. One click in the dashboard, immediate effect.
- Reduce volume at source. Audit auto-instrumentation. One event per stream chunk burns through caps fast.
- Sample at the call site. Gate run.event(...) with a random check for high-volume agents. Token counts on llm_call events remain authoritative for cost; sampling logs doesn't skew Spend.
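Call-site sampling from the last bullet can be as small as one helper. This is a minimal sketch: the `Run` class is a stand-in that only records calls, and the `event(name, **fields)` signature is assumed for illustration rather than taken from the SDK.

```python
import random


class Run:
    """Stand-in for the SDK's run object; only records what would be sent."""

    def __init__(self):
        self.sent = []

    def event(self, name: str, **fields) -> None:
        self.sent.append((name, fields))


def sampled_event(run, name: str, sample_rate: float, rng=random,
                  **fields) -> bool:
    """Emit `name` with probability `sample_rate`; return whether it was sent."""
    if rng.random() < sample_rate:
        run.event(name, **fields)
        return True
    return False


run = Run()
for i in range(1000):
    # Keep roughly 1 in 10 log events for a high-volume agent.
    sampled_event(run, "log", 0.1, msg=f"chunk {i}")
```

Leave llm_call events unsampled so that token counts stay complete for cost reporting.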
Sales reaches out to teams that hit the soft cap two months running. The conversation is free.