Quotas and rate limits

Two numbers per team: a per-second request rate (a server-protection limit) and your plan's included events per month (a published number, never a meter). Short bursts above the rate are tolerated; sustained excess is rejected.

Per-tier limits

Tier	Sustained rate	Burst rate	Included events / month
Free	100 req/sec	200 req/sec	100k
Starter	100 req/sec	200 req/sec	2M
Team	500 req/sec	1,000 req/sec	10M
Business	2,000 req/sec	5,000 req/sec	25M
Enterprise	custom	custom	custom

What counts as an event

One row in events, or one ingest delivery:

POST /v1/runs = 1 event
POST /v1/runs/{id}/events = 1 per item in the batch
POST /v1/runs/{id}/finish = 1 event
POST /v1/heartbeats = 1 event
GET /v1/ping = 1 event

Score writes are part of the finish payload and don't count separately. A typical run with one LLM call and a clean finish is three events: insert, llm_call, finish. Read endpoints (control API) have their own, more permissive limits.
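
To estimate how many events a run consumes, you can add up the per-endpoint counts listed above. The helper below is a sketch, not part of any SDK: the function name and parameters are made up for illustration.

```python
def events_for_run(batch_sizes, heartbeats=0):
    """Estimate events for one run from the per-endpoint rules above.

    batch_sizes: number of items in each POST /v1/runs/{id}/events call.
    heartbeats:  number of POST /v1/heartbeats calls made during the run.
    """
    create = 1                  # POST /v1/runs
    batched = sum(batch_sizes)  # POST /v1/runs/{id}/events: 1 per item
    finish = 1                  # POST /v1/runs/{id}/finish (scores included)
    return create + batched + finish + heartbeats


# The "typical run" from the text: one llm_call item, clean finish.
print(events_for_run([1]))  # 3
```

A run that sends two batches of 10 and 5 items plus two heartbeats comes to 19 events by the same arithmetic.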

Included events behaviour

Going over your included events never breaks ingestion and never costs extra:

Range	Behaviour
Below 80%	Normal ingest, no banner, no email.
80% to 99%	Heads-up email and dashboard banner. Ingestion unchanged.
100% and above	Persistent banner, email to the account owner. Ingestion continues.

If an account keeps exceeding its included events month after month, we reach out and find the right plan together. There is no automated cut-off and no overage charge.
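
If you want to mirror the ranges above in your own alerting, the mapping is a simple threshold check. This is a sketch with hypothetical names; it assumes anything between 99% and 100% still belongs to the heads-up range, which the table does not state explicitly.

```python
def usage_band(events_used, included_events):
    """Map month-to-date usage onto the three ranges in the table above."""
    # Integer comparison avoids floating-point edge cases at exactly 80%.
    if events_used * 100 < included_events * 80:
        return "normal"    # below 80%: no banner, no email
    if events_used < included_events:
        return "heads-up"  # 80% up to 100%: email and banner
    return "over"          # 100% and above: persistent banner


# Starter plan, 2M included events.
print(usage_band(1_500_000, 2_000_000))  # normal
print(usage_band(1_600_000, 2_000_000))  # heads-up
print(usage_band(2_100_000, 2_000_000))  # over
```

All three bands leave ingestion untouched; the function only tells you which notification to expect.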

Sustained-rate excess

When you exceed the sustained rate for long enough to drain the burst bucket, requests are rejected immediately with HTTP 429 and no Retry-After header. Slow down. The SDKs apply exponential back-off.

Unlike the monthly included-events count, which never blocks ingestion, a rate-limit 429 is temporary: it clears as soon as incoming volume drops below the sustained rate, typically within seconds.
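
If you call the API without an SDK, you need your own back-off, since there is no Retry-After value to read. The sketch below shows one common approach, exponential back-off with full jitter. It is not the SDKs' actual algorithm: `send`, `RateLimited`, and the delay parameters are all assumptions chosen for illustration.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the hypothetical send() when the server answers 429."""


def send_with_backoff(send, max_attempts=6, base=0.25, cap=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call send() and retry on RateLimited with exponential back-off.

    The delay ceiling doubles per attempt (base, 2*base, 4*base, ...) up to
    `cap` seconds; the actual wait is a random fraction of that ceiling.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            ceiling = min(cap, base * 2 ** attempt)
            sleep(ceiling * rng())
```

Because a rate-limit 429 typically clears within seconds, a small cap keeps recovery quick; the jitter stops many clients from retrying in lockstep.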

Agent count and retention

Quota	Behaviour
Agent cap	At ceiling, you can't create new agents. Existing agents keep ingesting. Upgrade to lift.
Retention	Two horizons per plan, enforced nightly; see data retention. Upgrading widens the window for data still in range; deleted data is gone.

Resets

Included-events counters reset at the first of each calendar month, 00:00 UTC. Banners and warning emails clear too; cross the threshold again later in the month and they reappear.
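
To show users when their counter resets, compute the first of the next calendar month at 00:00 UTC. A minimal sketch using only the standard library; the function name is hypothetical, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timezone


def next_reset(now):
    """Return the next included-events reset: first of next month, 00:00 UTC."""
    now = now.astimezone(timezone.utc)  # normalise any aware datetime to UTC
    if now.month == 12:
        year, month = now.year + 1, 1   # December rolls over into January
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


print(next_reset(datetime(2024, 12, 31, 23, 59, tzinfo=timezone.utc)))
# 2025-01-01 00:00:00+00:00
```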

Sustained-rate enforcement is continuous; the bucket refills constantly.
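
A generic token bucket gives a feel for how continuous refill behaves. This is an illustrative model only, not the service's implementation: treating the sustained rate as the refill rate and the burst figure as the bucket capacity is an assumption, and the class and its parameters are hypothetical.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full
        self.last = now

    def allow(self, now):
        """Spend one token at time `now` (seconds); False means reject (429)."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Assumed mapping of Starter-like numbers: refill 100/sec, capacity 200.
bucket = TokenBucket(rate=100, capacity=200)
print(sum(bucket.allow(0.0) for _ in range(300)))  # 200 accepted, rest rejected
print(sum(bucket.allow(1.0) for _ in range(300)))  # 100 accepted one second later
```

In this model a drained bucket starts accepting requests again as soon as a single token has refilled, which matches the "typically within seconds" recovery described for rate-limit rejections.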

Monitoring

The billing page surfaces live usage: events this month, percentage of your included amount, evaluations used, historical usage.

When you're close to your included events

Upgrade. One click in the dashboard, immediate effect.
Reduce volume at source. Audit auto-instrumentation. Emitting one event per stream chunk burns through included events fast.
Sample at the call site. Gate run.event(...) with a random check for high-volume agents. Token counts on llm_call events remain authoritative for cost; sampling logs doesn't skew Spend.
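
Sampling at the call site can be as small as one random check in front of run.event(...). The sketch below uses a stand-in `FakeRun` class so it runs on its own; the `sampled` helper, the event name, and the arguments passed to `event` are assumptions, not the SDK's documented signature.

```python
import random


def sampled(rate, rng=random.random):
    """Return True for roughly `rate` of calls, where rate is 0.0 to 1.0."""
    return rng() < rate


class FakeRun:
    """Stand-in for an SDK run object; it only records what would be sent."""

    def __init__(self):
        self.events = []

    def event(self, name, **fields):
        self.events.append((name, fields))


run = FakeRun()
for i in range(1000):
    # Keep about 10% of high-volume log-style events.
    if sampled(0.1):
        run.event("log", index=i)

print(len(run.events))  # roughly 100; varies from run to run
```

Apply the gate only to high-volume log-style events. Leave llm_call events unsampled so token counts, and therefore Spend, stay accurate.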

We reach out to accounts that exceed their included events two months running. The conversation is free, and nothing is ever switched off.