Rate Limits

HASP enforces rate limits at two scopes — per API key and per organization — plus a separate per-cycle spend cap. All applicable limits must pass for a request to proceed.

The two scopes

Scope	What it protects against
Per-key	A single misbehaving key — a buggy loop or a leaked credential — overwhelming the system.
Per-org bucket	Load spread across many keys overwhelming upstream capacity. The org bucket is the aggregate ceiling for the whole organization.

Per-key limits are carved out beneath the org-tier bucket, which is the hard upper bound, so a per-key cap can never exceed it. Raising a key's cap beyond what the org’s tier allows is gated by tier; it cannot be configured on the key itself.

Per-key request limits

Per-key requests-per-minute and daily request caps scale by platform tier:

Tier	Requests per minute	Daily requests
Solo	60	5,000
Professional	500	50,000
Business	2,000	500,000
Enterprise	5,000	2,000,000

Tiers not listed fall through to the default (60 RPM / 5,000 daily).
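The table and its fallthrough rule can be mirrored client-side, for example to size a local throttle. The following is a minimal sketch, not HASP's implementation: the `PER_KEY_LIMITS` mapping, the lowercase key convention, and the `per_key_limits` helper are illustrative names that do not come from the API.

```python
# Per-key limits copied from the table above.
# Assumption: tier names are matched case-insensitively; the API may differ.
PER_KEY_LIMITS = {
    "solo": (60, 5_000),
    "professional": (500, 50_000),
    "business": (2_000, 500_000),
    "enterprise": (5_000, 2_000_000),
}

# Default applied to any tier that is not listed.
DEFAULT_LIMITS = (60, 5_000)


def per_key_limits(tier: str) -> tuple[int, int]:
    """Return (requests_per_minute, daily_requests) for a platform tier."""
    return PER_KEY_LIMITS.get(tier.strip().lower(), DEFAULT_LIMITS)
```

With this sketch, `per_key_limits("Business")` yields the Business row, and an unlisted tier name yields the same figures as Solo.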

Per-org tokens-per-minute caps

The org bucket also enforces a tokens-per-minute (TPM) ceiling, set by the organization's API tier:

API tier	Tokens per minute
Developer	100,000
Growth	500,000
Scale	2,000,000
Enterprise	Unlimited

The TPM cap is the org-level throttle that per-key allocations sit beneath.

Handling `429`

When a limit is hit, the response is a 429 with a retryable error:

{
  "success": false,
  "error": {
    "code": "RATE_LIMITED",
    "type": "rate_limited",
    "message": "Rate limit exceeded. Retry after 12 seconds.",
    "retryable": true,
    "details": { "retry_after_seconds": 12 }
  }
}

The Retry-After header is set to the same value. (RFC 6585 defines the 429 status code; the Retry-After header itself is defined in RFC 9110.) To handle 429 correctly:

Read Retry-After

Wait the number of seconds in the Retry-After header (or error.details.retry_after_seconds) before retrying.

Back off if no value is present

If neither is set, use exponential backoff with a ceiling rather than retrying immediately.

Respect the bucket

Persistent 429s mean you are at the org or tier ceiling — spread load over time or move up a tier rather than retrying harder.
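The three steps above can be sketched as a small retry wrapper. This is an illustration under stated assumptions, not an official client: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, and the attempt limit, base delay, and ceiling are arbitrary example values.

```python
import time


def retry_delay(headers, body, attempt, base=1.0, ceiling=60.0):
    """Choose how many seconds to wait after a 429."""
    # Header names are compared case-insensitively.
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        # Fall back to the value carried in the error body.
        details = (body.get("error") or {}).get("details") or {}
        value = details.get("retry_after_seconds")
    if value is not None:
        try:
            return max(0.0, float(value))
        except (TypeError, ValueError):
            pass  # Unparseable value: treat it as missing.
    # No usable value: exponential backoff with a ceiling.
    return min(ceiling, base * 2 ** attempt)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, body, attempt))
    # Still rate limited: hand the last 429 back instead of retrying harder.
    return status, headers, body
```

Capping the number of attempts keeps the client from hammering a bucket that is already at its ceiling; the caller can then spread the work over time. Adding random jitter to the backoff is a common refinement when many clients share one bucket.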

Responses also carry rate-limit headers (X-RateLimit-Remaining-Requests, X-RateLimit-Reset-Requests, and token-level equivalents) so you can throttle proactively before hitting a limit.
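As one way to use these headers, a client can check the remaining-request count after each response and slow down before the limit is reached. The sketch below assumes `X-RateLimit-Remaining-Requests` carries a plain integer; the `should_pause` helper and its `min_remaining` threshold are hypothetical. The format of `X-RateLimit-Reset-Requests` is not specified here, so the sketch does not try to parse it.

```python
def should_pause(headers, min_remaining=5):
    """Return True when the remaining request budget is at or below a threshold."""
    # Header names are compared case-insensitively.
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("x-ratelimit-remaining-requests")
    if raw is None:
        return False  # Header absent: nothing to act on.
    try:
        # Assumption: the header value is a plain integer count.
        return int(raw) <= min_remaining
    except ValueError:
        return False  # Unexpected format: do not throttle on a guess.
```

A caller might check `should_pause(response_headers)` after each request and delay the next one when it returns `True`.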

Spend cap is a separate limit

A dollar spend cap is independent of RPM/TPM — it protects the bill across a billing cycle, not against burst traffic. When the cap is reached, requests return 402 AI_CREDITS_EXHAUSTED with an error.details.cycle_reset_at timestamp. Spend caps are configured in Settings → Billing.
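Because a spend-cap rejection does not clear after a short wait, it makes sense to treat it separately from a 429. The sketch below assumes the 402 body uses the same error envelope as the 429 example, that `error.code` carries `AI_CREDITS_EXHAUSTED`, and that `cycle_reset_at` is an ISO 8601 timestamp; none of these details are confirmed here, and `spend_cap_reset` is an illustrative name.

```python
from datetime import datetime


def spend_cap_reset(status, body):
    """Return the cycle reset time for a spend-cap rejection, else None."""
    error = body.get("error") or {}
    # Assumption: the 402 body mirrors the 429 error envelope.
    if status != 402 or error.get("code") != "AI_CREDITS_EXHAUSTED":
        return None
    raw = (error.get("details") or {}).get("cycle_reset_at")
    if not raw:
        return None
    # Assumption: ISO 8601 timestamp; a trailing "Z" is rewritten as a
    # UTC offset so older Python versions can parse it.
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))
```

When the helper returns a timestamp, the client should stop sending requests until that time or until the cap is raised in Settings → Billing, rather than applying the 429 retry logic.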

Full reference

For response-header details and per-org aggregate RPM figures, see the AI API rate-limits reference:

AI API Rate Limits

The full three-layer model, response headers, and the spend-cap behavior.
