Budget Controls & Billing Transparency

Every HASP organization has a configurable AI credit budget. Credits are consumed on each successful inference request based on token usage. Budget controls let you cap spend, receive alerts before you hit the cap, and handle exhaustion gracefully.

Credit model

AI usage is billed in credits. The conversion rate depends on the model:

Model	Credits per 1M input tokens	Credits per 1M output tokens
claude-haiku-4-5	80	400
claude-sonnet-4-6	300	1,500
claude-opus-4	1,500	7,500

Credits are not currency. They are a normalized unit that maps to a dollar value on your invoice, and the exact rate per credit is shown in your billing dashboard.
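To make the table concrete, the following Python sketch estimates the credit cost of a single request. The rates are copied from the table above; the function name `estimate_credits` is hypothetical, and the sketch assumes cost is prorated linearly per token with no rounding or minimum charge, which this page does not confirm.

```python
# Rates copied from the table above: (input, output) credits per 1M tokens.
RATES = {
    "claude-haiku-4-5": (80, 400),
    "claude-sonnet-4-6": (300, 1_500),
    "claude-opus-4": (1_500, 7_500),
}

TOKENS_PER_UNIT = 1_000_000


def estimate_credits(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the credits consumed by one request.

    Assumption: linear proration with no rounding or minimum charge.
    The billed amount may differ if HASP rounds per request.
    """
    input_rate, output_rate = RATES[model]
    total = input_tokens * input_rate + output_tokens * output_rate
    return total / TOKENS_PER_UNIT
```

Under this assumption, a claude-haiku-4-5 request with 120 input tokens and 85 output tokens costs (120 × 80 + 85 × 400) / 1,000,000 = 0.0436 credits.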

Monthly budget cap

Each organization has a monthly credit allotment. When the allotment is exhausted, inference requests fail. On the Anthropic-compat surface, the error body is:

{
  "type": "error",
  "error": {
    "type": "permission_error",
    "message": "AI credits exhausted. Upgrade your plan or purchase overage credits.",
    "hasp_code": "BUDGET_EXCEEDED"
  }
}

The error code and HTTP status depend on the surface. The native surface returns CREDITS_EXHAUSTED with HTTP status 402. The Anthropic-compat surface maps it to BUDGET_EXCEEDED with status 429 (to match Anthropic’s rate-limit status convention).
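Because the Anthropic-compat surface reports exhaustion as 429, a client that retries every 429 would keep retrying a request that cannot succeed until the budget changes. The Python sketch below separates the two cases. The function names are hypothetical, and it assumes the code is found at `error.hasp_code` as in the example above; the native surface's body shape is not shown on this page, so a bare 402 is also treated as exhaustion.

```python
# hasp_code values named in this section.
EXHAUSTION_CODES = {"BUDGET_EXCEEDED", "CREDITS_EXHAUSTED"}


def is_budget_exhausted(status: int, body: dict) -> bool:
    """Return True when a failed response means credits have run out.

    Assumption: the code sits at body["error"]["hasp_code"], as in the
    example error above. A 402 status is treated as exhaustion even when
    the body carries no recognizable code.
    """
    error = body.get("error")
    code = error.get("hasp_code") if isinstance(error, dict) else None
    if code in EXHAUSTION_CODES:
        return True
    return status == 402


def should_retry(status: int, body: dict) -> bool:
    """Retry a 429 only when it is not a budget failure."""
    return status == 429 and not is_budget_exhausted(status, body)
```

With this split, a 429 carrying BUDGET_EXCEEDED can be surfaced to the user right away, while other 429 responses still go through the usual backoff path.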

Configuring a spend cap

Set a hard cap below your plan’s allotment via PUT /v1/usage/budget:

curl -X PUT https://api.usehasp.com/v1/usage/budget \
  -H "Authorization: Bearer hasp_key_live_<key>" \
  -H "Content-Type: application/json" \
  -d '{"spend_cap": 50000}'

Once the cap is reached, requests fail with BUDGET_EXCEEDED until the billing period resets or the cap is raised.

Spend alerts

Configure alert thresholds via the dashboard (Settings → AI Workspace → Budget Alerts). HASP sends email alerts when usage crosses 50%, 80%, and 100% of your configured cap. Webhook events for budget thresholds are on the roadmap.
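Until webhook events are available, a client can approximate the same thresholds locally from its own usage figures. The Python sketch below is one way to do that. The function name is hypothetical, and it assumes "crosses" means reaching or exceeding the threshold; HASP's own alerting logic may differ.

```python
from typing import List

# Alert thresholds listed above, as integer percentages of the cap.
ALERT_THRESHOLDS = (50, 80, 100)


def crossed_thresholds(credits_used: int, spend_cap: int) -> List[int]:
    """Return the alert thresholds that current usage has reached.

    Integer arithmetic avoids floating-point edge cases at the exact
    boundary (for example, 40,000 of 50,000 credits is exactly 80%).
    """
    if spend_cap <= 0:
        raise ValueError("spend_cap must be positive")
    return [t for t in ALERT_THRESHOLDS if credits_used * 100 >= t * spend_cap]
```

Comparing the result against the thresholds seen on the previous check gives a simple way to fire each local alert only once per billing period.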

Per-request usage in responses

Every non-streaming response includes token usage in the usage field:

{
  "usage": {
    "input_tokens": 120,
    "output_tokens": 85
  }
}

For streaming responses, usage is reported in two places: the message_delta SSE event carries usage.output_tokens, and a final UsageUpdate event follows.
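As a rough illustration, the Python sketch below pulls the output token count out of a stream of already-decoded events. The function name is hypothetical. It assumes each SSE data payload has been JSON-decoded into a dict with a `type` key, and that counts on message_delta are cumulative so the last value is the total, as in Anthropic's streaming format; neither is confirmed here for HASP. UsageUpdate events are skipped because their shape is not described on this page.

```python
from typing import Iterable, Optional


def final_output_tokens(events: Iterable[dict]) -> Optional[int]:
    """Return output_tokens from the last message_delta event that has it.

    Assumption: message_delta counts are cumulative, so the most recent
    value wins. Returns None if no message_delta carried usage.
    """
    latest = None
    for event in events:
        if event.get("type") != "message_delta":
            continue
        usage = event.get("usage") or {}
        if "output_tokens" in usage:
            latest = usage["output_tokens"]
    return latest
```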

Checking current usage

curl https://api.usehasp.com/v1/usage \
  -H "Authorization: Bearer hasp_key_live_<key>"

The response includes credits_used, credits_allotment, credits_remaining, and a per-model breakdown for the current billing period.
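The three credit fields are enough to derive a simple headroom summary. The Python sketch below assumes they appear as top-level numbers in the decoded response body, which this page does not spell out; the function name and the sample values used to exercise it are hypothetical.

```python
def summarize_usage(usage: dict) -> dict:
    """Derive headroom figures from a decoded /v1/usage response body.

    Assumption: credits_used, credits_allotment, and credits_remaining
    are top-level numeric fields. The per-model breakdown is ignored.
    """
    used = usage["credits_used"]
    allotment = usage["credits_allotment"]
    remaining = usage["credits_remaining"]
    # Guard against a zero allotment rather than dividing by zero.
    percent_used = 100 * used / allotment if allotment else 0.0
    return {
        "percent_used": percent_used,
        "remaining": remaining,
        "exhausted": remaining <= 0,
    }
```

For a body reporting 12,500 credits used out of a 50,000 allotment with 37,500 remaining, this returns 25.0 percent used and `exhausted` set to False.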
