Every Hasp organization has a configurable AI credit budget. Credits are consumed on each successful inference request based on token usage. Budget controls let you cap spend, receive alerts before you hit the cap, and handle exhaustion gracefully.

Credit model

AI usage is billed in credits. The conversion rate depends on the model:

| Model | Credits per 1M input tokens | Credits per 1M output tokens |
| --- | --- | --- |
| claude-haiku-4-5 | 80 | 400 |
| claude-sonnet-4-6 | 300 | 1,500 |
| claude-opus-4 | 1,500 | 7,500 |

Credits are not currency — they are a normalized unit that maps to a dollar value on your invoice. The exact rate per credit is shown in your billing dashboard.
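
As a rough illustration of the conversion, the sketch below estimates the credits consumed by a single request from its token counts. It assumes credits scale linearly with tokens and that no rounding or minimum charge applies; neither is confirmed here, and `estimate_credits` is a hypothetical helper, not part of any Hasp SDK.

```python
# Rates copied from the table above: credits per 1M tokens as (input, output).
RATES = {
    "claude-haiku-4-5": (80, 400),
    "claude-sonnet-4-6": (300, 1_500),
    "claude-opus-4": (1_500, 7_500),
}


def estimate_credits(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate credits for one request.

    Assumption: pricing is linear per token with no rounding or minimum charge.
    """
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # 120 input tokens and 85 output tokens on claude-sonnet-4-6.
    print(estimate_credits("claude-sonnet-4-6", 120, 85))  # 0.1635
```

Under these assumptions, output tokens cost five times as much as input tokens on every listed model, so response length tends to dominate spend.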

Monthly budget cap

Each organization has a monthly credit allotment. When credits are exhausted, inference requests fail with:
{
  "type": "error",
  "error": {
    "type": "permission_error",
    "message": "AI credits exhausted. Upgrade your plan or purchase overage credits.",
    "hasp_code": "BUDGET_EXCEEDED"
  }
}
The example above uses the Anthropic-compat shape: on that surface the code is BUDGET_EXCEEDED and the HTTP status is 429 (to match Anthropic’s rate-limit status convention). On the native surface, the error code is CREDITS_EXHAUSTED and the HTTP status is 402.
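
Because the Anthropic-compat surface reuses status 429, a client may want to tell credit exhaustion apart from an ordinary rate limit before deciding whether to retry. The following is a minimal sketch of that check. The native error body is not shown on this page, so the helper falls back to the 402 status alone for that surface; `is_credit_exhaustion` is a hypothetical name.

```python
import json

# Codes named on this page for the two API surfaces.
EXHAUSTION_CODES = {"BUDGET_EXCEEDED", "CREDITS_EXHAUSTED"}


def is_credit_exhaustion(status: int, body_text: str) -> bool:
    """Return True if a failed response looks like exhausted credits or a reached cap."""
    if status not in (402, 429):
        return False
    try:
        body = json.loads(body_text)
    except ValueError:
        # Unparseable body: only the 402 status is a usable signal.
        return status == 402
    error = body.get("error") if isinstance(body, dict) else None
    code = error.get("hasp_code") if isinstance(error, dict) else None
    if code in EXHAUSTION_CODES:
        return True
    # Assumption: the native body shape is undocumented here, so a bare 402 counts.
    # A 429 without a matching hasp_code is treated as a normal rate limit.
    return status == 402
```

Retrying with backoff is reasonable for a plain 429, but an exhaustion error will keep failing until the billing period resets or the budget changes, so surfacing it to an operator is usually the better choice.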

Configuring a spend cap

Set a hard cap below your plan’s allotment via PUT /v1/usage/budget:
curl -X PUT https://api.usehasp.com/v1/usage/budget \
  -H "Authorization: Bearer wa_live_<key>" \
  -H "Content-Type: application/json" \
  -d '{"spend_cap": 50000}'
Once the cap is reached, requests fail with BUDGET_EXCEEDED until the billing period resets or the cap is raised.
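
For scripts that manage the cap programmatically, the same request can be built with Python's standard library. This is a sketch mirroring the curl call above; the positive-integer check is an assumption about valid values rather than a documented rule, and `build_budget_request` is a hypothetical helper.

```python
import json
import urllib.request

BUDGET_URL = "https://api.usehasp.com/v1/usage/budget"


def build_budget_request(api_key: str, spend_cap: int) -> urllib.request.Request:
    """Build (but do not send) the PUT request that sets the spend cap."""
    # Assumption: the cap is a positive whole number of credits.
    if isinstance(spend_cap, bool) or not isinstance(spend_cap, int) or spend_cap <= 0:
        raise ValueError("spend_cap must be a positive integer")
    return urllib.request.Request(
        BUDGET_URL,
        data=json.dumps({"spend_cap": spend_cap}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

To send it, pass the request to `urllib.request.urlopen(request, timeout=10)` and read the response body.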

Spend alerts

Configure alert thresholds via the dashboard (Settings → AI Workspace → Budget Alerts). Hasp sends email alerts when usage crosses 50%, 80%, and 100% of your configured cap. Webhook events for budget thresholds are on the roadmap.
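
Until webhook events are available, a client that polls usage could approximate the same thresholds locally. The sketch below assumes "crosses" means usage has reached or exceeded the given percentage of the cap; `crossed_thresholds` and the `already_notified` bookkeeping are hypothetical and do not reflect how Hasp computes its email alerts.

```python
# Alert thresholds from the paragraph above, as whole percentages of the cap.
ALERT_THRESHOLDS = (50, 80, 100)


def crossed_thresholds(credits_used: int, cap: int, already_notified=frozenset()) -> list:
    """Return thresholds that usage has reached and that have not been reported yet."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    # Integer comparison avoids floating-point edge cases right at a threshold.
    return [
        pct
        for pct in ALERT_THRESHOLDS
        if credits_used * 100 >= pct * cap and pct not in already_notified
    ]


if __name__ == "__main__":
    print(crossed_thresholds(40_000, 50_000))            # [50, 80]
    print(crossed_thresholds(50_000, 50_000, {50, 80}))  # [100]
```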

Per-request usage in responses

Every non-streaming response includes token usage in the usage field:
{
  "usage": {
    "input_tokens": 120,
    "output_tokens": 85
  }
}
For streaming responses, usage appears in the message_delta SSE event with usage.output_tokens and in a final UsageUpdate event.
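
A streaming client can pick the output token count out of the event stream as it arrives. The sketch below reads `usage.output_tokens` from `message_delta` events only. It assumes standard SSE framing (`event:` and `data:` lines, with a blank line between events), one JSON `data:` line per event, and that the last reported value is the total. The payload of the final UsageUpdate event is not described here, so it is ignored.

```python
import json


def output_tokens_from_sse(lines):
    """Return the last usage.output_tokens seen in message_delta events, or None."""
    event_name = None
    last_seen = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current SSE event.
            event_name = None
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:") and event_name == "message_delta":
            payload = json.loads(line[len("data:"):].strip())
            usage = payload.get("usage") or {}
            if "output_tokens" in usage:
                last_seen = usage["output_tokens"]
    return last_seen


if __name__ == "__main__":
    # Hypothetical stream excerpt for illustration only.
    sample = [
        "event: message_start",
        'data: {"type": "message_start"}',
        "",
        "event: message_delta",
        'data: {"type": "message_delta", "usage": {"output_tokens": 85}}',
        "",
    ]
    print(output_tokens_from_sse(sample))  # 85
```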

Checking current usage

curl https://api.usehasp.com/v1/usage \
  -H "Authorization: Bearer wa_live_<key>"
The response includes credits_used, credits_allotment, credits_remaining, and a per-model breakdown for the current billing period.
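
The same check can be scripted, for example in a scheduled job that logs remaining budget. This sketch assumes the three credit fields are top-level numeric values in the JSON response; the exact response layout, including the per-model breakdown, is not shown here. `fetch_usage` and `summarize_usage` are hypothetical helpers.

```python
import json
import urllib.request

USAGE_URL = "https://api.usehasp.com/v1/usage"


def fetch_usage(api_key: str) -> dict:
    """GET /v1/usage and return the parsed JSON body."""
    request = urllib.request.Request(
        USAGE_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def summarize_usage(usage: dict) -> str:
    """Format a one-line summary from the documented credit fields."""
    used = usage["credits_used"]
    allotment = usage["credits_allotment"]
    remaining = usage["credits_remaining"]
    # Guard against a zero allotment rather than dividing by zero.
    percent = 100 * used / allotment if allotment else 0.0
    return f"{used:,} of {allotment:,} credits used ({percent:.1f}%), {remaining:,} remaining"
```

For example, `summarize_usage(fetch_usage(api_key))` produces a line suitable for a log or a chat notification.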