ai:chat scope.
Request
Body parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| message | string | Yes | The user message content. |
| model | string | No | Model ID. Defaults to claude-sonnet-4-6. See Models. |
| stream | boolean | No | Stream the response via SSE. Default true. |
| conversation_id | string | No | ULID of an existing conversation to continue. Omit to start a new conversation. |
| store | boolean | No | Persist the request/response content. Default true. Set false for stateless requests; audit events are still written regardless. |
| system | string | No | System prompt. Overrides the org default if set. |
| max_tokens | integer | No | Maximum output tokens. |
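
As a rough illustration of the parameters above, the sketch below assembles a request body and applies the documented defaults on the client. The helper name `buildChatBody` is hypothetical, and the validation rules (non-empty message, positive integer max_tokens) are assumptions rather than documented server behavior.

```javascript
// Hypothetical helper: builds a JSON-serializable body from the documented parameters.
// Defaults mirror the parameter table; the validation rules are assumptions.
function buildChatBody(params) {
  if (typeof params.message !== "string" || params.message.length === 0) {
    throw new TypeError("message is required and must be a non-empty string");
  }
  const body = {
    message: params.message,
    model: params.model ?? "claude-sonnet-4-6",
    stream: params.stream ?? true,
    store: params.store ?? true,
  };
  // Optional fields are only included when the caller provides them.
  if (params.conversation_id !== undefined) body.conversation_id = params.conversation_id;
  if (params.system !== undefined) body.system = params.system;
  if (params.max_tokens !== undefined) {
    if (!Number.isInteger(params.max_tokens) || params.max_tokens <= 0) {
      throw new TypeError("max_tokens must be a positive integer");
    }
    body.max_tokens = params.max_tokens;
  }
  return body;
}
```

Calling `buildChatBody({ message: "Hello", stream: false })` produces a body with model claude-sonnet-4-6 and store true. Sending the defaults explicitly is a design choice here; per the table, omitting them should be equivalent.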
Non-streaming response
When stream: false:
Streaming response (SSE)
When stream: true (default), the response is a standard text/event-stream. Each event follows this envelope:
Event sequence
A typical successful run emits events in this order:

1. run.created: run persisted; inference not yet started
2. run.started: upstream inference initiated
3. message.created: assistant message row created; deltas are about to begin
4. message.delta (repeated): content chunks as they arrive
5. message.completed: message finalized
6. usage.updated: final token/credit roll-up
7. run.completed: run finished

If the run fails, run.failed is emitted instead of run.completed (see Error events).
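
The ordering above can be checked on the client, for instance in an integration test. The sketch below is one way to do that; the function name `followsTypicalOrder` is hypothetical, and it assumes that message.delta may appear zero or more times and only between message.created and message.completed.

```javascript
// Documented order of a typical successful run.
const EXPECTED_ORDER = [
  "run.created",
  "run.started",
  "message.created",
  "message.delta",
  "message.completed",
  "usage.updated",
  "run.completed",
];

// Returns true when `types` (an array of event type names) matches the
// typical successful sequence. Any other event, including run.failed,
// makes the check return false.
function followsTypicalOrder(types) {
  const required = EXPECTED_ORDER.filter((t) => t !== "message.delta");
  let next = 0; // index of the next required event
  for (const type of types) {
    if (type === "message.delta") {
      // Assumption: deltas are only valid while message.completed is pending.
      if (required[next] !== "message.completed") return false;
      continue;
    }
    if (type !== required[next]) return false;
    next += 1;
  }
  return next === required.length;
}
```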
Event types
run.created
message.delta
run.completed
run.failed
Failure-shape envelope — both data.object and data.error are present:
Error events
The standalone error event fires for auth/rate-limit failures that happen before a run could be created (no resource to snapshot):
Consuming the stream (JavaScript)
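A minimal sketch of a consumer, under stated assumptions: the parser handles the standard text/event-stream framing (event: and data: fields, frames separated by a blank line), and it assumes each data payload is JSON. The names `createSseParser` and `streamChat`, the `url` argument, and the Bearer authorization header are hypothetical; substitute the real endpoint and auth scheme.

```javascript
// Incremental parser for text/event-stream. Call the returned function with
// each decoded text chunk; onEvent fires once per complete frame.
// Simplification: lone "\r" line endings are not handled.
function createSseParser(onEvent) {
  let buffer = "";
  return function feed(chunk) {
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    let end;
    while ((end = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, end);
      buffer = buffer.slice(end + 2);
      let type = "message"; // SSE default when no event: field is present
      const dataLines = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith(":")) continue; // comment or keep-alive
        const colon = line.indexOf(":");
        const field = colon === -1 ? line : line.slice(0, colon);
        let value = colon === -1 ? "" : line.slice(colon + 1);
        if (value.startsWith(" ")) value = value.slice(1);
        if (field === "event") type = value;
        else if (field === "data") dataLines.push(value);
      }
      if (dataLines.length === 0) continue; // nothing to dispatch
      // Assumption: the data payload is a JSON document.
      onEvent({ type, data: JSON.parse(dataLines.join("\n")) });
    }
  };
}

// Hypothetical wrapper: POSTs the body and feeds the response into the parser.
async function streamChat(url, apiKey, body, onEvent) {
  const res = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
    },
    body: JSON.stringify({ ...body, stream: true }),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const feed = createSseParser(onEvent);
  const decoder = new TextDecoder();
  const reader = res.body.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    feed(decoder.decode(value, { stream: true }));
  }
}
```

Keeping the parser separate from the network call lets it be tested against recorded streams, and it tolerates frames that are split across chunks.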
Cancellation
Close the SSE connection to cancel. The server detects the disconnect, emits run.cancel_requested → run.cancelled, and halts the upstream call. No additional API call is needed.
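
With fetch, one way to close the connection is an AbortController. The sketch below assumes a fetch-based client; `openCancellableStream` is a hypothetical name, and the `fetchImpl` parameter exists only so the helper can be exercised without a network.

```javascript
// Starts a request whose connection can be closed by the caller.
// fetchImpl defaults to the global fetch (browsers, Node 18+).
function openCancellableStream(url, init = {}, fetchImpl = fetch) {
  const controller = new AbortController();
  const response = fetchImpl(url, { ...init, signal: controller.signal });
  return {
    response, // Promise that resolves to the Response
    // Aborting tears down the HTTP connection. Per the text above, the
    // server treats that disconnect as the cancellation signal.
    cancel: () => controller.abort(),
  };
}
```

Aborting rejects a pending fetch and errors an in-progress body read with an AbortError, so callers will usually want to catch that case and treat it as an expected outcome.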
Models
| Model ID | Relative cost | Default access |
|---|---|---|
| claude-haiku-4-5 | 0.33× Sonnet | Enabled |
| claude-sonnet-4-6 | 1.0× (anchor) | Enabled |
| claude-opus-4-7 | 5.0× Sonnet | Opt-in required |

Requests for claude-opus-4-7 from an org that has not opted in are rejected with 403 OPUS_NOT_ENABLED.
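
To make the relative-cost column concrete, the sketch below scales a Sonnet-equivalent cost by each model's multiplier. It assumes cost scales linearly with the multiplier for identical usage, which the table implies but does not state; `estimateRelativeCost` is a hypothetical helper.

```javascript
// Multipliers copied from the Models table (Sonnet is the 1.0 anchor).
const RELATIVE_COST = {
  "claude-haiku-4-5": 0.33,
  "claude-sonnet-4-6": 1.0,
  "claude-opus-4-7": 5.0,
};

// Estimates what a workload costing `sonnetCost` on Sonnet would cost on `model`.
// Assumption: linear scaling by the table's multiplier.
function estimateRelativeCost(model, sonnetCost) {
  const factor = RELATIVE_COST[model];
  if (factor === undefined) throw new RangeError(`Unknown model: ${model}`);
  return factor * sonnetCost;
}
```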
Error codes
| Code | HTTP | Description |
|---|---|---|
| INVALID_API_KEY | 401 | Missing or revoked token |
| BAA_REQUIRED | 402 | No active BAA on the org |
| AI_CREDITS_EXHAUSTED | 402 | Org has exhausted its credit allotment |
| MISSING_SCOPE | 403 | Key lacks ai:chat scope |
| PHI_BLOCKED | 403 | Message contains PHI and phi_mode=block is set |
| FEATURE_NOT_HIPAA_ELIGIBLE | 403 | Requested AI feature is not HIPAA-eligible and is disabled |
| OPUS_NOT_ENABLED | 403 | Opus model not enabled for this org |
| RATE_LIMITED | 429 | RPM or daily limit exceeded; check Retry-After |
| INFERENCE_UPSTREAM_FAILURE | 502 | Upstream model provider error; retryable |
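
Only two codes in the table invite a retry: RATE_LIMITED (wait for Retry-After) and INFERENCE_UPSTREAM_FAILURE. The sketch below turns that into a delay calculation. The function name `retryDelayMs` and the exponential backoff policy (1 s doubling, capped at 30 s) are assumptions. Because the table does not say which Retry-After form the API sends, the code accepts both forms that HTTP allows: delay in seconds and HTTP date.

```javascript
// Returns how long to wait before retrying, in milliseconds,
// or null when the error code should not be retried.
function retryDelayMs(code, attempt, retryAfter = null, now = Date.now()) {
  // Assumed policy: 1s, 2s, 4s, ... capped at 30s (attempt starts at 0).
  const backoff = Math.min(1000 * 2 ** attempt, 30000);
  if (code === "INFERENCE_UPSTREAM_FAILURE") return backoff;
  if (code === "RATE_LIMITED") {
    // No usable header: fall back to the assumed backoff.
    if (retryAfter === null || retryAfter.trim() === "") return backoff;
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
    const date = Date.parse(retryAfter); // HTTP-date form
    return Number.isNaN(date) ? backoff : Math.max(0, date - now);
  }
  return null; // all other codes in the table need a change to the request or org
}
```

A caller would typically stop after a small, fixed number of attempts, whatever delay this function returns.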