Event Types
Complete reference for all events sent by the AI Flow service.
Overview
Events are JSON objects sent from the AI Flow service to your application. All events include a type field and session information.
Base Event Structure
All events include session information:
```json
{
  "session": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "account_id": "account-123",
    "phone_number": "1234567890",
    "direction": "inbound",
    "from_phone_number": "9876543210",
    "to_phone_number": "1234567890"
  }
}
```
The direction field indicates whether the call was initiated by the caller ("inbound") or by the AI flow via the outbound call API ("outbound"). Use it in your session_start handler to tailor the greeting accordingly.
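As a rough illustration of branching on direction, the sketch below picks a greeting inside a session_start handler. The speak action shape (type, session_id, text) and the greeting wording are assumptions made for this example and are not defined on this page; see Action Types for the actual schema.

```python
def greeting_for(event: dict) -> dict:
    """Build a greeting action for a session_start event (hypothetical action shape)."""
    session = event["session"]
    if session.get("direction") == "outbound":
        # The AI flow placed the call, so introduce ourselves first.
        text = "Hello, this is an automated call. Is now a good time?"
    else:
        # Treat "inbound" (or a missing direction) as an incoming call.
        text = "Hello, thanks for calling. How can I help?"
    # Assumed action format; check Action Types for the real fields.
    return {"type": "speak", "session_id": session["id"], "text": text}


example_event = {
    "type": "session_start",
    "session": {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "direction": "outbound",
    },
}
action = greeting_for(example_event)
```

Reading the field with .get() keeps the handler working for payloads that omit direction, which then fall back to the inbound greeting.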
Event Types
| Event Type | Transport | Description | When Triggered |
|---|---|---|---|
| session_start | HTTP + WebSocket | Call session begins | When a new call is initiated |
| user_speech_started | WebSocket only | Speech onset detected | When VAD detects the user starting to speak (before full transcript) |
| user_speak | HTTP + WebSocket | User speech detected | After speech-to-text completes (includes barged_in flag if user interrupted) |
| dtmf_received | HTTP + WebSocket | DTMF digit pressed | When the user presses a key on their phone keypad |
| assistant_speak | HTTP + WebSocket | Assistant finished speaking | After TTS playback completes |
| assistant_speech_ended | HTTP + WebSocket | Assistant finished speaking | After speech playback ends |
| user_input_timeout | HTTP + WebSocket | User input timeout reached | When no speech detected after timeout |
| session_end | HTTP + WebSocket | Call session ends | When the call terminates |
| sms_failed | HTTP + WebSocket | SMS delivery failed | After a send_sms action fails; includes reason so the agent can react |
Quick Reference
- Session Start - Call begins
- User Speech Started - Speech onset detected (WebSocket only)
- User Speak - User speaks (includes barge-in detection)
- DTMF Received - User pressed a phone key
- Assistant Speak - Assistant speaks
- Assistant Speech Ended - Assistant finished speaking
- User Input Timeout - Timeout reached waiting for user
- Session End - Call ends
- SMS Failed - Emitted when a send_sms action fails (see below)
SMS Failed
Emitted to your webhook / WebSocket when a send_sms action fails. The call continues normally — handle this event to react conversationally (e.g. apologize, retry with a corrected number).
```json
{
  "type": "sms_failed",
  "session": { "id": "550e8400-...", "account_id": "...", "phone_number": "...",
               "from_phone_number": "...", "to_phone_number": "..." },
  "recipient": "4915112345678",
  "reason": "sender_not_allowed",
  "message": "SMSC returned faultCode 403"
}
```
| Field | Type | Description |
|---|---|---|
| type | string | Always "sms_failed" |
| session | object | Standard session info |
| recipient | string | Phone number that failed (the phone_number from your send_sms action) |
| reason | string | One of: sender_not_allowed, insufficient_balance, no_sms_extension, smsc_unavailable, unknown |
| message | string | Optional human-readable detail (safe to log, may contain technical error text) |
See Send SMS Action for details on each failure reason.
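One way to react conversationally is to map each documented reason to a spoken reply. The sketch below does that; the reply wording and the speak action shape (type, session_id, text) are assumptions for illustration only, while the reason values come from the table above.

```python
# Hypothetical spoken replies keyed by the documented `reason` values.
SMS_FAILURE_REPLIES = {
    "sender_not_allowed": "Sorry, I am not able to send text messages from this line.",
    "insufficient_balance": "Sorry, I cannot send text messages right now.",
    "no_sms_extension": "Sorry, text messaging is not set up for this line.",
    "smsc_unavailable": "The text message service is unavailable. Please try again later.",
    "unknown": "Sorry, the text message could not be sent.",
}


def reply_for_sms_failure(event: dict) -> dict:
    """Return a speak action (assumed shape) explaining an sms_failed event."""
    # Fall back to "unknown" for a missing or unrecognised reason.
    reason = event.get("reason", "unknown")
    text = SMS_FAILURE_REPLIES.get(reason, SMS_FAILURE_REPLIES["unknown"])
    return {"type": "speak", "session_id": event["session"]["id"], "text": text}


failed = {
    "type": "sms_failed",
    "session": {"id": "550e8400-e29b-41d4-a716-446655440000"},
    "recipient": "4915112345678",
    "reason": "smsc_unavailable",
}
reply = reply_for_sms_failure(failed)
```

The optional message field is deliberately not spoken to the caller, since it may contain technical error text; it is better suited to logs.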
Event Flow
Handling Events
HTTP Webhook
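```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def webhook():
    event = request.json
    event_type = event['type']
    if event_type == 'session_start':
        # Handle session start
        pass
    elif event_type == 'user_speak':
        # Handle user speech
        pass
    # ... handle other events
    # Placeholder response: 204 No Content means "no action for this event"
    return '', 204
```
WebSocket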
```javascript
ws.on('message', (data) => {
  const event = JSON.parse(data.toString());
  switch (event.type) {
    case 'session_start':
      // Handle session start
      break;
    case 'user_speak':
      // Handle user speech
      break;
    // ... handle other events
  }
});
```
Response Requirements
All events (except session_end) accept a single action, an array of actions (executed in sequence), or 204 No Content:
- session_start: Can return action(s) or 204 No Content
- user_speak: Can return action(s) or 204 No Content (check barged_in flag for interruptions)
- dtmf_received: Can return action(s) or 204 No Content
- assistant_speak: Can return action(s) or 204 No Content
- assistant_speech_ended: Can return action(s) or 204 No Content
- user_input_timeout: Can return action(s) or 204 No Content
- session_end: No action allowed, cleanup only
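The rules above can be sketched as a small framework-independent dispatcher. Everything here is illustrative: the on decorator, the respond helper, the speak action shape, and the use of status 200 for responses that carry actions are assumptions of this example; only the event names, the barged_in flag, and the 204 No Content behaviour come from this page.

```python
import json
from typing import Callable, Dict, Tuple

HANDLERS: Dict[str, Callable[[dict], object]] = {}


def on(event_type: str):
    """Register a handler for one event type (hypothetical helper)."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register


def speak(event: dict, text: str) -> dict:
    # Assumed action format; see Action Types for the real fields.
    return {"type": "speak", "session_id": event["session"]["id"], "text": text}


@on("user_speak")
def handle_user_speak(event: dict):
    if event.get("barged_in"):
        # The user interrupted the assistant: reply with a single action.
        return speak(event, "Sorry, go ahead.")
    # Otherwise reply with an array of actions, executed in sequence.
    return [speak(event, "Got it."), speak(event, "Anything else?")]


@on("session_end")
def handle_session_end(event: dict):
    # Cleanup only; no action is allowed for session_end.
    return None


def respond(event: dict) -> Tuple[int, str]:
    """Return (status, body): 204 with no body, or 200 with action(s) as JSON."""
    handler = HANDLERS.get(event["type"])
    result = handler(event) if handler else None
    if event["type"] == "session_end" or result is None:
        return 204, ""
    return 200, json.dumps(result)
```

For example, respond({"type": "session_end", "session": {"id": "s1"}}) yields a 204 with an empty body, and so does any event type without a registered handler.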
Next Steps
- Session Start Event - Detailed reference
- User Speak Event - Detailed reference
- Action Types - How to respond to events