
Action Types

Complete reference for all actions you can send to the AI Flow service.

Overview

Actions are JSON objects that you send back to the AI Flow service in response to events. Every action requires both a session_id field and a type field.

Base Action Structure

json
{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "speak"
}
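
As a minimal sketch, the two required fields can be enforced by a small builder before an action is sent. The names `make_action` and `ACTION_TYPES` are illustrative and not part of the AI Flow API; the type names are the ones listed in the Action Summary on this page.

```python
# Hypothetical helper; not provided by the AI Flow service.
ACTION_TYPES = {
    "speak", "audio", "mix_audio", "hangup", "transfer", "barge_in",
    "configure_transcription", "configure_voice_to_voice", "send_sms",
}


def make_action(session_id, action_type, **fields):
    """Build an action dict that always carries session_id and type."""
    if not session_id:
        raise ValueError("session_id is required")
    if action_type not in ACTION_TYPES:
        raise ValueError(f"unknown action type: {action_type!r}")
    # Required fields are written last so extra fields cannot overwrite them.
    return {**fields, "session_id": session_id, "type": action_type}


action = make_action(
    "550e8400-e29b-41d4-a716-446655440000", "speak", text="Hello!"
)
```

Validating the type locally is a design choice made here for early failure; the service may well perform its own validation.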

Action Summary

| Action Type | Description | Primary Use Case |
| --- | --- | --- |
| speak | Speak text or SSML | Respond to user with synthesized speech |
| audio | Play pre-recorded audio | Play hold music, pre-recorded messages |
| mix_audio | Loop a background sound mixed into speech | Add ambient noise (café, office, train station) under the agent |
| hangup | End the call | Terminate conversation |
| transfer | Transfer to another number | Route to human agent or department |
| barge_in | Manually interrupt playback | Stop current audio immediately |
| configure_transcription | Change STT language(s) mid-call | Switch recognition language without hanging up |
| configure_voice_to_voice | Switch the session into end-to-end voice-to-voice mode | Hand the conversation to a speech-to-speech model that owns audio I/O |
| send_sms | Send an SMS from the account | Deliver confirmation codes, summaries, links |

Quick Reference

Response Format

HTTP Webhook

Return a single action or an array of actions as JSON with 200 OK:

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "type": "speak",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "text": "Hello!"
}

To execute multiple actions in sequence, return an array:

http
HTTP/1.1 200 OK
Content-Type: application/json

[
  {
    "type": "barge_in",
    "session_id": "550e8400-e29b-41d4-a716-446655440000"
  },
  {
    "type": "speak",
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "text": "Sorry, let me correct that."
  }
]

If no action is needed, return 204 No Content instead:

http
HTTP/1.1 204 No Content

WebSocket

Send a single action or an array of actions as JSON strings:

json
{
  "type": "speak",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "text": "Hello!"
}

To send several actions in one message, send an array:

json
[
  { "type": "barge_in", "session_id": "..." },
  { "type": "speak", "session_id": "...", "text": "Sorry, let me correct that." }
]
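
A minimal sketch of the WebSocket side, assuming `send` is any callable that accepts a text frame (for example, the send method of whichever WebSocket client you use). The function name `send_actions` is hypothetical.

```python
import json


def send_actions(send, actions):
    """Serialize one action or a list of actions and pass it to `send`.

    Returns True if a message was sent, False if there was nothing to send.
    """
    if actions is None:
        return False
    # A dict becomes a JSON object, a list becomes a JSON array.
    send(json.dumps(actions))
    return True


sent = []  # stand-in for a real connection: collects outgoing frames
send_actions(sent.append, {"type": "speak", "session_id": "session-123", "text": "Hello!"})
```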

Common Patterns

Simple Response

json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Hello! How can I help you?"
}

Conditional Response

python
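# handle_event is an illustrative name for your own event handler.
def handle_event(event):
    # End the call when the caller says goodbye; otherwise acknowledge.
    if "goodbye" in event['text'].lower():
        return {
            "type": "hangup",
            "session_id": event['session']['id']
        }
    return {
        "type": "speak",
        "session_id": event['session']['id'],
        "text": "I understand."
    }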

Multiple Actions

You can return an array of actions to execute them in sequence:

python
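# handle_event is an illustrative name for your own event handler.
def handle_event(event):
    if event['type'] == 'user_speak':
        # Interrupt current playback first, then speak the correction.
        return [
            {
                "type": "barge_in",
                "session_id": event['session']['id']
            },
            {
                "type": "speak",
                "session_id": event['session']['id'],
                "text": "Sorry, let me correct that."
            }
        ]
    return None  # no action for other event types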

Actions in the array are executed one after another in order.

Alternatively, you can chain actions across events using the assistant_speak event:

python
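# handle_event is an illustrative name for your own event handler.
def handle_event(event):
    # First response
    if event['type'] == 'user_speak':
        return {
            "type": "speak",
            "session_id": event['session']['id'],
            "text": "Please listen to this message."
        }

    # Follow-up after assistant speaks
    if event['type'] == 'assistant_speak':
        return {
            "type": "audio",
            "session_id": event['session']['id'],
            "audio": "base64-audio-data"  # placeholder for base64-encoded audio
        }

    return None  # no action for other event types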
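
As handlers grow, the chaining pattern above can be organised as a dispatch table keyed by event type. This is a sketch under assumptions: the handler names and the `HANDLERS` table are hypothetical, and only the `user_speak` and `assistant_speak` event types shown on this page are used.

```python
def on_user_speak(event):
    # First step of the chain: announce the message.
    return {
        "type": "speak",
        "session_id": event["session"]["id"],
        "text": "Please listen to this message.",
    }


def on_assistant_speak(event):
    # Second step: play the audio once the assistant_speak event arrives.
    return {
        "type": "audio",
        "session_id": event["session"]["id"],
        "audio": "base64-audio-data",  # placeholder, as in the example above
    }


# Hypothetical routing table: event type -> handler function.
HANDLERS = {
    "user_speak": on_user_speak,
    "assistant_speak": on_assistant_speak,
}


def handle_event(event):
    """Route an event to its handler; return None when no action is needed."""
    handler = HANDLERS.get(event["type"])
    return handler(event) if handler else None


result = handle_event({"type": "user_speak", "session": {"id": "session-123"}})
```

Keeping one function per event type makes each step of a chain testable on its own.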
