Complete Action Reference
Complete reference for all actions in the SDK.
Base Action Structure
All actions require a session_id and type field:
```typescript
interface BaseAction {
  session_id: string; // UUID from the event's session.id
  type: string; // Action type identifier
}
```

All Action Types
| Action Type | Description | Primary Use Case |
|---|---|---|
| speak | Speak text or SSML | Respond to user with synthesized speech |
| audio | Play pre-recorded audio | Play hold music, pre-recorded messages |
| mix_audio | Loop a background sound mixed into speech | Add ambient noise (café, office, train station) under the agent |
| hangup | End the call | Terminate conversation |
| transfer | Transfer to another number | Route to human agent or department |
| barge_in | Manually interrupt playback | Stop current audio immediately |
| configure_transcription | Change STT language(s) mid-call | Switch recognition language without hanging up |
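
To make the table more concrete, the sketch below builds a few of the listed actions as plain objects and narrows on the `type` field. It is only an illustration: the local interfaces mirror the shapes documented on this page instead of importing the SDK's types, and the `SketchAction` union, the `describeAction` helper, the session id and the phone numbers are all hypothetical.

```typescript
// Local mirrors of the documented shapes (assumption: a real project
// would import the SDK's exported types instead of redefining them).
interface SketchSpeak {
  type: "speak";
  session_id: string;
  text: string;
}

interface SketchTransfer {
  type: "transfer";
  session_id: string;
  target_phone_number: string;
  caller_id_name: string;
  caller_id_number: string;
  timeout?: number; // seconds
}

interface SketchHangup {
  type: "hangup";
  session_id: string;
}

// Hypothetical union covering only the three actions used in this sketch.
type SketchAction = SketchSpeak | SketchTransfer | SketchHangup;

// Hypothetical helper: one-line summary of an action, e.g. for logging.
// The switch narrows on the `type` discriminator.
function describeAction(action: SketchAction): string {
  switch (action.type) {
    case "speak":
      return `speak: ${action.text}`;
    case "transfer":
      return `transfer to ${action.target_phone_number}`;
    case "hangup":
      return "hangup";
  }
}

// Placeholder values; a real session id comes from the event's session.id.
const sessionId = "00000000-0000-0000-0000-000000000000";

const greeting: SketchAction = {
  type: "speak",
  session_id: sessionId,
  text: "One moment, please.",
};

const handoff: SketchAction = {
  type: "transfer",
  session_id: sessionId,
  target_phone_number: "+15555550100",
  caller_id_name: "Support",
  caller_id_number: "+15555550101",
  timeout: 30,
};

const goodbye: SketchAction = { type: "hangup", session_id: sessionId };
```

Because every action carries a literal `type`, the compiler can check that each branch only touches fields that exist on that action.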
Action Type Definitions
speak - Text-to-speech response
```typescript
interface AiFlowActionSpeak {
  type: "speak";
  session_id: string;
  // Provide either text OR ssml (not both)
  text?: string;
  ssml?: string;
  // Optional TTS configuration
  tts?: {
    provider: "azure";
    language?: string; // e.g., "en-US", "de-DE"
    voice?: string; // Azure voice name
  } | {
    provider: "eleven_labs";
    voice?: string; // ElevenLabs voice ID (optional, uses default if omitted)
  };
  barge_in?: {
    strategy: "none" | "manual" | "minimum_characters";
    minimum_characters?: number; // Default: 3
    allow_after_ms?: number; // Delay before allowing interruption
  };
}
```

audio - Play pre-recorded audio
```typescript
interface AiFlowActionAudio {
  type: "audio";
  session_id: string;
  audio: string; // Base64 encoded WAV (16 kHz, mono, 16-bit PCM)
  barge_in?: {
    strategy: "none" | "manual" | "minimum_characters";
    minimum_characters?: number;
    allow_after_ms?: number;
  };
}
```

mix_audio - Loop a background sound under outbound speech
```typescript
interface AiFlowActionMixAudio {
  type: "mix_audio";
  session_id: string;
  audio?: string; // Base64 WAV (16 kHz, mono, 16-bit PCM); required unless stop=true
  volume?: number; // 0.0–1.0, default 0.5
  stop?: boolean; // true to remove the active loop
}
```

The loop plays continuously for the rest of the call: under TTS during turns and on its own during silences. Sending mix_audio again replaces the loop. The loop is dropped automatically when the session ends.
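
The start, replace, and stop behavior described above could be wrapped in two small builders, sketched below. The helper names `startLoop` and `stopLoop` are hypothetical and not part of the SDK, the interface is a local copy of the shape documented here, and the client-side volume check is an assumption layered on the documented 0.0–1.0 range.

```typescript
// Local copy of the documented mix_audio shape (assumption: a real
// project would import the SDK's own type instead).
interface MixAudioSketch {
  type: "mix_audio";
  session_id: string;
  audio?: string;
  volume?: number;
  stop?: boolean;
}

// Hypothetical helper: start a loop, or replace the active one.
// `audioBase64` is expected to be a Base64 WAV (16 kHz, mono, 16-bit PCM).
function startLoop(
  sessionId: string,
  audioBase64: string,
  volume = 0.5, // documented default
): MixAudioSketch {
  if (volume < 0 || volume > 1) {
    throw new RangeError("volume must be between 0.0 and 1.0");
  }
  return {
    type: "mix_audio",
    session_id: sessionId,
    audio: audioBase64,
    volume,
  };
}

// Hypothetical helper: remove the active loop. No audio is needed
// because stop is true.
function stopLoop(sessionId: string): MixAudioSketch {
  return { type: "mix_audio", session_id: sessionId, stop: true };
}
```

Calling `startLoop` a second time with different audio yields an action that replaces the running loop, so no explicit `stopLoop` is needed between two background sounds.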
hangup - End call
```typescript
interface AiFlowActionHangup {
  type: "hangup";
  session_id: string;
}
```

transfer - Transfer call
```typescript
interface AiFlowActionTransfer {
  type: "transfer";
  session_id: string;
  target_phone_number: string; // E.164 format recommended
  caller_id_name: string;
  caller_id_number: string;
  /**
   * Optional transfer timeout in seconds (5–120). When set, a failed transfer
   * returns the call to the agent via a new `session_start` event for the
   * same session id (transfer fallback). Omit for legacy behavior where a
   * failed transfer ends the call.
   */
  timeout?: number;
}
```

barge_in - Manual interrupt
```typescript
interface AiFlowActionBargeIn {
  type: "barge_in";
  session_id: string;
}
```

configure_transcription - Change STT language mid-call
```typescript
interface AiFlowActionConfigureTranscription {
  type: "configure_transcription";
  session_id: string;
  provider?: "AZURE" | "DEEPGRAM" | "ELEVEN_LABS"; // Omit to keep current provider.
  languages?: string[]; // BCP-47 codes, 1-4 entries. Omit to reset to provider default.
}
```

Multi-language support: Azure uses all supplied language codes for simultaneous detection (up to 4). Deepgram performs multilingual auto-detection across the supplied languages. ElevenLabs accepts only a single language: when multiple codes are provided, only the first is used and the rest are silently ignored.
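
Since ElevenLabs silently ignores every code after the first, it may be clearer to trim the list before sending. The sketch below shows one way to do that. The `configureTranscription` helper is hypothetical and not an SDK function, the interface is a local copy of the shape documented above, and the client-side length check is an assumption based on the documented 1-4 entry limit.

```typescript
type SttProviderSketch = "AZURE" | "DEEPGRAM" | "ELEVEN_LABS";

// Local copy of the documented shape (assumption: a real project
// would import the SDK's own type instead).
interface ConfigureTranscriptionSketch {
  type: "configure_transcription";
  session_id: string;
  provider?: SttProviderSketch;
  languages?: string[];
}

// Hypothetical helper: build the action and make the per-provider
// language handling explicit on the client side.
function configureTranscription(
  sessionId: string,
  provider: SttProviderSketch,
  languages: string[],
): ConfigureTranscriptionSketch {
  if (languages.length < 1 || languages.length > 4) {
    throw new RangeError("languages must contain 1 to 4 BCP-47 codes");
  }
  // ElevenLabs uses only the first code, so send just that one
  // rather than relying on the rest being dropped silently.
  const effective =
    provider === "ELEVEN_LABS" ? languages.slice(0, 1) : [...languages];
  return {
    type: "configure_transcription",
    session_id: sessionId,
    provider,
    languages: effective,
  };
}
```

With this helper, `configureTranscription(id, "AZURE", ["de-DE", "en-US"])` keeps both codes for simultaneous detection, while the same call with `"ELEVEN_LABS"` keeps only `"de-DE"`.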
Type Safety
All actions are fully typed. Import types from the SDK:
```typescript
import type {
  AiFlowAction,
  AiFlowActionSpeak,
  AiFlowActionAudio,
  AiFlowActionHangup,
  AiFlowActionTransfer,
  AiFlowActionBargeIn,
  AiFlowActionConfigureTranscription,
} from "@sipgate/ai-flow-sdk";
```

Next Steps
- Complete Event Reference - All event types
- Direct Integration - Working without the wrapper