Barge-In Configuration
Control how users can interrupt the assistant while it is speaking.
Overview
Barge-in allows users to interrupt the assistant's speech. Configure it per action using the barge_in field.
Configuration
json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Hello!",
  "barge_in": {
    "strategy": "minimum_characters",
    "minimum_characters": 3,
    "allow_after_ms": 500
  }
}
Strategies
none
Disables barge-in completely. Audio plays fully without interruption.
json
{
  "barge_in": {
    "strategy": "none"
  }
}
Use cases:
- Critical information
- Legal disclaimers
- Emergency instructions
manual
Allows manual barge-in via API only (no automatic detection).
json
{
  "barge_in": {
    "strategy": "manual"
  }
}
Use cases:
- Custom interruption logic
- Button-triggered interruption
- External event-based interruption
minimum_characters
Automatically triggers barge-in once the transcribed user speech reaches the configured character threshold.
json
{
  "barge_in": {
    "strategy": "minimum_characters",
    "minimum_characters": 5,
    "allow_after_ms": 500
  }
}
Use cases:
- Natural conversation flow
- Customer service scenarios
- Interactive voice menus
immediate ⚡ NEW
Most responsive option: interrupts as soon as the user starts speaking, using Voice Activity Detection (VAD).
json
{
  "barge_in": {
    "strategy": "immediate",
    "allow_after_ms": 500
  }
}
How it works:
- Azure/Deepgram: Uses VAD (Voice Activity Detection) - triggers before any text is recognized
- ElevenLabs: Uses first partial transcript
- Latency: 20-100ms (2-4x faster than minimum_characters)
- No text required: Interrupts on voice detection, not transcription
Use cases:
- High-priority conversations requiring instant responsiveness
- Natural dialogue where interruptions should feel seamless
- Customer service where quick response matters
- Urgent or time-sensitive interactions
Best practices:
- Use allow_after_ms: 500-1000 to prevent accidental interruptions at the start
- Test with real users to find the optimal allow_after_ms value
- Consider network latency in production environments
Comparison with minimum_characters:
| Feature | immediate | minimum_characters |
|---|---|---|
| Trigger | Voice Activity (VAD) | Text recognition (3+ characters) |
| Latency | 20-100ms | 50-200ms |
| User Experience | Instant interruption | Slight delay |
| Accuracy | May trigger on noise | More reliable (text-based) |
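To make the difference in the table concrete, the following is a minimal sketch that models when each strategy would first trigger for a given sequence of speech events. It is an illustration only, not the platform's implementation: `SpeechEvent`, `first_trigger_ms`, and the event timings are hypothetical, the `>=` comparison against minimum_characters is an assumption, and so is the way the protection period is applied (events before allow_after_ms are simply ignored).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeechEvent:
    """Hypothetical event: user audio observed while the assistant is speaking."""
    at_ms: int       # milliseconds since the assistant started speaking
    transcript: str  # partial transcript so far; "" means voice detected, no text yet


def first_trigger_ms(
    events: List[SpeechEvent],
    strategy: str,
    minimum_characters: int = 3,
    allow_after_ms: int = 0,
) -> Optional[int]:
    """Return the time of the first event that would trigger barge-in, or None."""
    if strategy in ("none", "manual"):
        return None  # no automatic detection for these strategies
    if strategy not in ("immediate", "minimum_characters"):
        raise ValueError(f"unknown strategy: {strategy!r}")

    for event in events:
        if event.at_ms < allow_after_ms:
            continue  # assumption: events inside the protection period are ignored
        if strategy == "immediate":
            return event.at_ms  # any voice activity is enough
        # Assumption: the threshold counts characters of the trimmed transcript.
        if len(event.transcript.strip()) >= minimum_characters:
            return event.at_ms
    return None


if __name__ == "__main__":
    # Invented timings for illustration; they are not measured latencies.
    events = [SpeechEvent(600, ""), SpeechEvent(750, "I"), SpeechEvent(900, "I need")]
    print(first_trigger_ms(events, "immediate", allow_after_ms=500))           # 600
    print(first_trigger_ms(events, "minimum_characters", 3, allow_after_ms=500))  # 900
    print(first_trigger_ms(events, "none"))                                    # None
```

The model treats every event as voice activity, which matches the VAD-based behavior described for Azure/Deepgram. For ElevenLabs, where the first partial transcript is the trigger, the first event would already carry text.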
Configuration Options
minimum_characters
Minimum number of characters before barge-in triggers.
- Default: 3
- Range: 1 to 100
- Higher values: Require more speech before interruption
allow_after_ms
Delay in milliseconds before barge-in is allowed (protection period).
- Default: 0 (no protection period; barge-in is allowed immediately)
- Range: 0 to 10000 (10 seconds)
- Use: Prevent interruption during critical information
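The ranges above can be checked on the client before an action is sent. The sketch below is one possible way to do that, assuming nothing about how the API itself validates input: `build_barge_in` and `build_speak_action` are hypothetical helper names, and rejecting minimum_characters for other strategies is a local design choice, not documented API behavior.

```python
import json
from typing import Any, Dict, Optional

# Values taken from the strategy list and the option ranges on this page.
STRATEGIES = {"none", "manual", "minimum_characters", "immediate"}
MIN_CHARS_RANGE = (1, 100)
ALLOW_AFTER_MS_RANGE = (0, 10000)


def build_barge_in(
    strategy: str,
    minimum_characters: Optional[int] = None,
    allow_after_ms: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a barge_in object, rejecting values outside the documented ranges."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")

    config: Dict[str, Any] = {"strategy": strategy}

    if minimum_characters is not None:
        # Local choice (assumption): only accept the option with its own strategy.
        if strategy != "minimum_characters":
            raise ValueError("minimum_characters needs strategy 'minimum_characters'")
        low, high = MIN_CHARS_RANGE
        if not low <= minimum_characters <= high:
            raise ValueError(f"minimum_characters must be between {low} and {high}")
        config["minimum_characters"] = minimum_characters

    if allow_after_ms is not None:
        low, high = ALLOW_AFTER_MS_RANGE
        if not low <= allow_after_ms <= high:
            raise ValueError(f"allow_after_ms must be between {low} and {high}")
        config["allow_after_ms"] = allow_after_ms

    return config


def build_speak_action(session_id: str, text: str, barge_in: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a speak action in the shape shown in the Configuration section."""
    return {"type": "speak", "session_id": session_id, "text": text, "barge_in": barge_in}


if __name__ == "__main__":
    action = build_speak_action(
        "session-123", "Hello!", build_barge_in("minimum_characters", 3, 500)
    )
    print(json.dumps(action, indent=2))
```

Omitted options are left out of the object so that the documented defaults (3 and 0) apply on the server side.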
Examples
Natural Conversation
json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "I can help you with billing, support, or sales.",
  "barge_in": {
    "strategy": "minimum_characters",
    "minimum_characters": 3
  }
}
Critical Information
json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Your verification code is 1-2-3-4-5-6.",
  "barge_in": {
    "strategy": "none"
  }
}
Protected Announcement
json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Your account number is 1234567890.",
  "barge_in": {
    "strategy": "minimum_characters",
    "minimum_characters": 10,
    "allow_after_ms": 2000
  }
}
Instant Response (Immediate) ⚡
json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "I can help you with your order, account, or technical support. What would you like to know?",
  "barge_in": {
    "strategy": "immediate",
    "allow_after_ms": 500
  }
}
Result: The assistant stops speaking the moment the user starts talking (20-100ms latency), providing the most natural conversation experience.
Best Practices
- Use none sparingly - Only for truly critical information
- Choose the right strategy:
  - immediate - For the most natural, responsive conversations
  - minimum_characters - For a balance between responsiveness and reliability
  - manual - For custom logic
  - none - For critical announcements only
- Set protection periods - Use allow_after_ms: 500-1000 to prevent cutting off an important intro
- Test with users - Find the right balance for your use case
- Consider noise - immediate may trigger on background noise; use allow_after_ms as a buffer
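One way to apply these practices consistently is to keep a small table of named presets in application code. The sketch below is only a suggestion: the preset names and the `barge_in_for` helper are hypothetical, and the values simply mirror the examples on this page rather than any recommended defaults.

```python
from typing import Any, Dict

# Hypothetical preset names; each value copies one of the examples above.
PRESETS: Dict[str, Dict[str, Any]] = {
    "critical": {"strategy": "none"},
    "natural": {"strategy": "minimum_characters", "minimum_characters": 3},
    "protected": {
        "strategy": "minimum_characters",
        "minimum_characters": 10,
        "allow_after_ms": 2000,
    },
    "instant": {"strategy": "immediate", "allow_after_ms": 500},
}


def barge_in_for(kind: str) -> Dict[str, Any]:
    """Return a copy of the preset so callers can adjust it without side effects."""
    if kind not in PRESETS:
        raise ValueError(f"unknown preset: {kind!r}")
    return dict(PRESETS[kind])


if __name__ == "__main__":
    config = barge_in_for("instant")
    config["allow_after_ms"] = 1000  # tune after testing with real users
    print(config)
```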
Next Steps
- Speak Action - How to use barge-in
- User Speak Event with Barge-In Flag - Handle interruptions