Barge-In Configuration
Control how users can interrupt the assistant while it is speaking.
Overview
Barge-in allows users to interrupt the assistant's speech. You can configure barge-in behavior for each speak or audio action.
Configuration
typescript
interface BargeInConfig {
  strategy: "none" | "manual" | "minimum_characters" | "immediate";
  minimum_characters?: number; // Default: 3 (only for minimum_characters)
  allow_after_ms?: number; // Delay before allowing interruption
}
Strategies
none
Disables barge-in completely. Audio plays fully without interruption.
typescript
barge_in: {
  strategy: "none"
}
Use cases:
- Critical information that must be heard
- Legal disclaimers
- Emergency instructions
Example:
typescript
return {
  type: "speak",
  session_id: event.session.id,
  text: "This is important information. Please listen carefully.",
  barge_in: {
    strategy: "none",
  },
};
manual
Allows manual barge-in via API only (no automatic detection).
typescript
barge_in: {
  strategy: "manual"
}
Use cases:
- Custom interruption logic
- Button-triggered interruption
- External event-based interruption
Example:
typescript
return {
  type: "speak",
  session_id: event.session.id,
  text: "Press a button to interrupt.",
  barge_in: {
    strategy: "manual",
  },
};
minimum_characters
Automatically triggers barge-in when the user's recognized speech exceeds the character threshold.
typescript
barge_in: {
  strategy: "minimum_characters",
  minimum_characters: 5, // Trigger after 5 characters
  allow_after_ms: 500 // Wait 500ms before allowing interruption
}
Use cases:
- Natural conversation flow
- Customer service scenarios
- Interactive voice menus
Example:
typescript
return {
  type: "speak",
  session_id: event.session.id,
  text: "How can I help you today?",
  barge_in: {
    strategy: "minimum_characters",
    minimum_characters: 3,
  },
};
immediate ⚡ NEW
The most responsive option: interrupts as soon as the user starts speaking, using Voice Activity Detection (VAD).
typescript
barge_in: {
  strategy: "immediate",
  allow_after_ms: 500 // Optional: protect first 500ms
}
How it works:
- Azure/Deepgram: Uses Voice Activity Detection (VAD) - triggers before any text is recognized
- ElevenLabs: Uses first partial transcript
- Latency: 20-100ms (2-4x faster than minimum_characters)
- No text required: Interrupts on voice detection, not transcription
Use cases:
- High-priority conversations requiring instant responsiveness
- Natural dialogue where interruptions should feel seamless
- Customer service where quick response matters
- Urgent or time-sensitive interactions
Example:
typescript
onUserSpeak: async (event) => {
  return {
    type: "speak",
    session_id: event.session.id,
    text: "I can help you with billing, support, or sales. What would you like?",
    barge_in: {
      strategy: "immediate",
      allow_after_ms: 500, // Protect first 500ms from accidental noise
    },
  };
}
Comparison:
| Strategy | Trigger | Latency | Use Case |
|---|---|---|---|
| immediate | Voice Activity (VAD) | 20-100ms | Most natural, instant response |
| minimum_characters | Text recognition | 50-200ms | Balanced reliability |
| manual | API call | N/A | Custom logic |
| none | Never | N/A | Critical info only |
Best practices:
- Use allow_after_ms: 500-1000 to prevent accidental interruptions
- Test with real users to find optimal settings
- Consider background noise in your environment
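To make the interplay between the strategy, the character threshold, and allow_after_ms concrete, here is a minimal sketch of the decision as described on this page. It is an illustrative model only: the platform performs this check internally, and `PlaybackState` and `wouldInterrupt` are hypothetical names that are not part of the SDK. It also simplifies provider differences (for ElevenLabs, `immediate` reacts to the first partial transcript rather than to VAD) and assumes the threshold counts as met once the recognized text reaches `minimum_characters`.

```typescript
type BargeInStrategy = "none" | "manual" | "minimum_characters" | "immediate";

interface BargeInConfig {
  strategy: BargeInStrategy;
  minimum_characters?: number;
  allow_after_ms?: number;
}

// Hypothetical snapshot of what is known at one moment during playback.
interface PlaybackState {
  elapsedMs: number; // time since the assistant started speaking
  voiceDetected: boolean; // VAD reports that the user is speaking
  partialTranscript: string; // text recognized so far (may be empty)
}

function wouldInterrupt(config: BargeInConfig, state: PlaybackState): boolean {
  // "none" never interrupts; "manual" only reacts to an explicit API call.
  if (config.strategy === "none" || config.strategy === "manual") {
    return false;
  }

  // Protection period: nothing interrupts before allow_after_ms has passed.
  if (state.elapsedMs < (config.allow_after_ms ?? 0)) {
    return false;
  }

  if (config.strategy === "immediate") {
    return state.voiceDetected;
  }

  // minimum_characters: wait until enough text has been recognized.
  // Assumption: reaching the threshold (>=) is enough to trigger.
  const threshold = config.minimum_characters ?? 3;
  return state.partialTranscript.trim().length >= threshold;
}
```

With `{ strategy: "immediate", allow_after_ms: 500 }`, voice detected 200ms into playback is ignored in this model, while the same signal at 600ms would interrupt.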
Protection Period
You can add a protection period with allow_after_ms to prevent interruption during the first part of a speak action:
typescript
return {
  type: "speak",
  session_id: event.session.id,
  text: "Your account number is 1234567890. Please write this down.",
  barge_in: {
    strategy: "minimum_characters",
    minimum_characters: 10, // Require substantial speech
    allow_after_ms: 2000, // Protect first 2 seconds
  },
};
Configuration Options
minimum_characters
The minimum number of characters the user must speak before barge-in is triggered.
- Default: 3
- Range: 1 to 100
- Use: Higher values require more speech before interruption
allow_after_ms
Delay in milliseconds before barge-in is allowed. This creates a "protection period" at the start of speech.
- Default: 0 (immediate)
- Range: 0 to 10000 (10 seconds)
- Use: Prevent interruption during critical information
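The defaults and ranges above can be checked on the client before an action is returned. The following is a minimal sketch, not part of the SDK: `normalizeBargeIn` is a hypothetical helper, and requiring integer values is an assumption, since this page only states the numeric ranges.

```typescript
type BargeInStrategy = "none" | "manual" | "minimum_characters" | "immediate";

interface BargeInConfig {
  strategy: BargeInStrategy;
  minimum_characters?: number;
  allow_after_ms?: number;
}

// Hypothetical helper: fills in the documented defaults and rejects
// values outside the documented ranges.
function normalizeBargeIn(config: BargeInConfig): BargeInConfig {
  const allowAfterMs = config.allow_after_ms ?? 0; // documented default: 0
  // Assumption: values must be whole numbers.
  if (!Number.isInteger(allowAfterMs) || allowAfterMs < 0 || allowAfterMs > 10000) {
    throw new RangeError("allow_after_ms must be an integer from 0 to 10000");
  }

  const normalized: BargeInConfig = {
    strategy: config.strategy,
    allow_after_ms: allowAfterMs,
  };

  // minimum_characters only applies to the minimum_characters strategy.
  if (config.strategy === "minimum_characters") {
    const minChars = config.minimum_characters ?? 3; // documented default: 3
    if (!Number.isInteger(minChars) || minChars < 1 || minChars > 100) {
      throw new RangeError("minimum_characters must be an integer from 1 to 100");
    }
    normalized.minimum_characters = minChars;
  }

  return normalized;
}
```

Calling `normalizeBargeIn({ strategy: "minimum_characters" })` yields a config with `minimum_characters: 3` and `allow_after_ms: 0`, while `minimum_characters: 0` or `allow_after_ms: 20000` raises a `RangeError`.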
Examples
Natural Conversation
typescript
onUserSpeak: async (event) => {
  return {
    type: "speak",
    session_id: event.session.id,
    text: "I can help you with billing, support, or sales. What would you like?",
    barge_in: {
      strategy: "minimum_characters",
      minimum_characters: 3,
    },
  };
}
Critical Information
typescript
onUserSpeak: async (event) => {
  return {
    type: "speak",
    session_id: event.session.id,
    text: "Your verification code is 1-2-3-4-5-6. Please write this down.",
    barge_in: {
      strategy: "none", // Don't allow interruption
    },
  };
}
Protected Announcement
typescript
onSessionStart: async (event) => {
  return {
    type: "speak",
    session_id: event.session.id,
    text: "Welcome! Your call may be recorded for quality assurance.",
    barge_in: {
      strategy: "minimum_characters",
      minimum_characters: 5,
      allow_after_ms: 3000, // Protect first 3 seconds
    },
  };
}
Best Practices
- Use none sparingly - Only for truly critical information
- Choose the right strategy:
  - immediate - For the most natural, responsive conversations
  - minimum_characters - For a balance between responsiveness and reliability
  - manual - For custom logic
  - none - For critical announcements only
- Set protection periods - Use allow_after_ms: 500-1000 to avoid cutting off an important intro
- Test with users - Find the right balance for your use case
- Consider noise - immediate may trigger on background noise; use allow_after_ms as a buffer
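One way to apply these practices consistently is to centralize the choice in a small helper. The sketch below is one possible reading of the list above, not an official recommendation: `PromptTraits` and `suggestBargeIn` are hypothetical names, and the concrete values (500 and 1000 ms) are simply the ends of the range suggested on this page.

```typescript
type BargeInStrategy = "none" | "manual" | "minimum_characters" | "immediate";

interface BargeInConfig {
  strategy: BargeInStrategy;
  minimum_characters?: number;
  allow_after_ms?: number;
}

// Hypothetical description of a prompt; not part of the API.
interface PromptTraits {
  critical: boolean; // must be heard in full (legal text, codes)
  preferReliability: boolean; // trade some latency for fewer false triggers
  noisyEnvironment: boolean; // background noise is likely
}

function suggestBargeIn(traits: PromptTraits): BargeInConfig {
  // Critical announcements are never interruptible.
  if (traits.critical) {
    return { strategy: "none" };
  }

  // Text-based detection is listed as the balanced, more reliable option.
  if (traits.preferReliability) {
    return { strategy: "minimum_characters", minimum_characters: 3 };
  }

  // Otherwise use the most responsive strategy, with a longer
  // protection period as a buffer when noise is expected.
  return {
    strategy: "immediate",
    allow_after_ms: traits.noisyEnvironment ? 1000 : 500,
  };
}
```

The result can be passed directly as the `barge_in` field of a speak action. Treat the thresholds as starting points to tune with real users.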
Next Steps
- Action Types - Complete action reference
- API Reference - Full API documentation