Barge-In Configuration
Control how users can interrupt the assistant while it is speaking.
Overview
Barge-in allows users to interrupt the assistant's speech. You can configure barge-in behavior for each speak or audio action.
Configuration
typescript
interface BargeInConfig {
  strategy: "none" | "manual" | "minimum_characters" | "immediate";
  minimum_characters?: number; // Default: 3 (only for minimum_characters)
  allow_after_ms?: number; // Delay before allowing interruption
}
Strategies
none
Disables barge-in completely. Audio plays fully without interruption.
typescript
barge_in: {
  strategy: "none"
}
Use cases:
- Critical information that must be heard
- Legal disclaimers
- Emergency instructions
Example:
typescript
return {
  type: "speak",
  session_id: event.session.id,
  text: "This is important information. Please listen carefully.",
  barge_in: {
    strategy: "none",
  },
};
manual
Allows manual barge-in via API only (no automatic detection).
typescript
barge_in: {
  strategy: "manual"
}
Use cases:
- Custom interruption logic
- Button-triggered interruption
- External event-based interruption
Example:
typescript
return {
  type: "speak",
  session_id: event.session.id,
  text: "Press a button to interrupt.",
  barge_in: {
    strategy: "manual",
  },
};
minimum_characters
Automatically triggers barge-in when the user's recognized speech exceeds the character threshold.
typescript
barge_in: {
  strategy: "minimum_characters",
  minimum_characters: 5, // Trigger after 5 characters
  allow_after_ms: 500 // Wait 500ms before allowing interruption
}
Use cases:
- Natural conversation flow
- Customer service scenarios
- Interactive voice menus
Example:
typescript
return {
  type: "speak",
  session_id: event.session.id,
  text: "How can I help you today?",
  barge_in: {
    strategy: "minimum_characters",
    minimum_characters: 3,
  },
};
immediate ⚡ NEW
The most responsive option: interrupts as soon as the user starts speaking, using Voice Activity Detection (VAD) where the speech provider supports it.
typescript
barge_in: {
  strategy: "immediate",
  allow_after_ms: 500 // Optional: protect first 500ms
}
How it works:
- Azure/Deepgram: Uses Voice Activity Detection (VAD) - triggers before any text is recognized
- ElevenLabs: Uses first partial transcript
- Latency: 20-100ms (2-4x faster than minimum_characters)
- No text required: Interrupts on voice detection, not transcription
Use cases:
- High-priority conversations requiring instant responsiveness
- Natural dialogue where interruptions should feel seamless
- Customer service where quick response matters
- Urgent or time-sensitive interactions
Example:
typescript
onUserSpeak: async (event) => {
  return {
    type: "speak",
    session_id: event.session.id,
    text: "I can help you with billing, support, or sales. What would you like?",
    barge_in: {
      strategy: "immediate",
      allow_after_ms: 500, // Protect first 500ms from accidental noise
    },
  };
}
Comparison:
| Strategy | Trigger | Latency | Use Case |
|---|---|---|---|
| immediate | Voice Activity (VAD) | 20-100ms | Most natural, instant response |
| minimum_characters | Text recognition | 50-200ms | Balanced reliability |
| manual | API call | N/A | Custom logic |
| none | Never | N/A | Critical info only |
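The comparison above can be read as a small decision rule. The sketch below is an illustrative model only, not the platform's implementation: `PlaybackState` and `shouldInterrupt` are hypothetical names, and it assumes that the protection period applies to the automatic strategies only and that the character threshold is met at "at least N characters". It also simplifies `immediate` to a VAD signal, whereas this page notes that ElevenLabs uses the first partial transcript.

```typescript
type BargeInStrategy = "none" | "manual" | "minimum_characters" | "immediate";

interface BargeInConfig {
  strategy: BargeInStrategy;
  minimum_characters?: number; // documented default: 3
  allow_after_ms?: number;     // documented default: 0
}

// Hypothetical snapshot of what is known at one moment during playback.
interface PlaybackState {
  elapsedMs: number;         // time since the assistant started speaking
  voiceDetected: boolean;    // VAD reports that the caller is speaking
  partialTranscript: string; // text recognized so far (may be empty)
  manualRequest: boolean;    // an explicit interruption request arrived
}

// Returns true if, under this simplified model, playback would be interrupted.
function shouldInterrupt(config: BargeInConfig, state: PlaybackState): boolean {
  if (config.strategy === "none") return false;
  if (config.strategy === "manual") return state.manualRequest;

  // ASSUMPTION: the protection period gates the automatic strategies only.
  if (state.elapsedMs < (config.allow_after_ms ?? 0)) return false;

  if (config.strategy === "immediate") return state.voiceDetected;

  // minimum_characters. ASSUMPTION: "at least N characters" meets the threshold.
  const threshold = config.minimum_characters ?? 3;
  return state.partialTranscript.trim().length >= threshold;
}
```

In this model, a caller who has said "hello" one second into playback interrupts under `immediate` and under `minimum_characters` with the default threshold, but not when `allow_after_ms` is 2000 or `minimum_characters` is 10.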
Best practices:
- Use allow_after_ms: 500-1000 to prevent accidental interruptions
- Test with real users to find optimal settings
- Consider background noise in your environment
Protection Period
You can add a protection period to prevent interruption during critical parts of speech:
typescript
return {
  type: "speak",
  session_id: event.session.id,
  text: "Your account number is 1234567890. Please write this down.",
  barge_in: {
    strategy: "minimum_characters",
    minimum_characters: 10, // Require substantial speech
    allow_after_ms: 2000, // Protect first 2 seconds
  },
};
Configuration Options
minimum_characters
The minimum number of characters the user must speak before barge-in is triggered.
- Default: 3
- Range: 1 to 100
- Use: Higher values require more speech before interruption
allow_after_ms
Delay in milliseconds before barge-in is allowed. This creates a "protection period" at the start of speech.
- Default: 0 (immediate)
- Range: 0 to 10000 (10 seconds)
- Use: Prevent interruption during critical information
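Since both options have documented ranges, it can help to check a configuration before returning an action. The following is a minimal sketch under stated assumptions: `validateBargeIn` is a hypothetical helper, not part of the SDK, and this page does not say how the API itself reacts to out-of-range values. Treating `minimum_characters` as an integer is also an assumption.

```typescript
interface BargeInConfig {
  strategy: "none" | "manual" | "minimum_characters" | "immediate";
  minimum_characters?: number;
  allow_after_ms?: number;
}

// Ranges as documented on this page.
const MIN_CHARACTERS_RANGE = { min: 1, max: 100 };
const ALLOW_AFTER_MS_RANGE = { min: 0, max: 10000 };

// Hypothetical helper: returns a list of problems; an empty list means the config looks valid.
function validateBargeIn(config: BargeInConfig): string[] {
  const errors: string[] = [];
  const { strategy, minimum_characters, allow_after_ms } = config;

  if (minimum_characters !== undefined) {
    if (strategy !== "minimum_characters") {
      errors.push("minimum_characters only applies to the minimum_characters strategy");
    }
    // ASSUMPTION: a character count is a whole number.
    const inRange =
      Number.isInteger(minimum_characters) &&
      minimum_characters >= MIN_CHARACTERS_RANGE.min &&
      minimum_characters <= MIN_CHARACTERS_RANGE.max;
    if (!inRange) errors.push("minimum_characters must be an integer from 1 to 100");
  }

  if (allow_after_ms !== undefined) {
    const inRange =
      Number.isFinite(allow_after_ms) &&
      allow_after_ms >= ALLOW_AFTER_MS_RANGE.min &&
      allow_after_ms <= ALLOW_AFTER_MS_RANGE.max;
    if (!inRange) errors.push("allow_after_ms must be between 0 and 10000");
  }

  return errors;
}
```

Returning a list of messages instead of throwing lets a handler log every problem at once and fall back to a default configuration.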
Examples
Natural Conversation
typescript
onUserSpeak: async (event) => {
  return {
    type: "speak",
    session_id: event.session.id,
    text: "I can help you with billing, support, or sales. What would you like?",
    barge_in: {
      strategy: "minimum_characters",
      minimum_characters: 3,
    },
  };
}
Critical Information
typescript
onUserSpeak: async (event) => {
  return {
    type: "speak",
    session_id: event.session.id,
    text: "Your verification code is 1-2-3-4-5-6. Please write this down.",
    barge_in: {
      strategy: "none", // Don't allow interruption
    },
  };
}
Protected Announcement
typescript
onSessionStart: async (event) => {
  return {
    type: "speak",
    session_id: event.session.id,
    text: "Welcome! Your call may be recorded for quality assurance.",
    barge_in: {
      strategy: "minimum_characters",
      minimum_characters: 5,
      allow_after_ms: 3000, // Protect first 3 seconds
    },
  };
}
Best Practices
- Use none sparingly: only for truly critical information
- Choose the right strategy:
  - immediate: for the most natural, responsive conversations
  - minimum_characters: for a balance between responsiveness and reliability
  - manual: for custom logic
  - none: for critical announcements only
- Set protection periods: use allow_after_ms: 500-1000 to avoid cutting off an important introduction
- Test with users: find the right balance for your use case
- Consider noise: immediate may trigger on background noise; use allow_after_ms as a buffer
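One way to apply these recommendations consistently is to keep the strategy choice in a single place. The sketch below is an illustration under assumptions: `MessageKind`, `bargeInFor`, and `speak` are hypothetical names, and the preset values are simply taken from the examples and ranges on this page, not from any official defaults.

```typescript
interface BargeInConfig {
  strategy: "none" | "manual" | "minimum_characters" | "immediate";
  minimum_characters?: number;
  allow_after_ms?: number;
}

// Hypothetical categories of assistant messages.
type MessageKind = "critical" | "conversational" | "balanced" | "custom";

// Maps a message category to a barge-in preset following the best practices above.
function bargeInFor(kind: MessageKind): BargeInConfig {
  switch (kind) {
    case "critical":
      return { strategy: "none" }; // must be heard in full
    case "conversational":
      return { strategy: "immediate", allow_after_ms: 500 }; // buffer against noise
    case "balanced":
      return { strategy: "minimum_characters", minimum_characters: 3, allow_after_ms: 500 };
    case "custom":
      return { strategy: "manual" }; // interruption is driven by your own logic
  }
}

// Builds a speak action in the shape used by the examples on this page.
function speak(sessionId: string, text: string, kind: MessageKind) {
  return {
    type: "speak" as const,
    session_id: sessionId,
    text,
    barge_in: bargeInFor(kind),
  };
}
```

A handler could then return `speak(event.session.id, "Your verification code is 1-2-3-4-5-6.", "critical")` and leave the tuning of each preset to one function.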
Related: VAD Configuration
Barge-in controls whether the caller may interrupt the assistant while it is speaking. The related VAD Configuration controls how long the caller may pause before their turn is considered finished. Both can be set on the same speak action.
Next Steps
- Action Types - Complete action reference
- VAD Configuration - Tune end-of-turn silence
- API Reference - Full API documentation