User Speak Event
Triggered when the user speaks and speech-to-text completes.
Event Structure
```json
{
  "type": "user_speak",
  "text": "Hello, I need help",
  "barged_in": false,
  "session": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "account_id": "account-123",
    "phone_number": "+1234567890"
  }
}
```
Barge-In Detection
When a user interrupts the assistant mid-speech, the event includes barged_in: true:
```json
{
  "type": "user_speak",
  "text": "Wait",
  "barged_in": true,
  "session": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "account_id": "account-123",
    "phone_number": "+1234567890"
  }
}
```
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Always "user_speak" |
| text | string | Yes | Recognized speech text |
| barged_in | boolean | No | true if the user interrupted the assistant; false or omitted otherwise |
| session.id | string (UUID) | Yes | Session identifier |
| session.account_id | string | Yes | Account identifier |
| session.phone_number | string | Yes | Phone number for this flow session |
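The field rules in the table can be checked before any business logic runs. The following is a minimal sketch, not part of the platform's SDK: `UserSpeakEvent` and `parse_user_speak` are hypothetical names, and the only rules enforced are the ones listed above (required strings, optional boolean `barged_in` defaulting to false).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserSpeakEvent:
    """Typed view of a user_speak event (hypothetical helper type)."""
    text: str
    barged_in: bool
    session_id: str
    account_id: str
    phone_number: str


def parse_user_speak(payload: dict) -> UserSpeakEvent:
    """Validate a decoded webhook body against the Fields table."""
    if payload.get("type") != "user_speak":
        raise ValueError("not a user_speak event")

    text = payload.get("text")
    if not isinstance(text, str):
        raise ValueError("missing or non-string field: text")

    session = payload.get("session")
    if not isinstance(session, dict):
        raise ValueError("missing or non-object field: session")

    required = {}
    for key in ("id", "account_id", "phone_number"):
        value = session.get(key)
        if not isinstance(value, str):
            raise ValueError(f"missing or non-string field: session.{key}")
        required[key] = value

    # barged_in is optional; an omitted flag is treated as "no interruption".
    barged_in = payload.get("barged_in", False)
    if not isinstance(barged_in, bool):
        raise ValueError("barged_in must be a boolean")

    return UserSpeakEvent(
        text=text,
        barged_in=barged_in,
        session_id=required["id"],
        account_id=required["account_id"],
        phone_number=required["phone_number"],
    )
```

A handler could call `parse_user_speak(request.json)` and answer with an error status when `ValueError` is raised, so malformed payloads never reach the conversation logic.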
Response
You can return any action or 204 No Content. Common responses:
Speak Back
```json
{
  "type": "speak",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "text": "I understand. How can I help you?"
}
```
Transfer Call
```json
{
  "type": "transfer",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "target_phone_number": "+1234567890",
  "caller_id_name": "Support",
  "caller_id_number": "+1234567890"
}
```
Hangup
```json
{
  "type": "hangup",
  "session_id": "550e8400-e29b-41d4-a716-446655440000"
}
```
Examples
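The three response shapes shown under Response repeat the same field names, so handlers can stay shorter if the actions are built by small helpers. This is a framework-neutral sketch: the function names `speak`, `transfer`, and `hangup` are hypothetical, and all transfer fields are treated as required only because the sample payload includes them all.

```python
def speak(session_id: str, text: str) -> dict:
    """Build a speak action for the given session."""
    return {"type": "speak", "session_id": session_id, "text": text}


def transfer(session_id: str, target_phone_number: str,
             caller_id_name: str, caller_id_number: str) -> dict:
    """Build a transfer action with the fields used in the sample payload."""
    return {
        "type": "transfer",
        "session_id": session_id,
        "target_phone_number": target_phone_number,
        "caller_id_name": caller_id_name,
        "caller_id_number": caller_id_number,
    }


def hangup(session_id: str) -> dict:
    """Build a hangup action for the given session."""
    return {"type": "hangup", "session_id": session_id}
```

Each helper returns a plain dictionary, so the result can be passed directly to a JSON serializer such as Flask's `jsonify`.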
Python (Flask)
```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/webhook', methods=['POST'])
def webhook():
    event = request.json
    if event['type'] == 'user_speak':
        session_id = event['session']['id']
        user_text = event['text'].lower()
        # End the call when the user says goodbye
        if 'goodbye' in user_text or 'bye' in user_text:
            return jsonify({
                'type': 'hangup',
                'session_id': session_id
            })
        # Hand the call over when the user asks for a transfer
        if 'transfer' in user_text:
            return jsonify({
                'type': 'transfer',
                'session_id': session_id,
                'target_phone_number': '+1234567890',
                'caller_id_name': 'Support',
                'caller_id_number': '+1234567890'
            })
        # Otherwise echo the recognized text back
        return jsonify({
            'type': 'speak',
            'session_id': session_id,
            'text': f"You said: {event['text']}"
        })
    # Any other event type: no action
    return '', 204
```
Node.js (Express)
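```javascript
const express = require('express');

const app = express();
app.use(express.json()); // Parse JSON bodies so req.body is populated

app.post('/webhook', (req, res) => {
  const event = req.body;
  if (event.type === 'user_speak') {
    const userText = event.text.toLowerCase();
    // End the call when the user says goodbye
    if (userText.includes('goodbye') || userText.includes('bye')) {
      return res.json({
        type: 'hangup',
        session_id: event.session.id
      });
    }
    // Hand the call over when the user asks for a transfer
    if (userText.includes('transfer')) {
      return res.json({
        type: 'transfer',
        session_id: event.session.id,
        target_phone_number: '+1234567890',
        caller_id_name: 'Support',
        caller_id_number: '+1234567890'
      });
    }
    // Otherwise echo the recognized text back
    return res.json({
      type: 'speak',
      session_id: event.session.id,
      text: `You said: ${event.text}`
    });
  }
  // Any other event type: no action
  res.status(204).send();
});
```
Go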
```go
import (
	"encoding/json"
	"net/http"
	"strings"
)

func webhook(w http.ResponseWriter, r *http.Request) {
	var event map[string]interface{}
	if err := json.NewDecoder(r.Body).Decode(&event); err != nil {
		http.Error(w, "invalid JSON body", http.StatusBadRequest)
		return
	}
	if event["type"] == "user_speak" {
		// Checked assertions avoid a panic if a field is missing or mistyped
		session, _ := event["session"].(map[string]interface{})
		rawText, _ := event["text"].(string)
		text := strings.ToLower(rawText)
		// End the call when the user says goodbye
		if strings.Contains(text, "goodbye") || strings.Contains(text, "bye") {
			action := map[string]interface{}{
				"type":       "hangup",
				"session_id": session["id"],
			}
			w.Header().Set("Content-Type", "application/json")
			json.NewEncoder(w).Encode(action)
			return
		}
		// Otherwise echo the recognized text back
		action := map[string]interface{}{
			"type":       "speak",
			"session_id": session["id"],
			"text":       "You said: " + rawText,
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(action)
		return
	}
	// Any other event type: no action
	w.WriteHeader(http.StatusNoContent)
}
```
Handling Barge-In
You can check the barged_in flag to provide special handling for interruptions:
```python
@app.route('/webhook', methods=['POST'])
def webhook():
    event = request.json
    if event['type'] == 'user_speak':
        if event.get('barged_in'):
            # User interrupted - acknowledge quickly
            return jsonify({
                'type': 'speak',
                'session_id': event['session']['id'],
                'text': "Yes, I'm listening."
            })
        # Normal speech processing; process_user_input is your own
        # function and must return a valid response
        return process_user_input(event['text'])
    # Any other event type: no action
    return '', 204
```
See the Barge-In Best Practices Guide for detailed strategies.
Use Cases
- Process user input - Understand what the user wants
- Detect interruptions - Handle barge-in with the barged_in flag
- Route conversations - Direct to appropriate handler
- Collect information - Gather details from user
- Transfer calls - Route to human agents
- End calls - Handle goodbye messages
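Routing, transferring, and ending calls from the list above can share one dispatch table. The sketch below is an illustration under assumptions: `tokenize`, `route`, `ROUTES`, and the keyword sets are hypothetical, and the transfer values are the placeholder numbers from the sample payloads. It matches whole words rather than substrings, so an utterance such as "I was already transferred twice" does not trigger a transfer.

```python
import re


def tokenize(text: str) -> set:
    """Lower-case the utterance and split it into whole words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def end_call(event: dict) -> dict:
    return {"type": "hangup", "session_id": event["session"]["id"]}


def transfer_call(event: dict) -> dict:
    # Placeholder destination and caller ID, copied from the sample payload.
    return {
        "type": "transfer",
        "session_id": event["session"]["id"],
        "target_phone_number": "+1234567890",
        "caller_id_name": "Support",
        "caller_id_number": "+1234567890",
    }


def echo(event: dict) -> dict:
    return {
        "type": "speak",
        "session_id": event["session"]["id"],
        "text": f"You said: {event['text']}",
    }


# Ordered routes: the first keyword set that shares a word with the
# utterance wins. The keywords are illustrative, not a platform feature.
ROUTES = [
    (frozenset({"goodbye", "bye"}), end_call),
    (frozenset({"transfer"}), transfer_call),
]


def route(event: dict) -> dict:
    """Pick an action for a user_speak event by whole-word keyword match."""
    spoken = tokenize(event["text"])
    for keywords, handler in ROUTES:
        if spoken & keywords:
            return handler(event)
    return echo(event)
```

Adding a new intent then means appending one entry to `ROUTES` instead of adding another `if` branch to the webhook handler.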
Best Practices
- Process quickly - Respond within 1-2 seconds
- Handle barge-in gracefully - Check the barged_in flag for interruptions
- Handle errors - Always return a valid response
- Log interactions - Track conversation for analytics
- Validate input - Check for expected patterns
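Several of these practices can be combined in one wrapper around the conversation logic. This is a minimal sketch under assumptions: `safe_handle`, `FALLBACK_TEXT`, and the logger name are hypothetical, the fallback wording is only an example, and the 2-second budget simply mirrors the response-time guideline above.

```python
import logging
import time

logger = logging.getLogger("webhook")

# Example wording only; choose a prompt that fits your assistant.
FALLBACK_TEXT = "Sorry, something went wrong. Could you repeat that?"


def safe_handle(event, handler, budget_seconds=2.0):
    """Run handler(event) and always produce a valid result.

    Returns the handler's action, None (meaning 204 No Content), or a
    fallback speak action if the handler raises.
    """
    session_id = event.get("session", {}).get("id")
    started = time.monotonic()
    try:
        action = handler(event)
    except Exception:
        # Log the traceback, then keep the call alive with a spoken fallback.
        logger.exception("handler failed for session %s", session_id)
        action = {
            "type": "speak",
            "session_id": session_id,
            "text": FALLBACK_TEXT,
        }
    elapsed = time.monotonic() - started
    if elapsed > budget_seconds:
        logger.warning("slow response for session %s: %.3fs", session_id, elapsed)
    # One log line per interaction, usable for later analytics.
    logger.info(
        "session=%s text=%r action=%s elapsed=%.3fs",
        session_id,
        event.get("text"),
        action["type"] if action else None,
        elapsed,
    )
    return action
```

In a Flask route, the result can be turned into a response with `jsonify(action)` when it is a dictionary and `('', 204)` when it is `None`.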
Next Steps
- Assistant Speak Event - Track when assistant speaks
- Action Types - All available actions
- Event Flow - Understand the complete flow