User Speak Event

Triggered when the user speaks and speech-to-text completes.

Event Structure

json
{
  "type": "user_speak",
  "text": "Hello, I need help",
  "barged_in": false,
  "session": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "account_id": "account-123",
    "phone_number": "+1234567890"
  }
}

Barge-In Detection

When a user interrupts the assistant mid-speech, the event sets barged_in to true:

json
{
  "type": "user_speak",
  "text": "Wait",
  "barged_in": true,
  "session": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "account_id": "account-123",
    "phone_number": "+1234567890"
  }
}

Fields

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Always "user_speak" |
| text | string | Yes | Recognized speech text |
| barged_in | boolean | No | true if the user interrupted the assistant; false or omitted otherwise |
| session.id | string (UUID) | Yes | Session identifier |
| session.account_id | string | Yes | Account identifier |
| session.phone_number | string | Yes | Phone number for this flow session |
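
As an illustration of the fields listed above, the following is a minimal sketch of validating a decoded payload before acting on it. The names UserSpeakEvent and parse_user_speak are hypothetical helpers, not part of the API; the only details taken from this page are the field names, their types, and the fact that barged_in may be omitted.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class UserSpeakEvent:
    """Typed view of a user_speak payload (hypothetical helper)."""
    text: str
    barged_in: bool
    session_id: str
    account_id: str
    phone_number: str


def parse_user_speak(payload: Mapping) -> UserSpeakEvent:
    """Check required fields and default the optional barged_in flag."""
    if payload.get("type") != "user_speak":
        raise ValueError("not a user_speak event")

    text = payload.get("text")
    if not isinstance(text, str):
        raise ValueError("missing or non-string 'text'")

    session = payload.get("session")
    if not isinstance(session, Mapping):
        raise ValueError("missing 'session' object")

    required = ("id", "account_id", "phone_number")
    missing = [key for key in required if not isinstance(session.get(key), str)]
    if missing:
        raise ValueError("missing session fields: " + ", ".join(missing))

    return UserSpeakEvent(
        text=text,
        # barged_in is optional; treat an omitted flag as False.
        barged_in=payload.get("barged_in") is True,
        session_id=session["id"],
        account_id=session["account_id"],
        phone_number=session["phone_number"],
    )
```

A handler could call parse_user_speak on the request body and fall back to 204 No Content when it raises ValueError.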

Response

You can return any action or 204 No Content. Common responses:

Speak Back

json
{
  "type": "speak",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "text": "I understand. How can I help you?"
}

Transfer Call

json
{
  "type": "transfer",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "target_phone_number": "+1234567890",
  "caller_id_name": "Support",
  "caller_id_number": "+1234567890"
}

Hangup

json
{
  "type": "hangup",
  "session_id": "550e8400-e29b-41d4-a716-446655440000"
}
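
The three responses above share the same shape, so one way to avoid typos in key names is to build them with small helper functions. This is only a sketch: the function names speak, transfer, and hangup are hypothetical, and the keys are copied from the JSON examples in this section.

```python
from typing import Dict


def speak(session_id: str, text: str) -> Dict[str, str]:
    """Build a speak action for the given session."""
    return {"type": "speak", "session_id": session_id, "text": text}


def transfer(
    session_id: str,
    target_phone_number: str,
    caller_id_name: str,
    caller_id_number: str,
) -> Dict[str, str]:
    """Build a transfer action using the keys shown in the example above."""
    return {
        "type": "transfer",
        "session_id": session_id,
        "target_phone_number": target_phone_number,
        "caller_id_name": caller_id_name,
        "caller_id_number": caller_id_number,
    }


def hangup(session_id: str) -> Dict[str, str]:
    """Build a hangup action for the given session."""
    return {"type": "hangup", "session_id": session_id}
```

Each helper returns a plain dictionary, so the result can be passed straight to a JSON serializer such as Flask's jsonify.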

Examples

Python (Flask)

python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def webhook():
    event = request.json

    if event['type'] == 'user_speak':
        session_id = event['session']['id']
        user_text = event['text'].lower()

        if 'goodbye' in user_text or 'bye' in user_text:
            return jsonify({
                'type': 'hangup',
                'session_id': session_id
            })

        if 'transfer' in user_text:
            return jsonify({
                'type': 'transfer',
                'session_id': session_id,
                'target_phone_number': '+1234567890',
                'caller_id_name': 'Support',
                'caller_id_number': '+1234567890'
            })

        return jsonify({
            'type': 'speak',
            'session_id': session_id,
            'text': f"You said: {event['text']}"
        })

    return '', 204

Node.js (Express)

javascript
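const express = require('express');

const app = express();
app.use(express.json()); // parse JSON bodies so req.body is populated

app.post('/webhook', (req, res) => {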
  const event = req.body;

  if (event.type === 'user_speak') {
    const userText = event.text.toLowerCase();

    if (userText.includes('goodbye') || userText.includes('bye')) {
      return res.json({
        type: 'hangup',
        session_id: event.session.id
      });
    }

    if (userText.includes('transfer')) {
      return res.json({
        type: 'transfer',
        session_id: event.session.id,
        target_phone_number: '+1234567890',
        caller_id_name: 'Support',
        caller_id_number: '+1234567890'
      });
    }

    return res.json({
      type: 'speak',
      session_id: event.session.id,
      text: `You said: ${event.text}`
    });
  }

  res.status(204).send();
});

Go

go
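import (
    "encoding/json"
    "net/http"
    "strings"
)

func webhook(w http.ResponseWriter, r *http.Request) {
    var event map[string]interface{}
    if err := json.NewDecoder(r.Body).Decode(&event); err != nil {
        http.Error(w, "invalid JSON", http.StatusBadRequest)
        return
    }

    if event["type"] == "user_speak" {
        // Checked type assertions avoid a panic on malformed payloads.
        session, okSession := event["session"].(map[string]interface{})
        rawText, okText := event["text"].(string)
        if !okSession || !okText {
            http.Error(w, "malformed user_speak event", http.StatusBadRequest)
            return
        }
        text := strings.ToLower(rawText)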

        if strings.Contains(text, "goodbye") || strings.Contains(text, "bye") {
            action := map[string]interface{}{
                "type":       "hangup",
                "session_id": session["id"],
            }
            w.Header().Set("Content-Type", "application/json")
            json.NewEncoder(w).Encode(action)
            return
        }

        action := map[string]interface{}{
            "type":       "speak",
            "session_id": session["id"],
            "text":       "You said: " + event["text"].(string),
        }
        w.Header().Set("Content-Type", "application/json")
        json.NewEncoder(w).Encode(action)
        return
    }

    w.WriteHeader(http.StatusNoContent)
}

Handling Barge-In

Check the barged_in flag to handle interruptions separately from normal speech. The field is optional, so read it with a default rather than indexing it directly:

python
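@app.route('/webhook', methods=['POST'])
def webhook():
    event = request.json

    if event['type'] == 'user_speak':
        if event.get('barged_in'):
            # User interrupted - acknowledge quickly
            return jsonify({
                'type': 'speak',
                'session_id': event['session']['id'],
                'text': "Yes, I'm listening."
            })

        # Normal speech processing; process_user_input is your own
        # handler and must return a valid response
        return process_user_input(event['text'])

    # Other event types need no action here
    return '', 204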

See the Barge-In Best Practices Guide for detailed strategies.

Use Cases

  • Process user input - Understand what the user wants
  • Detect interruptions - Handle barge-in with the barged_in flag
  • Route conversations - Direct to appropriate handler
  • Collect information - Gather details from user
  • Transfer calls - Route to human agents
  • End calls - Handle goodbye messages
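
The routing use case above can be sketched as a small table of patterns mapped to actions. Everything named here (ROUTES, route, the keyword lists, and the fallback wording) is an assumption for illustration; the action shapes and the placeholder phone number are taken from the response examples earlier on this page. Whole-word matching is a design choice of this sketch, in contrast to the substring checks used in the examples above.

```python
import re
from typing import Any, Dict

Action = Dict[str, Any]


def _hangup(session_id: str) -> Action:
    return {"type": "hangup", "session_id": session_id}


def _transfer(session_id: str) -> Action:
    return {
        "type": "transfer",
        "session_id": session_id,
        "target_phone_number": "+1234567890",  # placeholder number
        "caller_id_name": "Support",
        "caller_id_number": "+1234567890",  # placeholder number
    }


# Ordered list of (pattern, action builder); the first match wins.
ROUTES = [
    (re.compile(r"\b(goodbye|bye)\b"), _hangup),
    (re.compile(r"\b(transfer|agent|human)\b"), _transfer),
]


def route(event: Dict[str, Any]) -> Action:
    """Pick an action for a user_speak event based on its recognized text."""
    session_id = event["session"]["id"]
    text = event["text"].lower()
    for pattern, build_action in ROUTES:
        if pattern.search(text):
            return build_action(session_id)
    # No route matched: ask the user to rephrase.
    return {
        "type": "speak",
        "session_id": session_id,
        "text": "Sorry, could you rephrase that?",
    }
```

Adding a new intent then means appending one entry to ROUTES rather than editing a chain of if statements.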

Best Practices

  1. Process quickly - Respond within 1-2 seconds
  2. Handle barge-in gracefully - Check the barged_in flag for interruptions
  3. Handle errors - Always return a valid action or 204 No Content, even when your own handler fails
  4. Log interactions - Track conversation for analytics
  5. Validate input - Check for expected patterns
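
Several of these practices can be combined in one wrapper around your handler. The sketch below is an assumption-laden illustration, not a prescribed pattern: handle_safely, the logger name, the fallback wording, and the 2-second warning threshold (taken from the upper end of the response-time guidance above) are all choices made for this example.

```python
import logging
import time
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger("user_speak")

Action = Dict[str, Any]
SLOW_THRESHOLD_SECONDS = 2.0  # upper end of the 1-2 second guidance


def handle_safely(
    event: Dict[str, Any],
    handler: Callable[[Dict[str, Any]], Optional[Action]],
) -> Optional[Action]:
    """Run handler, log the outcome, and fall back to a speak action on error.

    A return value of None means "no action" (respond with 204 No Content).
    """
    session_id = event.get("session", {}).get("id")
    started = time.monotonic()
    try:
        action = handler(event)
    except Exception:
        # Log the traceback, then still give the caller a valid response.
        logger.exception("handler failed for session %s", session_id)
        action = {
            "type": "speak",
            "session_id": session_id,
            "text": "Sorry, something went wrong. Could you repeat that?",
        }
    elapsed = time.monotonic() - started

    logger.info(
        "session=%s barged_in=%s elapsed=%.3fs action=%s",
        session_id,
        bool(event.get("barged_in")),
        elapsed,
        action["type"] if action else "none",
    )
    if elapsed > SLOW_THRESHOLD_SECONDS:
        logger.warning("slow response for session %s: %.3fs", session_id, elapsed)
    return action
```

In a web framework, the route function would call handle_safely and serialize the returned dictionary, or send 204 No Content when the result is None.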

Next Steps