User Speech Started Event

Triggered when the user's speech is first detected, before the full transcript is available. Detection relies on Voice Activity Detection (VAD), and the event typically fires 20–120 ms after the user starts speaking.

WebSocket only

This event is only delivered via WebSocket connections. It is not sent to HTTP webhook endpoints.

Event Structure

json

{
  "type": "user_speech_started",
  "session": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "account_id": "account-123",
    "phone_number": "1234567890",
    "direction": "inbound",
    "from_phone_number": "9876543210",
    "to_phone_number": "1234567890"
  }
}

Fields

Field	Type	Required	Description
`type`	string	Yes	Always `"user_speech_started"`
`session.id`	string (UUID)	Yes	Session identifier
`session.account_id`	string	Yes	Account identifier
`session.phone_number`	string	Yes	Phone number for this flow session
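
The documented fields can be checked at runtime before the event is used. The following is a minimal sketch, not part of the SDK: the `UserSpeechStartedEvent` interface and the `isUserSpeechStarted` helper are hypothetical names, and the `direction`, `from_phone_number` and `to_phone_number` properties are marked optional only because they appear in the example payload but not in the table above.

```typescript
// Hypothetical type for the event; not exported by any SDK.
interface UserSpeechStartedEvent {
  type: "user_speech_started";
  session: {
    id: string;
    account_id: string;
    phone_number: string;
    // Seen in the example payload; treated as optional here (assumption).
    direction?: string;
    from_phone_number?: string;
    to_phone_number?: string;
  };
}

// Type guard that checks only the fields listed as required in the table.
function isUserSpeechStarted(value: unknown): value is UserSpeechStartedEvent {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { type?: unknown; session?: unknown };
  if (candidate.type !== "user_speech_started") return false;
  if (typeof candidate.session !== "object" || candidate.session === null) {
    return false;
  }
  const session = candidate.session as Record<string, unknown>;
  return (
    typeof session.id === "string" &&
    typeof session.account_id === "string" &&
    typeof session.phone_number === "string"
  );
}
```

A message handler could call `isUserSpeechStarted(JSON.parse(raw))` and ignore anything that does not match.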

Behaviour

Fires at most once per speech turn; subsequent partial transcripts within the same turn do not trigger it again
Resets automatically after the corresponding `user_speak` event is received, so it fires again on the next speech turn
No response or actions are expected; the service ignores any payload returned for this event
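
Because the event fires once per turn and the turn ends with `user_speak`, a receiver can derive a simple per-session "speaking" flag from the two events. The sketch below is one possible way to do that, assuming only the event shapes shown on this page; `SpeakingTracker` and its methods are hypothetical names, not SDK APIs.

```typescript
// Minimal shape shared by both events, as shown in the examples on this page.
type FlowEvent = { type: string; session: { id: string } };

// Hypothetical helper that tracks which sessions currently have a speaking user.
class SpeakingTracker {
  private readonly speaking = new Set<string>();

  handle(event: FlowEvent): void {
    if (event.type === "user_speech_started") {
      // Start of a speech turn.
      this.speaking.add(event.session.id);
    } else if (event.type === "user_speak") {
      // The full transcript has arrived, so the turn is over.
      this.speaking.delete(event.session.id);
    }
  }

  isSpeaking(sessionId: string): boolean {
    return this.speaking.has(sessionId);
  }

  // Call this when a session ends so stale entries do not accumulate.
  clear(sessionId: string): void {
    this.speaking.delete(sessionId);
  }
}
```

Every incoming event can be passed to `handle`; a dashboard would then read `isSpeaking(sessionId)` to render its indicator.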

Use Cases

Show "user is speaking" indicators in real-time dashboards or call monitoring UIs
Start latency optimisations early — e.g. pre-warm LLM context or fetch data before the full transcript arrives
Interrupt ongoing workflows — cancel queued background processing when the user begins to speak
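
The cancellation use case could be sketched with the standard `AbortController`. This is an illustration under assumptions rather than a prescribed pattern: `BackgroundJobs`, `jobs` and `onEvent` are hypothetical names, and the sketch allows at most one cancellable job per session.

```typescript
// Hypothetical registry of cancellable background jobs, keyed by session ID.
class BackgroundJobs {
  private readonly controllers = new Map<string, AbortController>();

  // Registers a job for the session and returns the signal it should observe.
  start(sessionId: string): AbortSignal {
    this.cancel(sessionId); // at most one job per session in this sketch
    const controller = new AbortController();
    this.controllers.set(sessionId, controller);
    return controller.signal;
  }

  // Aborts the session's job, if any. Returns true when something was cancelled.
  cancel(sessionId: string): boolean {
    const controller = this.controllers.get(sessionId);
    if (!controller) return false;
    controller.abort();
    this.controllers.delete(sessionId);
    return true;
  }
}

const jobs = new BackgroundJobs();

// Intended to be called for every parsed event received on the WebSocket.
function onEvent(event: { type: string; session: { id: string } }): void {
  if (event.type === "user_speech_started") {
    // The user has begun to speak: stop queued work for this session early.
    jobs.cancel(event.session.id);
  }
}
```

Background work would pass the returned signal to APIs that accept one (for example `fetch`) or check `signal.aborted` between steps.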

Example (TypeScript SDK)

typescript

import { AiFlowAssistant } from '@sipgate/ai-flow-sdk';
import WebSocket from 'ws';

const assistant = AiFlowAssistant.create({
  onUserSpeechStarted: async (event) => {
    console.log('User started speaking, session:', event.session.id);
    // No return value needed
  },

  onUserSpeak: async (event) => {
    return `You said: ${event.text}`;
  },
});

const wss = new WebSocket.Server({ port: 3000 });
wss.on('connection', (ws) => {
  ws.on('message', assistant.ws(ws));
});

Example (Raw WebSocket)

javascript

ws.on('message', (data) => {
  const event = JSON.parse(data.toString());

  if (event.type === 'user_speech_started') {
    console.log('User started speaking in session', event.session.id);
    // No response needed — the service ignores any reply
  }

  if (event.type === 'user_speak') {
    ws.send(JSON.stringify({
      type: 'speak',
      session_id: event.session.id,
      text: `You said: ${event.text}`,
    }));
  }
});

Next Steps

User Speak Event - Full transcript after STT completes
Barge-In Guide - Interrupting assistant speech
WebSocket Integration - How to connect via WebSocket
