User Speech Started Event
Triggered when the user's speech is first detected — before the full transcript is available. Uses Voice Activity Detection (VAD) and typically fires 20–120 ms after the user starts speaking.
WebSocket only
This event is only delivered via WebSocket connections. It is not sent to HTTP webhook endpoints.
Event Structure
```json
{
  "type": "user_speech_started",
  "session": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "account_id": "account-123",
    "phone_number": "+1234567890",
    "direction": "inbound",
    "from_phone_number": "+9876543210",
    "to_phone_number": "+1234567890"
  }
}
```
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| `type` | string | Yes | Always `"user_speech_started"` |
| `session.id` | string (UUID) | Yes | Session identifier |
| `session.account_id` | string | Yes | Account identifier |
| `session.phone_number` | string | Yes | Phone number for this flow session |
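For typed handlers, the fields above can be mirrored in a small type plus a runtime check. This is a minimal sketch, not the SDK's own typings: the names `FlowSession`, `UserSpeechStartedEvent`, and `isUserSpeechStarted` are hypothetical, and the three extra session fields from the sample payload are treated as optional because the table does not list them.

```typescript
// Session shape taken from the sample payload. Only the first three fields
// are documented as required; the rest are assumed optional here.
interface FlowSession {
  id: string;
  account_id: string;
  phone_number: string;
  direction?: string;
  from_phone_number?: string;
  to_phone_number?: string;
}

interface UserSpeechStartedEvent {
  type: 'user_speech_started';
  session: FlowSession;
}

// Hypothetical helper: narrows an unknown parsed message to the event type
// by checking the discriminator and the three required session fields.
function isUserSpeechStarted(value: unknown): value is UserSpeechStartedEvent {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as { type?: unknown; session?: unknown };
  if (candidate.type !== 'user_speech_started') return false;
  if (typeof candidate.session !== 'object' || candidate.session === null) return false;
  const session = candidate.session as Record<string, unknown>;
  return (
    typeof session['id'] === 'string' &&
    typeof session['account_id'] === 'string' &&
    typeof session['phone_number'] === 'string'
  );
}
```

A message handler could call `isUserSpeechStarted(JSON.parse(raw))` before reading `event.session.id`, so malformed messages are skipped rather than causing a runtime error.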
Behaviour
- Fires at most once per speech turn — subsequent partial transcripts within the same turn are suppressed
- Resets automatically after the corresponding `user_speak` event is received, so it fires again on the next speech turn
- No response or actions are expected; the service ignores any payload returned for this event
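On the receiving side, the once-per-turn behaviour maps naturally onto a per-session flag: set it on `user_speech_started`, clear it on `user_speak`. The sketch below assumes only the `type` and `session.id` fields shown in the event structure; `SpeakingTracker` and `FlowEvent` are hypothetical names, not part of the SDK.

```typescript
// Minimal event shape needed for tracking; other fields are ignored.
type FlowEvent = { type: string; session: { id: string } };

class SpeakingTracker {
  // IDs of sessions whose user is currently in a speech turn.
  private readonly speaking = new Set<string>();

  // Returns true when the event changed the state for its session.
  handle(event: FlowEvent): boolean {
    const id = event.session.id;
    if (event.type === 'user_speech_started') {
      // Defensive: ignore a duplicate start within the same turn.
      if (this.speaking.has(id)) return false;
      this.speaking.add(id);
      return true;
    }
    if (event.type === 'user_speak') {
      // The turn is complete; the next start event begins a new turn.
      return this.speaking.delete(id);
    }
    return false;
  }

  isSpeaking(sessionId: string): boolean {
    return this.speaking.has(sessionId);
  }
}
```

A dashboard could call `handle` for every incoming message and redraw its indicator only when the return value is `true`.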
Use Cases
- Show "user is speaking" indicators in real-time dashboards or call monitoring UIs
- Start latency optimisations early — e.g. pre-warm LLM context or fetch data before the full transcript arrives
- Interrupt ongoing workflows — cancel queued background processing when the user begins to speak
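The pre-warming use case can be sketched as a per-session cache of in-flight work: start loading when `user_speech_started` arrives, then reuse the result when `user_speak` delivers the transcript. `PrefetchCache` and its `load` callback are hypothetical; what is actually loaded (LLM context, customer data) is left to the caller, and error handling is omitted for brevity.

```typescript
// Hypothetical helper: holds one in-flight load per session.
class PrefetchCache<T> {
  private readonly pending = new Map<string, Promise<T>>();
  private readonly load: (sessionId: string) => Promise<T>;

  constructor(load: (sessionId: string) => Promise<T>) {
    this.load = load;
  }

  // Call on user_speech_started: kicks off the load at most once per turn.
  start(sessionId: string): void {
    if (!this.pending.has(sessionId)) {
      this.pending.set(sessionId, this.load(sessionId));
    }
  }

  // Call on user_speak: reuses the in-flight load, or starts one if the
  // early event never arrived (for example on an HTTP webhook integration).
  async take(sessionId: string): Promise<T> {
    const promise = this.pending.get(sessionId) ?? this.load(sessionId);
    this.pending.delete(sessionId);
    return promise;
  }
}
```

In a handler pair this might look like `cache.start(event.session.id)` in the speech-started callback and `const context = await cache.take(event.session.id)` in the speak callback. Because `take` falls back to loading on demand, the same code path still works when the early event is not delivered.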
Example (TypeScript SDK)
```typescript
import { AiFlowAssistant } from '@sipgate/ai-flow-sdk';
import WebSocket from 'ws';

const assistant = AiFlowAssistant.create({
  onUserSpeechStarted: async (event) => {
    console.log('User started speaking, session:', event.session.id);
    // No return value needed
  },
  onUserSpeak: async (event) => {
    return `You said: ${event.text}`;
  },
});

const wss = new WebSocket.Server({ port: 3000 });
wss.on('connection', (ws) => {
  ws.on('message', assistant.ws(ws));
});
```
Example (Raw WebSocket)
```javascript
// `ws` is an already-connected socket, e.g. the argument of a
// wss.on('connection', (ws) => { ... }) handler.
ws.on('message', (data) => {
  const event = JSON.parse(data.toString());

  if (event.type === 'user_speech_started') {
    console.log('User started speaking in session', event.session.id);
    // No response needed; the service ignores any reply
  }

  if (event.type === 'user_speak') {
    // Reply to the full transcript with a speak action
    ws.send(JSON.stringify({
      type: 'speak',
      session_id: event.session.id,
      text: `You said: ${event.text}`,
    }));
  }
});
```
Next Steps
- User Speak Event - Full transcript after STT completes
- Barge-In Guide - Interrupting assistant speech
- WebSocket Integration - How to connect via WebSocket