TTS Providers
Configure text-to-speech providers for different voices and languages.
Overview
The AI Flow service supports multiple TTS providers. Configure them per action in the tts field.
Supported Providers
- Azure Cognitive Services - 400+ voices in 140+ languages
- ElevenLabs - Ultra-realistic conversational voices
Azure Cognitive Services
Configuration
json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Hello!",
  "tts": {
    "provider": "azure",
    "language": "en-US",
    "voice": "en-US-JennyNeural"
  }
}
Popular Voices
| Language | Voice Name | Gender | Description |
|---|---|---|---|
| en-US | en-US-JennyNeural | Female | Friendly, professional |
| en-US | en-US-GuyNeural | Male | Clear, neutral |
| en-GB | en-GB-SoniaNeural | Female | British, professional |
| en-GB | en-GB-RyanNeural | Male | British, friendly |
| de-DE | de-DE-KatjaNeural | Female | Professional, clear |
| de-DE | de-DE-ConradNeural | Male | Deep, authoritative |
Full Voice List: See Azure TTS documentation
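The table above can be turned into a small lookup so callers pick a voice by language and gender instead of hard-coding voice names. This is a minimal sketch: `AZURE_VOICES` and `azure_tts` are hypothetical names introduced here for illustration and are not part of the AI Flow service.

```python
# Voices copied from the "Popular Voices" table; extend with entries
# from the full Azure voice list as needed.
AZURE_VOICES = {
    ("en-US", "female"): "en-US-JennyNeural",
    ("en-US", "male"): "en-US-GuyNeural",
    ("en-GB", "female"): "en-GB-SoniaNeural",
    ("en-GB", "male"): "en-GB-RyanNeural",
    ("de-DE", "female"): "de-DE-KatjaNeural",
    ("de-DE", "male"): "de-DE-ConradNeural",
}


def azure_tts(language, gender):
    """Hypothetical helper: build the `tts` field for an Azure voice."""
    key = (language, gender.lower())
    if key not in AZURE_VOICES:
        raise ValueError(f"no voice listed for {language!r} / {gender!r}")
    return {
        "provider": "azure",
        "language": language,
        "voice": AZURE_VOICES[key],
    }


# Usage: embed the result in a speak action.
action = {
    "type": "speak",
    "session_id": "session-123",
    "text": "Hello!",
    "tts": azure_tts("en-GB", "male"),
}
```

Keeping the mapping in one place means a voice can be swapped without touching the code that builds actions.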
ElevenLabs
Configuration
json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Hello!",
  "tts": {
    "provider": "eleven_labs",
    "voice": "21m00Tcm4TlvDq8ikWAM"
  }
}
Voice IDs
The voice field is optional and takes an ElevenLabs voice ID as a string, for example "21m00Tcm4TlvDq8ikWAM" for Rachel. If it is omitted, the first available voice is used.
Minimal Configuration (uses default voice):
json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Hello!",
  "tts": {
    "provider": "eleven_labs"
  }
}
Available Voices
| Voice Name | ID | Description |
|---|---|---|
| Rachel | 21m00Tcm4TlvDq8ikWAM | Matter-of-fact, personable woman. Great for conversational use cases. |
| Drew | 29vD33N1CtxCmqQRPOHJ | - |
| Clyde | 2EiwWnXFnvU5JabPnv8n | Great for character use-cases |
| Paul | 5Q0t7uMcjvnagumLfvZi | - |
| Aria | 9BWtsMINqrJLrRacOk9x | Middle-aged female with African-American accent. Calm with hint of rasp. |
| Domi | AZnzlk1XvdvUeBnXmlld | - |
| Dave | CYw3kZ02Hs0563khs1Fj | - |
| Roger | CwhRBWXzGAHq8TQ4Fs17 | Easy going and perfect for casual conversations. |
| Fin | D38z5RcWu1voky8WS1ja | - |
| Sarah | EXAVITQu4vr4xnSDxMaL | Young adult woman with confident, warm tone. Reassuring and professional. |
| Antoni | ErXwobaYiN019PkySvjV | - |
| Laura | FGY2WhTYpPnrIDTdsKH5 | Young adult female with sunny enthusiasm and quirky attitude. |
| Thomas | GBv7mTt0atIp3Br8iCZE | Soft and subdued male voice, optimal for narrations or meditations |
| Charlie | IKne3meq5aSn9XLyUdCD | Young Australian male with confident and energetic voice. |
| George | JBFqnCBsd6RMkjVDRZzb | Warm resonance that instantly captivates listeners. |
| Emily | LcfcDJNUP1GQjkzn1xUU | - |
| Elli | MF3mGyEYCl7XYWbV9V6O | - |
| Callum | N2lVS1w4EtoT3dr4eOWO | Deceptively gravelly, yet unsettling edge. |
| Patrick | ODq5zmih8GrVes37Dizd | - |
| River | SAz9YHcvj6GT2YYXdXww | Relaxed, neutral voice ready for narrations or conversational projects. |
| Harry | SOYHLrjzK2X1ezoPC6cr | An animated warrior ready to charge forward. |
| Liam | TX3LPaxmHKxFdv7VOQHJ | Young adult with energy and warmth - suitable for reels and shorts. |
| Dorothy | ThT5KcBeYPX3keUQqHPh | - |
| Josh | TxGEqnHWrfWFTfGW9XjX | - |
| Arnold | VR6AewLTigWG4xSOukaG | - |
| Charlotte | XB0fDUnXU5powFXDhCwa | Sensual and raspy, ready to voice your temptress in video games. |
| Alice | Xb7hH8MSUJpSbSDYk0k2 | Clear and engaging British woman, suitable for e-learning. |
| Matilda | XrExE9yKIg1WjnnlVkGX | Professional woman with pleasing alto pitch. Suitable for many use cases. |
| James | ZQe5CZNOzWyzPSCn5a3c | - |
| Joseph | Zlb1dXrM653N07WRdFW3 | - |
| Will | bIHbv24MWmeRgasZH58o | Conversational and laid back. |
| Jeremy | bVMeCyTHy58xNoL34h3p | - |
| Jessica | cgSgspJ2msm6clMCkdW9 | Young and playful American female, perfect for trendy content. |
| Eric | cjVigY5qzO86Huf0OWal | Smooth tenor pitch from man in his 40s - perfect for agentic use cases. |
| Michael | flq6f7yk4E4fJM5XTYuZ | - |
| Ethan | g5CIjZEefAph4nQFvHAz | - |
| Chris | iP95p4xoKVk53GoZ742B | Natural and real, down-to-earth voice great across many use-cases. |
| Gigi | jBpfuIE2acCO8z3wKNLl | - |
| Freya | jsCqWAovK2LkecY7zXl4 | - |
| Brian | nPczCjzI2devNBz1zQrb | Middle-aged man with resonant and comforting tone. Great for narrations. |
| Grace | oWAxZDx7w5VEj9dCyTzz | - |
| Daniel | onwK4e9ZLuTAKqWW03F9 | Strong voice perfect for professional broadcast or news story. |
| Lily | pFZP5JQG7iQjIQuC4Bku | Velvety British female voice delivers news with warmth and clarity. |
| Serena | pMsXgVXv3BLzUgSXRplE | - |
| Adam | pNInz6obpgDQGcFmaJgB | - |
| Nicole | piTKgcLEGmPE4e6mEKli | - |
| Bill | pqHfZKP75CvOlQylNhV4 | Friendly and comforting voice ready to narrate your stories. |
| Jessie | t0jbNlBVZ17f02VDIeMI | - |
| Sam | yoZ06aMxZJJ28mfd3POQ | - |
| Glinda | z9fAnlkpzviPz146aGWa | - |
| Giovanni | zcAOhNBS3c14rBihAFp1 | - |
| Mimi | zrHiDhphv9ZnVXBqCLjz | - |
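Because the voice field takes an opaque ID rather than a name, it can help to keep a name-to-ID mapping next to the code that builds actions. The sketch below copies a few rows from the table; `ELEVENLABS_VOICE_IDS` and `elevenlabs_speak` are hypothetical names used only for illustration. When no name is passed, the voice key is left out entirely, so the service falls back to its default voice as described above.

```python
# A few entries copied from the "Available Voices" table.
ELEVENLABS_VOICE_IDS = {
    "Rachel": "21m00Tcm4TlvDq8ikWAM",
    "Sarah": "EXAVITQu4vr4xnSDxMaL",
    "George": "JBFqnCBsd6RMkjVDRZzb",
    "Brian": "nPczCjzI2devNBz1zQrb",
}


def elevenlabs_speak(session_id, text, voice_name=None):
    """Hypothetical helper: build a speak action for ElevenLabs.

    If voice_name is None, the "voice" key is omitted and the
    service picks the first available voice.
    """
    tts = {"provider": "eleven_labs"}
    if voice_name is not None:
        if voice_name not in ELEVENLABS_VOICE_IDS:
            raise ValueError(f"unknown voice name: {voice_name!r}")
        tts["voice"] = ELEVENLABS_VOICE_IDS[voice_name]
    return {
        "type": "speak",
        "session_id": session_id,
        "text": text,
        "tts": tts,
    }


with_voice = elevenlabs_speak("session-123", "Hello!", "Rachel")
default_voice = elevenlabs_speak("session-123", "Hello!")
```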
Choosing a Provider
Use Azure when:
- You need many languages (140+)
- You want consistent quality
- You need regional accents
- Budget is a concern
Use ElevenLabs when:
- You need the most natural voices
- Conversational quality is critical
- You're working with English/European languages
- You want distinct personalities
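The two checklists above can be encoded as a simple scoring rule if the provider has to be chosen in code. This is only a sketch of one possible policy: the function name, the flag names, and the tie-break in favor of Azure are assumptions made for illustration, not behavior of the service.

```python
def choose_provider(
    *,
    wide_language_coverage=False,
    regional_accents=False,
    budget_sensitive=False,
    most_natural_voice=False,
    distinct_personality=False,
):
    """Hypothetical helper: return "azure" or "eleven_labs".

    Each flag mirrors one bullet from the checklists. The provider
    with more matching criteria wins; ties go to Azure (an arbitrary
    choice for this sketch).
    """
    azure_score = sum(
        [wide_language_coverage, regional_accents, budget_sensitive]
    )
    elevenlabs_score = sum([most_natural_voice, distinct_personality])
    if elevenlabs_score > azure_score:
        return "eleven_labs"
    return "azure"


provider = choose_provider(most_natural_voice=True, distinct_personality=True)
tts = {"provider": provider}
```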
Examples
Python
python
# Placeholder; use the ID of the active session.
session_id = 'session-123'

# Azure voice
azure_action = {
    'type': 'speak',
    'session_id': session_id,
    'text': 'Hello!',
    'tts': {
        'provider': 'azure',
        'language': 'en-US',
        'voice': 'en-US-JennyNeural'
    }
}

# ElevenLabs voice
elevenlabs_action = {
    'type': 'speak',
    'session_id': session_id,
    'text': 'Hello!',
    'tts': {
        'provider': 'eleven_labs',
        'voice': '21m00Tcm4TlvDq8ikWAM'  # Rachel
    }
}
Next Steps
- Speak Action - How to use TTS
- Barge-In Configuration - Control interruptions