
TTS Providers

Configure text-to-speech providers for different voices and languages.

Overview

The AI Flow service supports multiple TTS providers. Select the provider for each action in that action's tts field.

Supported Providers

  • Azure Cognitive Services - 400+ voices in 140+ languages
  • ElevenLabs - Ultra-realistic conversational voices

Azure Cognitive Services

Configuration

json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Hello!",
  "tts": {
    "provider": "azure",
    "language": "en-US",
    "voice": "en-US-JennyNeural"
  }
}

| Language | Voice Name | Gender | Description |
|----------|------------|--------|-------------|
| en-US | en-US-JennyNeural | Female | Friendly, professional |
| en-US | en-US-GuyNeural | Male | Clear, neutral |
| en-GB | en-GB-SoniaNeural | Female | British, professional |
| en-GB | en-GB-RyanNeural | Male | British, friendly |
| de-DE | de-DE-KatjaNeural | Female | Professional, clear |
| de-DE | de-DE-ConradNeural | Male | Deep, authoritative |

Full Voice List: See Azure TTS documentation
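
Every Azure voice name listed above begins with its language code (for example, en-US-JennyNeural belongs to en-US), so the language field can be derived from the voice name rather than typed twice. The following is a minimal sketch under that assumption. The helper name `azure_tts` is hypothetical and not part of the AI Flow service, and the two-part locale split may not hold for every voice in the full Azure list.

```python
def azure_tts(voice: str) -> dict:
    """Build a tts block for Azure, deriving the language from the voice name.

    Assumes names of the form <lang>-<REGION>-<Name>, as in the voices
    listed above. Hypothetical helper, not part of the service itself.
    """
    parts = voice.split("-")
    if len(parts) < 3:
        raise ValueError(f"unexpected Azure voice name: {voice!r}")
    language = "-".join(parts[:2])  # e.g. "en-US"
    return {"provider": "azure", "language": language, "voice": voice}


tts = azure_tts("de-DE-KatjaNeural")
print(tts["language"])
```

The returned dictionary can be placed directly in the tts field of a speak action.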

ElevenLabs

Configuration

json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Hello!",
  "tts": {
    "provider": "eleven_labs",
    "voice": "21m00Tcm4TlvDq8ikWAM"
  }
}

Voice IDs

The voice field is optional and accepts the ElevenLabs voice ID as a string. For example, "21m00Tcm4TlvDq8ikWAM" for "Rachel". If omitted, the first available voice will be used.

Minimal Configuration (uses default voice):

json
{
  "type": "speak",
  "session_id": "session-123",
  "text": "Hello!",
  "tts": {
    "provider": "eleven_labs"
  }
}
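
Because the voice field is optional, a small builder can include it only when a voice ID is supplied. This is a sketch, not an official client: the function name `elevenlabs_speak` is an assumption made for illustration, while the field names come from the JSON examples above.

```python
import json
from typing import Optional


def elevenlabs_speak(session_id: str, text: str,
                     voice_id: Optional[str] = None) -> dict:
    """Build a speak action for ElevenLabs (hypothetical helper).

    The voice key is left out when no voice_id is given, so the
    service falls back to its default voice.
    """
    tts = {"provider": "eleven_labs"}
    if voice_id is not None:
        tts["voice"] = voice_id
    return {
        "type": "speak",
        "session_id": session_id,
        "text": text,
        "tts": tts,
    }


# Minimal configuration: no voice key in the resulting JSON
print(json.dumps(elevenlabs_speak("session-123", "Hello!"), indent=2))
```

Passing `voice_id="21m00Tcm4TlvDq8ikWAM"` produces the full configuration shown earlier in this section.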

Available Voices

| Voice Name | ID | Description |
|------------|----|-------------|
| Rachel | 21m00Tcm4TlvDq8ikWAM | Matter-of-fact, personable woman. Great for conversational use cases. |
| Drew | 29vD33N1CtxCmqQRPOHJ | - |
| Clyde | 2EiwWnXFnvU5JabPnv8n | Great for character use-cases |
| Paul | 5Q0t7uMcjvnagumLfvZi | - |
| Aria | 9BWtsMINqrJLrRacOk9x | Middle-aged female with African-American accent. Calm with hint of rasp. |
| Domi | AZnzlk1XvdvUeBnXmlld | - |
| Dave | CYw3kZ02Hs0563khs1Fj | - |
| Roger | CwhRBWXzGAHq8TQ4Fs17 | Easy going and perfect for casual conversations. |
| Fin | D38z5RcWu1voky8WS1ja | - |
| Sarah | EXAVITQu4vr4xnSDxMaL | Young adult woman with confident, warm tone. Reassuring and professional. |
| Antoni | ErXwobaYiN019PkySvjV | - |
| Laura | FGY2WhTYpPnrIDTdsKH5 | Young adult female with sunny enthusiasm and quirky attitude. |
| Thomas | GBv7mTt0atIp3Br8iCZE | Soft and subdued male voice, optimal for narrations or meditations |
| Charlie | IKne3meq5aSn9XLyUdCD | Young Australian male with confident and energetic voice. |
| George | JBFqnCBsd6RMkjVDRZzb | Warm resonance that instantly captivates listeners. |
| Emily | LcfcDJNUP1GQjkzn1xUU | - |
| Elli | MF3mGyEYCl7XYWbV9V6O | - |
| Callum | N2lVS1w4EtoT3dr4eOWO | Deceptively gravelly, yet unsettling edge. |
| Patrick | ODq5zmih8GrVes37Dizd | - |
| River | SAz9YHcvj6GT2YYXdXww | Relaxed, neutral voice ready for narrations or conversational projects. |
| Harry | SOYHLrjzK2X1ezoPC6cr | An animated warrior ready to charge forward. |
| Liam | TX3LPaxmHKxFdv7VOQHJ | Young adult with energy and warmth - suitable for reels and shorts. |
| Dorothy | ThT5KcBeYPX3keUQqHPh | - |
| Josh | TxGEqnHWrfWFTfGW9XjX | - |
| Arnold | VR6AewLTigWG4xSOukaG | - |
| Charlotte | XB0fDUnXU5powFXDhCwa | Sensual and raspy, ready to voice your temptress in video games. |
| Alice | Xb7hH8MSUJpSbSDYk0k2 | Clear and engaging British woman, suitable for e-learning. |
| Matilda | XrExE9yKIg1WjnnlVkGX | Professional woman with pleasing alto pitch. Suitable for many use cases. |
| James | ZQe5CZNOzWyzPSCn5a3c | - |
| Joseph | Zlb1dXrM653N07WRdFW3 | - |
| Will | bIHbv24MWmeRgasZH58o | Conversational and laid back. |
| Jeremy | bVMeCyTHy58xNoL34h3p | - |
| Jessica | cgSgspJ2msm6clMCkdW9 | Young and playful American female, perfect for trendy content. |
| Eric | cjVigY5qzO86Huf0OWal | Smooth tenor pitch from man in his 40s - perfect for agentic use cases. |
| Michael | flq6f7yk4E4fJM5XTYuZ | - |
| Ethan | g5CIjZEefAph4nQFvHAz | - |
| Chris | iP95p4xoKVk53GoZ742B | Natural and real, down-to-earth voice great across many use-cases. |
| Gigi | jBpfuIE2acCO8z3wKNLl | - |
| Freya | jsCqWAovK2LkecY7zXl4 | - |
| Brian | nPczCjzI2devNBz1zQrb | Middle-aged man with resonant and comforting tone. Great for narrations. |
| Grace | oWAxZDx7w5VEj9dCyTzz | - |
| Daniel | onwK4e9ZLuTAKqWW03F9 | Strong voice perfect for professional broadcast or news story. |
| Lily | pFZP5JQG7iQjIQuC4Bku | Velvety British female voice delivers news with warmth and clarity. |
| Serena | pMsXgVXv3BLzUgSXRplE | - |
| Adam | pNInz6obpgDQGcFmaJgB | - |
| Nicole | piTKgcLEGmPE4e6mEKli | - |
| Bill | pqHfZKP75CvOlQylNhV4 | Friendly and comforting voice ready to narrate your stories. |
| Jessie | t0jbNlBVZ17f02VDIeMI | - |
| Sam | yoZ06aMxZJJ28mfd3POQ | - |
| Glinda | z9fAnlkpzviPz146aGWa | - |
| Giovanni | zcAOhNBS3c14rBihAFp1 | - |
| Mimi | zrHiDhphv9ZnVXBqCLjz | - |
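
Since the voice field takes the opaque ID rather than the voice name, it can help to keep a small name-to-ID mapping in client code. The sketch below copies a handful of entries from the table above. The names `ELEVENLABS_VOICES` and `voice_id_for` are illustrative assumptions, not part of any SDK.

```python
# A few name -> ID pairs copied from the table above; extend as needed.
ELEVENLABS_VOICES = {
    "rachel": "21m00Tcm4TlvDq8ikWAM",
    "aria": "9BWtsMINqrJLrRacOk9x",
    "roger": "CwhRBWXzGAHq8TQ4Fs17",
    "sarah": "EXAVITQu4vr4xnSDxMaL",
    "alice": "Xb7hH8MSUJpSbSDYk0k2",
}


def voice_id_for(name: str) -> str:
    """Return the ElevenLabs voice ID for a voice name (case-insensitive)."""
    try:
        return ELEVENLABS_VOICES[name.strip().lower()]
    except KeyError:
        raise KeyError(f"unknown ElevenLabs voice name: {name!r}") from None


tts = {"provider": "eleven_labs", "voice": voice_id_for("Rachel")}
print(tts)
```

Raising on an unknown name, rather than silently omitting the voice, avoids an unintended fallback to the default voice.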

Choosing a Provider

Use Azure when:

  • You need many languages (140+)
  • You want consistent quality
  • You need regional accents
  • Budget is a concern

Use ElevenLabs when:

  • You need the most natural voices
  • Conversational quality is critical
  • You're working with English/European languages
  • You want distinct personalities

Examples

Python

python
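# Placeholder session ID; replace with the ID of your active session
session_id = 'session-123'

# Azure voice: provider, language, and voice name are all specified
azure_action = {
    'type': 'speak',
    'session_id': session_id,
    'text': 'Hello!',
    'tts': {
        'provider': 'azure',
        'language': 'en-US',
        'voice': 'en-US-JennyNeural'
    }
}

# ElevenLabs voice: the voice is given as an ElevenLabs voice ID
elevenlabs_action = {
    'type': 'speak',
    'session_id': session_id,
    'text': 'Hello!',
    'tts': {
        'provider': 'eleven_labs',
        'voice': '21m00Tcm4TlvDq8ikWAM'  # Rachel
    }
}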

Next Steps