Fonada V1

System voices

Fonada V1 is our standard Text-to-Speech model, offering curated, high-quality system voices in Hindi, Tamil, Telugu, and English. It is well suited to customer support automation, educational content, accessibility features, and interactive voice applications.

Overview

Fonada V1 turns text into lifelike audio with natural intonation and clear pronunciation using pre-built catalog voices. Send a JSON request to the TTS endpoint with a voice name and a language label; no setup or training is required. You can try it in the TTS playground by selecting the Fonada V1 model.

  • Generate natural media campaigns & ads
  • Stream real-time audio from text

Need voice cloning?

To synthesize speech with community / cloned voices from Voice Arena, see the Klone V2 Pro documentation.

Listen to a sample:

Sample audio: Hindi female voice (0:15)
"क्या आपने कभी सोचा है कि अगर हर ज़रूरी जानकारी, मौसम से लेकर ताज़ा खबरों तक, सिर्फ़ एक कमांड पर मिल जाए, तो आपका रोज़मर्रा का जीवन कितना आसान और मज़ेदार हो सकता है?"

Explore our voice library to find the perfect voice for your project.

API Usage

Generate high-quality speech from text using pre-built system voices. Send JSON to the TTS endpoint with a voice name and language label.

Character Limit

Maximum 450 characters per request. For longer content, split your text into multiple API calls.
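
For longer content, the splitting step can be sketched as below. This is a minimal sketch, not an official helper: the function name split_text is hypothetical, and it assumes the 450-character limit is counted the way Python's len() counts characters (Unicode code points), which this page does not specify. It keeps sentences together where possible, falls back to word boundaries, and hard-cuts only a single word that is longer than the limit.

```python
import re

MAX_CHARS = 450  # per-request limit stated in this section


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Assumption: the API counts characters like len() does (code points).
    """
    chunks: list[str] = []
    current = ""
    # Break after ., !, ? or the Devanagari danda, followed by whitespace.
    for sentence in re.split(r"(?<=[.!?।])\s+", text.strip()):
        if not sentence:
            continue
        # Keep a sentence whole if it fits; otherwise fall back to its words.
        pieces = [sentence] if len(sentence) <= limit else sentence.split()
        for piece in pieces:
            # A single word longer than the limit has to be hard-cut.
            while len(piece) > limit:
                if current:
                    chunks.append(current)
                    current = ""
                chunks.append(piece[:limit])
                piece = piece[limit:]
            candidate = f"{current} {piece}" if current else piece
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the input of its own request, and the resulting audio files played or joined in order.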

Generate Audio - Basic Request

curl
curl -X POST "https://api.fonada.ai/tts/generate-audio-large" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"input": "क्या आपने कभी सोचा है कि अगर हर ज़रूरी जानकारी, मौसम से लेकर ताज़ा खबरों तक, सिर्फ़ एक कमांड पर मिल जाए, तो आपका रोज़मर्रा का जीवन कितना आसान और मज़ेदार हो सकता है?", "voice": "Dhruv", "language": "Hindi"}' \
  --output output_large.mp3

Request Parameters

Parameter | Type   | Required | Description
input     | string | Yes      | The text you want to convert to speech
voice     | string | Yes      | Voice name to use for synthesis
language  | string | Yes      | Language for speech synthesis ("Hindi", "Tamil", "Telugu", "English")

The endpoint returns the MP3 audio file as binary data with the following headers:

Content-Type: audio/mpeg
Content-Length: [file_size]

With curl, use --output filename.mp3 to save the audio directly to a file.

API Endpoint

Base URL: https://api.fonada.ai
Endpoint: /tts/generate-audio-large
Method: POST
Content-Type: application/json
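
The same request can be made from Python. The following is a minimal sketch using only the standard library; the function names build_request and synthesize are hypothetical, and the Content-Type check simply mirrors the response header documented above. The shape of error responses is not described on this page, so the sketch relies on urllib raising an HTTPError for non-2xx statuses.

```python
import json
import urllib.request

API_URL = "https://api.fonada.ai/tts/generate-audio-large"


def build_request(api_key: str, text: str, voice: str, language: str) -> urllib.request.Request:
    """Build (but do not send) the POST request described in this section."""
    body = json.dumps(
        {"input": text, "voice": voice, "language": language},
        ensure_ascii=False,  # keep Hindi/Tamil/Telugu text readable in the body
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def synthesize(api_key: str, text: str, voice: str, language: str,
               out_path: str = "output.mp3", timeout: float = 30.0) -> str:
    """Send the request and write the returned MP3 bytes to out_path."""
    request = build_request(api_key, text, voice, language)
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        audio = response.read()
    # Assumption: a successful response always carries audio/mpeg, as documented.
    if not content_type.startswith("audio/mpeg"):
        raise RuntimeError(f"Unexpected Content-Type: {content_type!r}")
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

A call such as synthesize("YOUR_API_KEY", "नमस्ते", "Dhruv", "Hindi") would then write output.mp3 to the current directory.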

WebSocket Usage

Generate real-time streaming audio from text using our WebSocket endpoint for low-latency applications.

WebSocket Endpoint

Production: wss://api.fonada.ai/tts/generate-audio-ws

WebSocket Request Parameters

Parameter | Type   | Required | Description
api_key   | string | Yes      | Your Fonadalabs API key for authentication
input     | string | Yes      | The text you want to convert to speech
voice     | string | Yes      | Voice name to use for synthesis (choose from 111+ available voices for Hindi, Tamil, Telugu, and English)
language  | string | Yes      | Language for speech synthesis ("Hindi", "Tamil", "Telugu", "English")

WebSocket Response Format

The WebSocket streams audio data in real-time chunks:

Data Type: MP3 format
Streaming: Audio data is sent progressively as it's generated
Sample Rate: 24kHz
Channels: Mono

Each message event contains a chunk of the audio file that can be played immediately or buffered for smoother playback.
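
A client for this stream might look like the sketch below. It rests on several assumptions that this page does not confirm: that the request parameters are sent as a single JSON text message right after connecting, that audio chunks arrive as binary frames, and that the server closes the connection when synthesis is finished. The helper names are hypothetical, and the transport uses the third-party websockets package (pip install websockets), imported inside the function so the helpers work without it.

```python
import json

WS_URL = "wss://api.fonada.ai/tts/generate-audio-ws"


def build_ws_message(api_key: str, text: str, voice: str, language: str) -> str:
    """Serialize the documented WebSocket parameters as one JSON message.

    Assumption: the server expects the parameters in this form.
    """
    return json.dumps(
        {"api_key": api_key, "input": text, "voice": voice, "language": language},
        ensure_ascii=False,
    )


def append_chunk(buffer: bytearray, message) -> int:
    """Append a binary frame to the buffer and return the bytes added.

    Text frames (for example status messages, if any) are ignored.
    """
    if isinstance(message, (bytes, bytearray)):
        buffer.extend(message)
        return len(message)
    return 0


async def stream_tts(api_key: str, text: str, voice: str, language: str) -> bytes:
    """Collect the streamed MP3 chunks into a single bytes object."""
    import websockets  # third-party dependency, not part of the standard library

    audio = bytearray()
    async with websockets.connect(WS_URL) as ws:
        await ws.send(build_ws_message(api_key, text, voice, language))
        # Iteration ends when the server closes the connection (assumed).
        async for message in ws:
            append_chunk(audio, message)
    return bytes(audio)
```

The coroutine can be run with asyncio.run(stream_tts("YOUR_API_KEY", "Hello, welcome", "Dhruv", "English")). A real-time player would hand each chunk to an audio decoder as it arrives instead of buffering the whole stream.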

WebSocket Benefits

  • Real-time streaming: Audio starts playing while text is still being processed
  • Lower latency: Perfect for conversational AI and live applications
  • Efficient bandwidth: Progressive audio delivery reduces waiting time
  • Better UX: Users hear audio output immediately, improving perceived performance

Available Voices

Choose from our extensive collection of 111+ high-quality voices optimized for Hindi, Tamil, Telugu, and English languages.

Note: In your API requests, use the voice name exactly as it is shown in the voice list, e.g., "voice": "Dhruv".

Hindi Voices (18)

Tamil Voices (15)

Telugu Voices (60)

English Voices (18)

Using Different Voices

For Hindi Content:
{"input": "नमस्ते", "voice": "Dhruv", "language": "Hindi"}
For Tamil Content:
{"input": "வணக்கம்", "voice": "Vaani", "language": "Tamil"}
For Telugu Content:
{"input": "నమస్కారం", "voice": "Naadamu", "language": "Telugu"}
For English Content:
{"input": "Hello, welcome", "voice": "Dhruv", "language": "English"}
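
Request bodies like the ones above can be checked on the client before they are sent. The sketch below is illustrative only: make_payload is a hypothetical helper, the language labels are the four listed on this page, and the length check assumes the 450-character limit is counted as Python's len() counts characters. It does not verify that a voice name exists in the catalog.

```python
SUPPORTED_LANGUAGES = {"Hindi", "Tamil", "Telugu", "English"}
MAX_CHARS = 450  # per-request character limit from this page


def make_payload(text: str, voice: str, language: str) -> dict:
    """Validate the three request fields and return the JSON-ready dict."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language label: {language!r}")
    if not voice:
        raise ValueError("A voice name is required")
    if not text:
        raise ValueError("Input text must not be empty")
    if len(text) > MAX_CHARS:
        raise ValueError(f"Input is {len(text)} characters; the limit is {MAX_CHARS}")
    return {"input": text, "voice": voice, "language": language}
```

Failing early on an unsupported label or an over-long input avoids a round trip to the API for a request that cannot succeed.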

Voice quality

Fonada V1 provides high-quality, natural-sounding speech synthesis with support for Hindi, Tamil, Telugu, and English languages, optimized for real-time conversational applications.

Fonada V1

Standard text-to-speech with system voices: pre-built catalog voices in Hindi, Tamil, Telugu, and English, exposed through a JSON REST API with a limit of 450 characters per request.

API endpoint: /tts/generate-audio-large
Languages: 4 (Hindi 18, Tamil 15, Telugu 60, English 18)
Voice param: voice name + language label

Supported formats

The default response format is .mp3.

MP3
Sample rate: 24kHz
Channels: Mono

Supported languages

Fonada V1 currently supports 4 languages:

Hindi, Tamil, Telugu, English

Note: We support code-mixing between English and Hindi, which is commonly used in Indian conversations.

Text Input

Fonada V1 converts plain text into natural-sounding speech. Simply provide your text input, and the model will generate high-quality audio output with natural intonation and pronunciation.

Current Capabilities

  • Natural speech synthesis from plain text
  • Support for English, Hindi, Tamil, and Telugu languages
  • High-quality audio output with clear pronunciation
  • Optimized for conversational applications

Coming Soon: Future versions will include advanced features like emotional context, voice modulation, and custom voice styles. Stay tuned for updates!

FAQ

  • Format: MP3 (default output)
  • Sample Rate: 24kHz for clear, natural speech
  • Quality: Natural prosody with crystal-clear pronunciation
  • Languages: Hindi, Tamil, Telugu, and English
