API Documentation v1.0
Hindi Text-to-Speech API with Real-Time Streaming • hindi.heypixa.ai
Luna TTS provides high-quality Hindi speech synthesis with support for multiple dialects, voice customization, and emotion expression. Audio is streamed in real-time via Server-Sent Events (SSE) for low-latency playback.
11 Hindi dialects
Voice customization
Emotion tags
Real-time streaming
Low latency (~300ms)
32kHz audio
Quick Start
import base64, json, requests, wave

# Make streaming request
response = requests.post(
    "https://hindi.heypixa.ai/api/v1/synthesize",
    json={
        "text": "<calm>नमस्ते, आप कैसे हैं?",
        "dialect": "hindi_delhi",
        "age": "middle-aged"
    },
    stream=True
)

# Collect audio chunks
audio_data = b""
for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: '):
            data = line[6:]
            if data == '[DONE]':
                break
            obj = json.loads(data)
            if 'audio' in obj:
                audio_data += base64.b64decode(obj['audio'])

# Save to WAV
with wave.open("output.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(32000)
    wav.writeframes(audio_data)
const response = await fetch('https://hindi.heypixa.ai/api/v1/synthesize', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: '<calm>नमस्ते, आप कैसे हैं?',
    dialect: 'hindi_delhi'
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
let finished = false;

while (!finished) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // SSE events are separated by a blank line. Consume one event at a
  // time and keep everything after it in the buffer.
  let boundary;
  while ((boundary = buffer.indexOf('\n\n')) !== -1) {
    const event = buffer.slice(0, boundary);
    buffer = buffer.slice(boundary + 2);
    if (!event.startsWith('data: ')) continue;
    const data = event.slice(6);
    if (data === '[DONE]') {
      finished = true;
      break;
    }
    const obj = JSON.parse(data);
    if (obj.audio) {
      // Decode and play audio chunk
    }
  }
}
curl -X POST "https://hindi.heypixa.ai/api/v1/synthesize" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "<calm>नमस्ते, आप कैसे हैं?",
    "dialect": "hindi_delhi",
    "age": "middle-aged"
  }'
API Endpoints
POST
/api/v1/synthesize
Stream audio synthesis
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Required | Hindi text to synthesize (max 5000 chars) |
| dialect | string | Optional | Hindi dialect. Default: hindi_delhi |
| age | string | Optional | Speaker age. Default: middle-aged |
| temperature | float | Optional | Sampling temperature (0.0-2.0). Default: 0.9 |
| top_p | float | Optional | Top-p sampling (0.0-1.0). Default: 0.95 |
| repetition_penalty | float | Optional | Repetition penalty (1.0-2.0). Default: 1.3 |
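The limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_synthesize_payload` is a hypothetical helper name, the defaults and ranges are copied from the table, and it assumes "max 5000 chars" counts Unicode code points. How the server itself treats out-of-range values is not stated here, so the sketch simply raises.

```python
def build_synthesize_payload(text, dialect="hindi_delhi", age="middle-aged",
                             temperature=0.9, top_p=0.95,
                             repetition_penalty=1.3):
    """Build a JSON body for POST /api/v1/synthesize.

    Hypothetical client-side helper; the server's own validation may differ.
    """
    if not text:
        raise ValueError("text is required")
    # Assumption: the 5000-character limit counts code points, as len() does.
    if len(text) > 5000:
        raise ValueError("text exceeds 5000 characters")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if not 1.0 <= repetition_penalty <= 2.0:
        raise ValueError("repetition_penalty must be in [1.0, 2.0]")
    return {
        "text": text,
        "dialect": dialect,
        "age": age,
        "temperature": temperature,
        "top_p": top_p,
        "repetition_penalty": repetition_penalty,
    }
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`, as in the Quick Start.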
Response (SSE Stream)
200
Audio chunk
{
"audio": "<base64-encoded-pcm16>",
"sample_rate": 32000
}
Audio Format: PCM16 (16-bit signed integer), 32kHz sample rate, mono channel, little-endian.
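To make the stream and audio format concrete, here is a small sketch that turns one SSE line into PCM bytes, unpacks the samples, and computes playback duration. The function names are illustrative, not part of the API. The `[DONE]` sentinel is taken from the Quick Start, and the sketch assumes each chunk contains whole 16-bit samples (a trailing odd byte is ignored when unpacking).

```python
import base64
import json
import struct

SAMPLE_RATE = 32000   # Hz, per the Audio Format note
BYTES_PER_SAMPLE = 2  # PCM16, mono


def parse_sse_line(line):
    """Return PCM bytes for an audio event, or None for any other line.

    `line` is one decoded SSE line, e.g. 'data: {"audio": "..."}'.
    """
    if not line.startswith("data: "):
        return None
    data = line[len("data: "):]
    if data == "[DONE]":
        return None
    obj = json.loads(data)
    if "audio" not in obj:
        return None
    return base64.b64decode(obj["audio"])


def pcm16_to_samples(pcm):
    """Unpack little-endian signed 16-bit mono PCM into a tuple of ints."""
    count = len(pcm) // BYTES_PER_SAMPLE
    return struct.unpack("<%dh" % count, pcm[:count * BYTES_PER_SAMPLE])


def duration_seconds(pcm):
    """Playback length of a PCM16 mono buffer at 32 kHz."""
    return len(pcm) / (BYTES_PER_SAMPLE * SAMPLE_RATE)
```

At this format, one second of audio is 64,000 bytes, which gives a quick sanity check on the size of a collected buffer.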
GET
/api/v1/config
Get configuration options
Response
{
"dialects": [{ "value": "hindi_delhi", "label": "Delhi" }, ...],
"ages": [{ "value": "child", "label": "Child" }, ...],
"sample_rate": 32000,
"sampling_defaults": { "temperature": 0.9, ... }
}
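One plausible use of this response is to validate a user's choice before calling the synthesis endpoint. The sketch below assumes only the shape shown above (lists of objects with a `value` field); `option_values` and `check_choice` are hypothetical helper names.

```python
def option_values(config, key):
    """Return the `value` field of every option listed under `key`."""
    return [item["value"] for item in config.get(key, [])]


def check_choice(config, key, choice):
    """Return `choice` if the config lists it under `key`, else raise."""
    allowed = option_values(config, key)
    if choice not in allowed:
        raise ValueError("%r is not one of %s" % (choice, allowed))
    return choice
```

In practice `config` would come from something like `requests.get("https://hindi.heypixa.ai/api/v1/config").json()`, after which `check_choice(config, "dialects", "hindi_delhi")` returns the value unchanged.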
GET
/api/v1/health
Health check
Response
{
"status": "healthy",
"timestamp": "2024-01-15T10:30:00Z",
"backend_status": "healthy"
}
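A readiness probe could interpret this response as follows. This is a sketch under an assumption: the only status value documented here is "healthy", so anything else (or a missing field) is treated as unhealthy. `is_healthy` is a hypothetical name.

```python
def is_healthy(payload):
    """True only if both the API and its backend report "healthy"."""
    return (payload.get("status") == "healthy"
            and payload.get("backend_status") == "healthy")
```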
WS
/api/v1/ws/synthesize
WebSocket for streaming text input & audio output
Protocol
Connect via WebSocket, send JSON messages for configuration and text, and receive audio as binary frames.
1. Send Configuration (optional)
{
"type": "config",
"dialect": "hindi_delhi",
"age": "middle-aged",
"temperature": 0.9,
"top_p": 0.95,
"repetition_penalty": 1.3
}
2. Stream Text Chunks
{
"type": "text",
"content": "नमस्ते, ",
"is_final": false
}
// Send multiple chunks; set is_final to true on the last one
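The chunking rule can be sketched as a small generator that marks only the last message as final. `text_messages` is an illustrative name, not part of the API; it produces the JSON strings to pass to the WebSocket's send call.

```python
import json


def text_messages(chunks):
    """Yield JSON "text" messages; only the last one has is_final true."""
    chunks = list(chunks)
    last = len(chunks) - 1
    for i, chunk in enumerate(chunks):
        yield json.dumps({
            "type": "text",
            "content": chunk,
            "is_final": i == last,
        }, ensure_ascii=False)
```

Materializing the input with `list()` lets the helper accept any iterable while still knowing which chunk is last.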
3. Receive Audio
Binary frames: Raw PCM16 audio (32kHz, mono, little-endian)
Text frames: Status/error messages as JSON
// Status messages:
{ "type": "status", "message": "synthesizing" }
{ "type": "done", "total_audio_bytes": 123456 }
{ "type": "error", "message": "..." }
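A receive loop has to tell binary audio frames apart from JSON text frames. The following sketch shows one way to dispatch a single frame; `handle_frame` is a hypothetical helper, and raising on an "error" message is a design choice of this sketch rather than documented behavior.

```python
import json


def handle_frame(msg, audio):
    """Process one WebSocket frame; return True once synthesis is done.

    `audio` is a bytearray that accumulates PCM from binary frames.
    """
    if isinstance(msg, bytes):
        audio.extend(msg)  # raw PCM16 audio
        return False
    data = json.loads(msg)  # status, done, or error message
    kind = data.get("type")
    if kind == "error":
        raise RuntimeError(data.get("message", "synthesis failed"))
    return kind == "done"
```

Inside an `async for msg in ws:` loop, this would be used as `if handle_frame(msg, audio): break`.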
Python Example
import asyncio
import json
import wave

import websockets

async def stream_tts():
    uri = "wss://hindi.heypixa.ai/api/v1/ws/synthesize"
    async with websockets.connect(uri) as ws:
        # Send config
        await ws.send(json.dumps({
            "type": "config",
            "dialect": "hindi_delhi"
        }))

        # Stream text chunks
        chunks = ["नमस्ते, ", "आप कैसे ", "हैं?"]
        for i, chunk in enumerate(chunks):
            await ws.send(json.dumps({
                "type": "text",
                "content": chunk,
                "is_final": i == len(chunks) - 1
            }))

        # Receive audio: binary frames are PCM, text frames are JSON
        audio = b""
        async for msg in ws:
            if isinstance(msg, bytes):
                audio += msg
            else:
                data = json.loads(msg)
                if data.get("type") == "done":
                    break
                if data.get("type") == "error":
                    # Stop instead of waiting for audio that will not arrive
                    raise RuntimeError(data.get("message"))

    # Save audio (PCM16, 32kHz, mono)
    with wave.open("output.wav", "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(32000)
        f.writeframes(audio)

asyncio.run(stream_tts())
WebSocket URL:
wss://hindi.heypixa.ai/api/v1/ws/synthesize (use ws:// for local testing)
Available Dialects
| Value | Label | Region |
|---|---|---|
| hindi_delhi | Delhi | National Capital Region |
| hindi_rajasthan | Rajasthan | Western India |
| hindi_himachal-pradesh | Himachal Pradesh | Northern India |
| hindi_haryana | Haryana | Northern India |
| hindi_uttarakhand | Uttarakhand | Northern India |
| hindi_punjab | Punjab | Northern India |
| hindi_bhojpuri-(eastern-up,-bihar) | Bhojpuri | Eastern UP, Bihar |
| hindi_lucknow-(uttar-pradesh) | Lucknow | Uttar Pradesh |
| hindi_south-india | South India | Southern India |
| hindi_madhya-pradesh | Madhya Pradesh | Central India |
| hindi_gujarat | Gujarat | Western India |