Luna TTS provides high-quality Hindi speech synthesis with support for multiple dialects, voice customization, and emotion expression. Audio is streamed in real time via Server-Sent Events (SSE) for low-latency playback.

11 Hindi dialects
Voice customization
Emotion tags
Real-time streaming
Low latency (~300ms)
32kHz audio

Quick Start

Python

import base64, json, requests, wave

# Make streaming request
response = requests.post(
    "https://hindi.heypixa.ai/api/v1/synthesize",
    json={
        "text": "<calm>नमस्ते, आप कैसे हैं?",
        "dialect": "hindi_delhi",
        "age": "middle-aged"
    },
    stream=True
)
response.raise_for_status()  # Fail early on HTTP errors

# Collect base64-decoded PCM16 audio chunks from the SSE stream
audio_data = bytearray()
for line in response.iter_lines():
    if not line:
        continue  # Blank lines separate SSE events
    line = line.decode('utf-8')
    if line.startswith('data: '):
        data = line[6:]
        if data == '[DONE]':
            break
        obj = json.loads(data)
        if 'audio' in obj:
            audio_data += base64.b64decode(obj['audio'])

# Save to WAV (mono, 16-bit, 32 kHz)
with wave.open("output.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(32000)
    wav.writeframes(bytes(audio_data))

JavaScript

const response = await fetch('https://hindi.heypixa.ai/api/v1/synthesize', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: '<calm>नमस्ते, आप कैसे हैं?',
    dialect: 'hindi_delhi'
  })
});
if (!response.ok) throw new Error(`HTTP ${response.status}`);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
let finished = false;

while (!finished) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });

  // SSE events are separated by a blank line; any partial event stays in the buffer
  let boundary;
  while ((boundary = buffer.indexOf('\n\n')) !== -1) {
    const event = buffer.slice(0, boundary);
    buffer = buffer.slice(boundary + 2);

    if (!event.startsWith('data: ')) continue;
    const data = event.slice(6);
    if (data === '[DONE]') {
      finished = true;
      break;
    }

    const obj = JSON.parse(data);
    if (obj.audio) {
      // Decode and play audio chunk
    }
  }
}

cURL

curl -N -X POST "https://hindi.heypixa.ai/api/v1/synthesize" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "<calm>नमस्ते, आप कैसे हैं?",
    "dialect": "hindi_delhi",
    "age": "middle-aged"
  }'

API Endpoints

POST /api/v1/synthesize Stream audio synthesis

Request Body

Parameter           Type    Required  Description
text                string  Required  Hindi text to synthesize (max 5000 chars)
dialect             string  Optional  Hindi dialect. Default: hindi_delhi
age                 string  Optional  Speaker age. Default: middle-aged
temperature         float   Optional  Sampling temperature (0.0-2.0). Default: 0.9
top_p               float   Optional  Top-p sampling (0.0-1.0). Default: 0.95
repetition_penalty  float   Optional  Repetition penalty (1.0-2.0). Default: 1.3
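
The documented limits can be checked on the client before a request is sent. The sketch below is one possible way to do that; `build_payload` is a hypothetical helper name, and it assumes the ranges are inclusive and that the 5000 limit counts characters, neither of which is confirmed here.

```python
def build_payload(text, dialect="hindi_delhi", age="middle-aged",
                  temperature=0.9, top_p=0.95, repetition_penalty=1.3):
    """Return a request body dict, raising ValueError on out-of-range values.

    Assumption: ranges are inclusive and the text limit is in characters.
    """
    if not text:
        raise ValueError("text is required")
    if len(text) > 5000:
        raise ValueError("text exceeds 5000 characters")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be within 0.0-2.0")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be within 0.0-1.0")
    if not 1.0 <= repetition_penalty <= 2.0:
        raise ValueError("repetition_penalty must be within 1.0-2.0")
    return {
        "text": text,
        "dialect": dialect,
        "age": age,
        "temperature": temperature,
        "top_p": top_p,
        "repetition_penalty": repetition_penalty,
    }
```

The returned dict can be passed as the `json=` argument of `requests.post`, as in the Quick Start.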

Response (SSE Stream)

200 Audio chunk
{
  "audio": "<base64-encoded-pcm16>",
  "sample_rate": 32000
}
Note: Audio format is PCM16 (16-bit signed integer), 32kHz sample rate, mono channel, little-endian.
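
Since each sample is 2 bytes and the stream is mono at 32kHz, one second of audio is 64000 bytes. The following sketch shows how the SSE lines might be decoded into raw PCM and how a duration could be derived from the byte count. The function names are hypothetical, and the `[DONE]` sentinel is taken from the Quick Start example rather than from a formal specification.

```python
import base64
import json


def decode_sse_audio(lines):
    """Concatenate the base64 'audio' payloads found in SSE 'data: ' lines."""
    audio = bytearray()
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        if not line.startswith("data: "):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data: "):]
        if data == "[DONE]":  # end-of-stream sentinel used in the Quick Start
            break
        obj = json.loads(data)
        if "audio" in obj:
            audio += base64.b64decode(obj["audio"])
    return bytes(audio)


def pcm16_duration_seconds(pcm, sample_rate=32000, channels=1):
    """Duration of raw PCM16 audio: bytes / (2 bytes * channels * sample rate)."""
    bytes_per_frame = 2 * channels
    return len(pcm) / (bytes_per_frame * sample_rate)
```

`decode_sse_audio` accepts any iterable of lines, so it can be fed `response.iter_lines()` from a streaming `requests` call.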

GET /api/v1/config Get configuration options

Response

{
  "dialects": [{ "value": "hindi_delhi", "label": "Delhi" }, ...],
  "ages": [{ "value": "child", "label": "Child" }, ...],
  "sample_rate": 32000,
  "sampling_defaults": { "temperature": 0.9, ... }
}
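
The `value` fields in this response are the strings accepted by the `dialect` and `age` request parameters. A small sketch of pulling them out of a parsed response follows; `option_values` is a hypothetical helper, and the sample dict in the usage note is abbreviated because the full lists are elided above.

```python
def option_values(config, key):
    """Return the 'value' strings listed under config[key] (e.g. 'dialects')."""
    return [entry["value"] for entry in config.get(key, [])]
```

For example, with `config = {"dialects": [{"value": "hindi_delhi", "label": "Delhi"}]}`, calling `option_values(config, "dialects")` yields a list containing `hindi_delhi`, which could be used to validate user input before calling the synthesize endpoint.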

GET /api/v1/health Health check

Response

{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "backend_status": "healthy"
}

WS /api/v1/ws/synthesize WebSocket for streaming text input & audio output

Protocol

Connect via WebSocket, send JSON messages for configuration and text, and receive audio as binary frames.

1. Send Configuration (optional)

{
  "type": "config",
  "dialect": "hindi_delhi",
  "age": "middle-aged",
  "temperature": 0.9,
  "top_p": 0.95,
  "repetition_penalty": 1.3
}

2. Stream Text Chunks

{
  "type": "text",
  "content": "नमस्ते, ",
  "is_final": false
}
// Send multiple chunks, set is_final: true on last chunk
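
As a rough illustration of the chunking rule, the helper below turns a list of text pieces into the JSON messages to send, marking only the last one as final. `text_messages` is a hypothetical name, and rejecting an empty list is a choice made for this sketch rather than documented server behavior.

```python
import json


def text_messages(chunks):
    """Serialize text chunks as 'text' messages; only the last has is_final=True."""
    chunks = list(chunks)
    if not chunks:
        raise ValueError("at least one text chunk is required")
    last = len(chunks) - 1
    return [
        json.dumps(
            {"type": "text", "content": chunk, "is_final": index == last},
            ensure_ascii=False,  # keep Devanagari readable in the payload
        )
        for index, chunk in enumerate(chunks)
    ]
```

Each returned string can be passed directly to the WebSocket client's send call.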

3. Receive Audio

Binary frames: Raw PCM16 audio (32kHz, mono, little-endian)

Text frames: Status/error messages as JSON

// Status messages:
{ "type": "status", "message": "synthesizing" }
{ "type": "done", "total_audio_bytes": 123456 }
{ "type": "error", "message": "..." }
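
Because binary and text frames arrive on the same connection, a client has to branch on the frame type. The sketch below shows one way to do so: `handle_frame` and `SynthesisError` are hypothetical names, and it assumes text frames always carry a JSON object with a `type` field as in the messages above.

```python
import json


class SynthesisError(RuntimeError):
    """Raised when the server reports an error message (hypothetical name)."""


def handle_frame(frame, audio):
    """Process one WebSocket frame; return True once the stream is complete.

    Binary frames are appended to `audio` (a bytearray); text frames are
    parsed as JSON status messages.
    """
    if isinstance(frame, (bytes, bytearray)):
        audio.extend(frame)
        return False
    message = json.loads(frame)
    kind = message.get("type")
    if kind == "done":
        return True
    if kind == "error":
        raise SynthesisError(message.get("message", "unknown error"))
    return False  # e.g. {"type": "status", ...}; nothing to do
```

A receive loop can call `handle_frame` for every incoming message and stop as soon as it returns `True`.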

Python Example

import asyncio
import json
import wave

import websockets

async def stream_tts():
    uri = "wss://hindi.heypixa.ai/api/v1/ws/synthesize"
    async with websockets.connect(uri) as ws:
        # Send config (optional)
        await ws.send(json.dumps({
            "type": "config",
            "dialect": "hindi_delhi"
        }))

        # Stream text chunks; mark the last one with is_final
        chunks = ["नमस्ते, ", "आप कैसे ", "हैं?"]
        for i, chunk in enumerate(chunks):
            await ws.send(json.dumps({
                "type": "text",
                "content": chunk,
                "is_final": i == len(chunks) - 1
            }))

        # Receive audio: binary frames carry PCM16, text frames carry JSON status
        audio = bytearray()
        async for msg in ws:
            if isinstance(msg, bytes):
                audio += msg
            else:
                data = json.loads(msg)
                if data.get("type") == "error":
                    raise RuntimeError(data.get("message"))
                if data.get("type") == "done":
                    break

    # Save audio (PCM16, 32kHz, mono)
    with wave.open("output.wav", "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(32000)
        f.writeframes(bytes(audio))

asyncio.run(stream_tts())
Note: WebSocket URL: wss://hindi.heypixa.ai/api/v1/ws/synthesize (use ws:// for local testing)

Available Dialects

Value                               Label             Region
hindi_delhi                         Delhi             National Capital Region
hindi_rajasthan                     Rajasthan         Western India
hindi_himachal-pradesh              Himachal Pradesh  Northern India
hindi_haryana                       Haryana           Northern India
hindi_uttarakhand                   Uttarakhand       Northern India
hindi_punjab                        Punjab            Northern India
hindi_bhojpuri-(eastern-up,-bihar)  Bhojpuri          Eastern UP, Bihar
hindi_lucknow-(uttar-pradesh)       Lucknow           Uttar Pradesh
hindi_south-india                   South India       Southern India
hindi_madhya-pradesh                Madhya Pradesh    Central India
hindi_gujarat                       Gujarat           Western India

Emotion Tags

Add emotion/style tags to your text to control the speaking style:

Tag            Description           Example
<calm>         Calm, peaceful tone   <calm>सब ठीक है।
<angry>        Angry expression      <angry>ये क्या किया!
<happy>        Happy, cheerful tone  <happy>बहुत खुशी हुई!
<sad>          Sad, melancholic      <sad>बहुत दुख हुआ...
<slow>         Slow, deliberate      <slow>ध्यान से सुनो...
<threatening>  Threatening tone      <threatening>अभी भुगतान करो।
<sarcastic>    Sarcastic tone        <sarcastic>वाह!
<shouting>     Loud, shouting        <shouting>रुको!
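
In every example above the tag is placed at the very start of the text. A small helper along these lines could build tagged input and catch misspelled tags early. `with_emotion` is a hypothetical name, and whether the API accepts tags mid-sentence or several tags at once is not stated here, so this sketch only prefixes a single tag.

```python
# Tags listed in the table above.
EMOTION_TAGS = {
    "calm", "angry", "happy", "sad",
    "slow", "threatening", "sarcastic", "shouting",
}


def with_emotion(text, tag):
    """Prefix text with a single emotion tag, e.g. 'calm' -> '<calm>...'."""
    if tag not in EMOTION_TAGS:
        raise ValueError(f"unknown emotion tag: {tag!r}")
    return f"<{tag}>{text}"
```

The result can be used as the `text` field of a synthesize request.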
