Luna Hindi TTS API - hindi.heypixa.ai

Luna TTS provides high-quality Hindi speech synthesis with support for multiple dialects, voice customization, and emotion expression. Audio is streamed in real time via Server-Sent Events (SSE) for low-latency playback.

✓ 11 Hindi dialects

✓ Voice customization

✓ Emotion tags

✓ Real-time streaming

✓ Low latency (~300ms)

✓ 32kHz audio

Quick Start

import base64, json, requests, wave

# Make streaming request
response = requests.post(
    "https://hindi.heypixa.ai/api/v1/synthesize",
    json={
        "text": "<calm>नमस्ते, आप कैसे हैं?",
        "dialect": "hindi_delhi",
        "age": "middle-aged"
    },
    stream=True
)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error body as audio

# Collect audio chunks
audio_data = b""
for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: '):
            data = line[6:]
            if data == '[DONE]': break
            obj = json.loads(data)
            if 'audio' in obj:
                audio_data += base64.b64decode(obj['audio'])

# Save to WAV
with wave.open("output.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(32000)
    wav.writeframes(audio_data)

const response = await fetch('https://hindi.heypixa.ai/api/v1/synthesize', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: '<calm>नमस्ते, आप कैसे हैं?',
    dialect: 'hindi_delhi'
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

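// Read the SSE stream; events are separated by a blank line
let finished = false;
while (!finished) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });

  // Cut one complete event at a time. String.split with a limit would
  // silently drop everything after the second piece.
  let sep;
  while ((sep = buffer.indexOf('\n\n')) !== -1) {
    const event = buffer.slice(0, sep);
    buffer = buffer.slice(sep + 2);

    if (event.startsWith('data: ')) {
      const data = event.slice(6);
      if (data === '[DONE]') {
        finished = true;
        break;
      }

      const obj = JSON.parse(data);
      if (obj.audio) {
        // Decode and play audio chunk
      }
    }
  }
}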

curl -X POST "https://hindi.heypixa.ai/api/v1/synthesize" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "<calm>नमस्ते, आप कैसे हैं?",
    "dialect": "hindi_delhi",
    "age": "middle-aged"
  }'

API Endpoints

POST /api/v1/synthesize Stream audio synthesis

Request Body

Parameter	Type	Required	Description
`text`	string	Required	Hindi text to synthesize (max 5000 chars)
`dialect`	string	Optional	Hindi dialect. Default: `hindi_delhi`
`age`	string	Optional	Speaker age. Default: `middle-aged`
`temperature`	float	Optional	Sampling temperature (0.0-2.0). Default: `0.9`
`top_p`	float	Optional	Top-p sampling (0.0-1.0). Default: `0.95`
`repetition_penalty`	float	Optional	Repetition penalty (1.0-2.0). Default: `1.3`
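
The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch of that idea, not part of the API: `build_payload`, `RANGES`, and `MAX_TEXT_CHARS` are hypothetical names, the range bounds are assumed to be inclusive, and the 5000-character limit is assumed to count Unicode code points as Python's `len` does.

```python
# Hypothetical client-side helper; ranges and defaults come from the table above.
# Assumption: range bounds are inclusive and the text limit counts code points.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "repetition_penalty": (1.0, 2.0),
}
MAX_TEXT_CHARS = 5000


def build_payload(text, dialect="hindi_delhi", age="middle-aged", **sampling):
    """Return a request body dict, raising ValueError on out-of-range input."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    payload = {"text": text, "dialect": dialect, "age": age}
    for name, value in sampling.items():
        if name not in RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        payload[name] = value
    return payload
```

The returned dict can be passed as the `json=` argument of `requests.post`. Sampling parameters that are not supplied are left out of the body so the server defaults apply.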

Response (SSE Stream)

200 Audio chunk

{
  "audio": "<base64-encoded-pcm16>",
  "sample_rate": 32000
}
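
To illustrate how a stream of these events turns into audio bytes, here is a small sketch that works on already-decoded SSE lines, so it can be tried without a network call. `decode_sse_lines` is a hypothetical helper name; the `data: ` prefix and the `[DONE]` sentinel are taken from the Quick Start code.

```python
import base64
import json


def decode_sse_lines(lines):
    """Concatenate the PCM16 bytes carried by 'data: ' lines until '[DONE]'."""
    pcm = bytearray()
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel; ignore anything after it
        obj = json.loads(data)
        if "audio" in obj:
            pcm += base64.b64decode(obj["audio"])
    return bytes(pcm)
```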

ℹ️ Audio Format: PCM16 (16-bit signed integer), 32kHz sample rate, mono channel, little-endian.
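
Given this format, one second of audio is 32000 samples of 2 bytes each, or 64000 bytes. The sketch below shows the arithmetic and how the raw bytes map to signed sample values; the function and constant names are illustrative only.

```python
import struct

SAMPLE_RATE = 32000    # Hz
BYTES_PER_SAMPLE = 2   # PCM16
CHANNELS = 1           # mono


def pcm_duration_seconds(pcm: bytes) -> float:
    """Playback length of a raw PCM16 mono buffer."""
    return len(pcm) / (SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)


def pcm_to_samples(pcm: bytes) -> list:
    """Unpack little-endian 16-bit signed samples; a trailing odd byte is ignored."""
    count = len(pcm) // BYTES_PER_SAMPLE
    return list(struct.unpack(f"<{count}h", pcm[:count * BYTES_PER_SAMPLE]))
```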

GET /api/v1/config Get configuration options

Response

{
  "dialects": [{ "value": "hindi_delhi", "label": "Delhi" }, ...],
  "ages": [{ "value": "child", "label": "Child" }, ...],
  "sample_rate": 32000,
  "sampling_defaults": { "temperature": 0.9, ... }
}
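
A client might use this response to populate its dialect and age choices rather than hard-coding them. A minimal sketch, assuming the response has been parsed into a dict; `option_values` is a hypothetical helper, and the sample below contains only the entries shown above, whereas a real response lists more.

```python
def option_values(config: dict, key: str) -> list:
    """Return the 'value' field of each option listed under config[key]."""
    return [option["value"] for option in config.get(key, [])]


# Sample built only from the entries shown in the response above.
sample_config = {
    "dialects": [{"value": "hindi_delhi", "label": "Delhi"}],
    "ages": [{"value": "child", "label": "Child"}],
    "sample_rate": 32000,
}

dialect_values = option_values(sample_config, "dialects")
```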

GET /api/v1/health Health check

Response

{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "backend_status": "healthy"
}
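
A monitoring script could interpret this payload roughly as follows. This is a sketch under assumptions: `"healthy"` is the only value treated as healthy (other possible values are not documented here), the timestamp always has the UTC `Z` form shown in the example, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def is_healthy(payload: dict) -> bool:
    """True only when both the service and its backend report 'healthy'."""
    return (
        payload.get("status") == "healthy"
        and payload.get("backend_status") == "healthy"
    )


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as '2024-01-15T10:30:00Z' as timezone-aware UTC."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
```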

WS /api/v1/ws/synthesize WebSocket for streaming text input & audio output

Protocol

Connect via WebSocket, send JSON messages for configuration and text, and receive audio as binary chunks.

1. Send Configuration (optional)

{
  "type": "config",
  "dialect": "hindi_delhi",
  "age": "middle-aged",
  "temperature": 0.9,
  "top_p": 0.95,
  "repetition_penalty": 1.3
}

2. Stream Text Chunks

{
  "type": "text",
  "content": "नमस्ते, ",
  "is_final": false
}
// Send multiple chunks; set is_final: true on the last chunk
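
The chunking rule can be sketched as a small generator that builds the messages in the shape shown above and flags only the last one as final. `text_messages` is a hypothetical helper name, not part of the API.

```python
import json


def text_messages(chunks):
    """Yield one JSON 'text' message per chunk; only the last has is_final true."""
    chunks = list(chunks)
    for index, chunk in enumerate(chunks):
        yield json.dumps(
            {
                "type": "text",
                "content": chunk,
                "is_final": index == len(chunks) - 1,
            },
            ensure_ascii=False,  # keep Devanagari readable in the serialized message
        )
```

Each yielded string can be passed directly to the WebSocket client's send call.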

3. Receive Audio

Binary frames: Raw PCM16 audio (32kHz, mono, little-endian)

Text frames: Status/error messages as JSON

// Status messages:
{ "type": "status", "message": "synthesizing" }
{ "type": "done", "total_audio_bytes": 123456 }
{ "type": "error", "message": "..." }
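
One way to handle the two frame kinds on the receiving side is a single dispatch function. This is a sketch rather than a prescribed client design: `handle_frame` is a hypothetical name, and raising an exception on an `error` message is a choice made here, not something the API mandates.

```python
import json


def handle_frame(frame, audio: bytearray) -> bool:
    """Process one WebSocket frame; return True once the stream is finished."""
    if isinstance(frame, bytes):
        audio.extend(frame)        # binary frame: raw PCM16 audio
        return False
    message = json.loads(frame)    # text frame: JSON status or error
    kind = message.get("type")
    if kind == "error":
        raise RuntimeError(message.get("message", "unknown error"))
    return kind == "done"          # other status messages keep the loop going
```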

Python Example

import asyncio
import json
import wave

import websockets

async def stream_tts():
    uri = "wss://hindi.heypixa.ai/api/v1/ws/synthesize"
    async with websockets.connect(uri) as ws:
        # Send config (optional)
        await ws.send(json.dumps({
            "type": "config",
            "dialect": "hindi_delhi"
        }))

        # Stream text chunks, marking the last one as final
        chunks = ["नमस्ते, ", "आप कैसे ", "हैं?"]
        for i, chunk in enumerate(chunks):
            await ws.send(json.dumps({
                "type": "text",
                "content": chunk,
                "is_final": i == len(chunks) - 1
            }))

        # Receive audio: binary frames carry PCM16, text frames carry JSON
        audio = b""
        async for msg in ws:
            if isinstance(msg, bytes):
                audio += msg
            else:
                data = json.loads(msg)
                if data.get("type") == "error":
                    raise RuntimeError(data.get("message"))
                if data.get("type") == "done":
                    break

        # Save audio (PCM16, 32kHz, mono)
        with wave.open("output.wav", "wb") as f:
            f.setnchannels(1)
            f.setsampwidth(2)
            f.setframerate(32000)
            f.writeframes(audio)

asyncio.run(stream_tts())

ℹ️ WebSocket URL: wss://hindi.heypixa.ai/api/v1/ws/synthesize (use ws:// for local testing)

Available Dialects

Value	Label	Region
`hindi_delhi`	Delhi	National Capital Region
`hindi_rajasthan`	Rajasthan	Western India
`hindi_himachal-pradesh`	Himachal Pradesh	Northern India
`hindi_haryana`	Haryana	Northern India
`hindi_uttarakhand`	Uttarakhand	Northern India
`hindi_punjab`	Punjab	Northern India
`hindi_bhojpuri-(eastern-up,-bihar)`	Bhojpuri	Eastern UP, Bihar
`hindi_lucknow-(uttar-pradesh)`	Lucknow	Uttar Pradesh
`hindi_south-india`	South India	Southern India
`hindi_madhya-pradesh`	Madhya Pradesh	Central India
`hindi_gujarat`	Gujarat	Western India

Emotion Tags

Add emotion/style tags to your text to control the speaking style:

Tag	Description	Example
`<calm>`	Calm, peaceful tone	`<calm>सब ठीक है।`
`<angry>`	Angry expression	`<angry>ये क्या किया!`
`<happy>`	Happy, cheerful tone	`<happy>बहुत खुशी हुई!`
`<sad>`	Sad, melancholic	`<sad>बहुत दुख हुआ...`
`<slow>`	Slow, deliberate	`<slow>ध्यान से सुनो...`
`<threatening>`	Threatening tone	`<threatening>अभी भुगतान करो।`
`<sarcastic>`	Sarcastic tone	`<sarcastic>वाह!`
`<shouting>`	Loud, shouting	`<shouting>रुको!`
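
A small helper can guard against typos in tag names before text is sent. This is a sketch under assumptions: the tag goes at the start of the text, as in every example in the table, and `with_emotion` and `EMOTION_TAGS` are hypothetical names.

```python
# Tag names from the table above.
EMOTION_TAGS = {
    "calm", "angry", "happy", "sad",
    "slow", "threatening", "sarcastic", "shouting",
}


def with_emotion(tag: str, text: str) -> str:
    """Prefix text with an emotion tag, rejecting tags not listed in the table."""
    if tag not in EMOTION_TAGS:
        raise ValueError(f"unknown emotion tag: {tag}")
    return f"<{tag}>{text}"
```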
