Voice Cloning & Synthesis

POST https://app.dubformer.ai/api/v1/cloning/synthesize

The Voice Cloning & Synthesis API generates natural-sounding speech with emotional transfer. It clones a voice from the reference audio samples you upload and synthesizes speech that matches both the voice characteristics and the emotional tone of those samples.

This endpoint requires a PRO subscription or higher.

curl -X POST https://app.dubformer.ai/api/v1/cloning/synthesize \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "text=Hello world, this is a test of our voice cloning technology!" \
  -F "language=en-US" \
  -F "reference_audio_files=@reference_voice_1.wav" \
  -F "reference_audio_files=@reference_voice_2.wav"

Request Parameters

text (string, required)

The text to synthesize into speech. Maximum length: 5000 characters.
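
The 5000-character limit can be checked locally before any audio is uploaded. The following is a minimal sketch, not part of the API: `check_text_length` is a hypothetical helper name, and it assumes the limit counts characters rather than bytes, which the description above does not spell out.

```shell
# Hypothetical helper: returns 0 when the text fits the documented limit.
# Assumption: the limit is counted in characters, not bytes.
check_text_length() {
  text="$1"
  max_chars=5000
  # wc -m counts characters (not bytes) in the current locale.
  count=$(printf '%s' "$text" | wc -m | tr -d ' ')
  if [ "$count" -gt "$max_chars" ]; then
    echo "text is ${count} characters; limit is ${max_chars}" >&2
    return 1
  fi
}

# Usage: check_text_length "Hello world!" && echo "ok to send"
```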

language (string, required)

The target language and locale for speech synthesis (e.g., “en-US”, “fr-FR”, “es-ES”). Must match one of the supported languages returned by Get Options.

reference_audio_files (file[], required)

One or more audio files containing reference voice samples for cloning.
Supported formats: WAV, MP3, M4A, FLAC
Maximum file size: 50 MB per file
Recommended length: 3-30 seconds for optimal results
Quality requirements: Clear audio with minimal background noise
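
The format and size constraints above can be verified before a request is sent, which avoids uploading a file the server is likely to reject. This is a sketch under stated assumptions: `check_reference_file` is a hypothetical helper, the format is judged by file extension only, and "50 MB" is interpreted as 50 * 1024 * 1024 bytes because the exact unit is not specified. Duration and audio quality are not checked.

```shell
# Hypothetical helper: checks one reference file against the documented constraints.
check_reference_file() {
  file="$1"
  # Assumption: 50 MB means 50 * 1024 * 1024 bytes; the exact unit is not documented.
  max_bytes=$((50 * 1024 * 1024))

  if [ ! -f "$file" ]; then
    echo "not a file: $file" >&2
    return 1
  fi

  # Supported formats: WAV, MP3, M4A, FLAC (checked by extension only).
  case "$file" in
    *.wav|*.WAV|*.mp3|*.MP3|*.m4a|*.M4A|*.flac|*.FLAC) ;;
    *) echo "unsupported format: $file" >&2; return 1 ;;
  esac

  # wc -c reports the size in bytes; tr strips padding some platforms add.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$max_bytes" ]; then
    echo "file too large (${size} bytes): $file" >&2
    return 1
  fi
}

# Usage: check_reference_file reference_voice_1.wav && echo "ok to upload"
```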

Response

Success Response (200)

audio (binary)

Synthesized audio as a binary WAV file.

Content-Type: audio/wav
File Format: 16-bit PCM WAV, 44.1 kHz sample rate
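
Because the body is raw binary, a response saved to disk can be sanity-checked by looking at its container header. The sketch below assumes the standard RIFF/WAVE layout ("RIFF" at bytes 0-3, "WAVE" at bytes 8-11); `is_wav_file` is a hypothetical helper name, and the check does not validate the bit depth or sample rate.

```shell
# Hypothetical helper: returns 0 when the file starts with a RIFF/WAVE header.
is_wav_file() {
  file="$1"
  # Read bytes 0-3 and 8-11 and compare them as hex to avoid binary-in-variable issues.
  riff=$(dd if="$file" bs=1 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')
  wave=$(dd if="$file" bs=1 skip=8 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')
  # 52494646 is "RIFF" and 57415645 is "WAVE" in ASCII hex.
  [ "$riff" = "52494646" ] && [ "$wave" = "57415645" ]
}

# Usage: is_wav_file synthesized_audio.wav || echo "response is not a WAV file" >&2
```

A saved file that fails this check is often an error message rather than audio, so it is worth printing its contents before retrying.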

# Binary WAV audio file is returned
Content-Type: audio/wav
Content-Length: 524288

# Save to file
curl -X POST https://app.dubformer.ai/api/v1/cloning/synthesize \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "text=Hello world!" \
  -F "language=en-US" \
  -F "reference_audio_files=@reference_voice_1.wav" \
  --output synthesized_audio.wav
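
With a plain `--output`, curl writes whatever the server returns, so a failed request can leave an error body in a file named `.wav`. One way to guard against that is sketched below. `synthesize_to_file` and the `DUBFORMER_API_KEY` environment variable are hypothetical names chosen for illustration; the only response behavior assumed is that success is signaled by status 200, as documented above.

```shell
# Hypothetical wrapper: keeps the audio only when the server answers 200.
# DUBFORMER_API_KEY is an assumed environment variable name, not one defined by the API.
synthesize_to_file() {
  out="$1"; text="$2"; language="$3"; reference="$4"

  # -o writes the body to a temporary file; -w prints the HTTP status code to stdout.
  # --form-string sends the value literally, so "@", "<" or ";" in the text are not interpreted.
  status=$(curl -sS -o "${out}.part" -w '%{http_code}' -X POST \
    "https://app.dubformer.ai/api/v1/cloning/synthesize" \
    -H "Authorization: Bearer ${DUBFORMER_API_KEY:-}" \
    --form-string "text=${text}" \
    --form-string "language=${language}" \
    -F "reference_audio_files=@${reference}") || return 1

  if [ "$status" != "200" ]; then
    echo "synthesis failed: HTTP ${status} (body kept in ${out}.part)" >&2
    return 1
  fi
  mv "${out}.part" "$out"
}

# Usage:
#   synthesize_to_file synthesized_audio.wav "Hello world!" en-US reference_voice_1.wav
```

Writing to a `.part` file first means `synthesized_audio.wav` only ever appears when the request succeeded, and the error body stays available for inspection otherwise.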
