Synthesize text to speech (GET)

GET
/v2/text/synthesize
curl "https://texttospeech.responsivevoice.org/v2/text/synthesize?text=Hello+world&voice=US+English+Female" -H 'X-API-Key: YOUR_API_KEY' -H 'X-API-Secret: YOUR_API_SECRET' --output speech.mp3

Converts text to speech audio via query parameters. This endpoint is idempotent and CDN-cacheable: responses include Cache-Control headers with long TTLs (1 day for clients, 7 days for the CDN). The same parameters always produce the same audio, which makes responses safe to cache at the CDN edge. Prefer this endpoint over POST for deterministic requests where caching is beneficial.
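As a rough illustration of building the request URL, the sketch below encodes the query parameters with Python's standard library. The helper name `build_synthesize_url` is hypothetical. Sorting the parameters is a cautious assumption: the source does not say whether the CDN normalizes parameter order, so a stable order simply keeps the URL identical for identical inputs.

```python
from urllib.parse import urlencode

BASE_URL = "https://texttospeech.responsivevoice.org/v2/text/synthesize"


def build_synthesize_url(text, **params):
    """Return a GET URL with the query parameters in a stable (sorted) order."""
    query = {"text": text}
    # Drop unset parameters so they do not change the URL.
    query.update({k: v for k, v in params.items() if v is not None})
    # urlencode percent-encodes values and turns spaces into "+".
    return BASE_URL + "?" + urlencode(sorted(query.items()))


url = build_synthesize_url("Hello world", voice="US English Female")
print(url)
```

The printed URL matches the one used in the curl example above; the API key and secret still have to be sent as request headers.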

text
required
string

The text to convert to speech. Supports plain text and SSML for compatible engines.

voice
string

ResponsiveVoice name (e.g. “UK English Male”). Resolved server-side to engine + lang. Either voice or lang must be provided.

lang
string

BCP-47 language code (e.g. “en-US”, “pt-BR”, “ja-JP”). Required unless voice is provided.

engine
string
Allowed values: g1 g2 g3 g5 gwn msv oai

TTS engine code: g1, g2, g3, g5 (standard engines), gwn (Google Cloud WaveNet, BYOK), msv (Azure Speech, BYOK), oai (OpenAI TTS, BYOK). Defaults to g1.

name
string

System voice name for the TTS engine (e.g. “rjs”). Use the “voice” parameter instead for friendly names like “UK English Male”.

gender
string
Allowed values: male female

Preferred voice gender: “male” or “female”. Used when no specific voice name is provided.

pitch
number | null
<= 2

Voice pitch from 0.0 (lowest) to 2.0 (highest). 1.0 is normal.

rate
number | null
<= 2

Speech rate from 0.0 (slowest) to 2.0 (fastest). 1.0 is normal.

volume
number | null
<= 1

Audio volume from 0.0 (silent) to 1.0 (full volume). Default is 1.0.

format
string
Allowed values: mp3 ogg wav

Output audio format. Defaults to the engine’s native format (mp3 for most, ogg for g5). Supported: g1/g2/g3 → mp3 only; g5 → ogg only; gwn → mp3, ogg; msv/oai → mp3, ogg, wav.
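The parameter constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not the server's actual validation: the function name `validate_params` and the wording of the messages are made up for illustration. The format check only runs when `engine` is given explicitly, because a `voice` name is resolved to an engine on the server.

```python
# Formats each engine supports, as listed in the format description.
ENGINE_FORMATS = {
    "g1": {"mp3"}, "g2": {"mp3"}, "g3": {"mp3"},
    "g5": {"ogg"},
    "gwn": {"mp3", "ogg"},
    "msv": {"mp3", "ogg", "wav"},
    "oai": {"mp3", "ogg", "wav"},
}
# Documented ranges for the prosody parameters.
RANGES = {"pitch": (0.0, 2.0), "rate": (0.0, 2.0), "volume": (0.0, 1.0)}


def validate_params(params):
    """Return a list of problems found in a dict of query parameters."""
    problems = []
    if not params.get("text"):
        problems.append("text is required")
    if not params.get("voice") and not params.get("lang"):
        problems.append("either voice or lang must be provided")
    engine = params.get("engine")
    if engine is not None and engine not in ENGINE_FORMATS:
        problems.append(f"unknown engine: {engine}")
    fmt = params.get("format")
    if fmt is not None and engine in ENGINE_FORMATS and fmt not in ENGINE_FORMATS[engine]:
        problems.append(f"format {fmt} is not supported by engine {engine}")
    gender = params.get("gender")
    if gender is not None and gender not in ("male", "female"):
        problems.append("gender must be male or female")
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must be between {low} and {high}")
    return problems


print(validate_params({"text": "hi", "engine": "g5", "format": "mp3", "lang": "en-US"}))
```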

Audio binary data. The Content-Type matches the requested format (audio/mpeg, audio/ogg, or audio/wav). The response includes an RV-Cached header indicating cache provenance and an RV-Prosody-Applied header listing the prosody knobs (pitch/rate/volume) the server applied upstream.

Media type audio/mpeg
string format: binary
RV-Prosody-Applied
required
string

Comma-separated subset of {pitch, rate, volume} the server applied upstream for this call. Clients use this to decide whether to apply their own client-side fallback for the remaining knobs.

X-RateLimit-Limit
required
integer

Maximum requests permitted for the authenticated key per minute.

X-RateLimit-Remaining
required
integer

Requests remaining in the current per-minute window; 0 once the per-minute limit is reached. Reflects the per-minute limit, not the burst check — a burst 429 may report a value above 0.
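One way a client might use RV-Prosody-Applied is to work out which requested prosody settings still need a client-side fallback. This is a sketch under assumptions: the helper name `remaining_prosody` is hypothetical, and trimming whitespace and lowercasing the header entries is defensive handling that the source does not specify.

```python
PROSODY_KNOBS = ("pitch", "rate", "volume")


def remaining_prosody(requested, applied_header):
    """Return the requested prosody settings the server did not apply."""
    # The header is a comma-separated subset of {pitch, rate, volume}.
    applied = {
        part.strip().lower()
        for part in applied_header.split(",")
        if part.strip()
    }
    return {
        name: value
        for name, value in requested.items()
        if name in PROSODY_KNOBS and value is not None and name not in applied
    }


requested = {"pitch": 1.2, "rate": 0.9, "volume": 0.5}
print(remaining_prosody(requested, "pitch, rate"))
```

Here only `volume` is left for the client to apply, for example by adjusting playback gain.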

No content — returned when volume=0 was requested; the server skips upstream synthesis entirely and emits no audio body.

RV-Prosody-Applied
required
string

Always “volume” for 204 responses — server skipped synthesis.

X-RateLimit-Limit
required
integer

Maximum requests permitted for the authenticated key per minute.

X-RateLimit-Remaining
required
integer

Requests remaining in the current per-minute window; 0 once the per-minute limit is reached. Reflects the per-minute limit, not the burst check — a burst 429 may report a value above 0.
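Because a volume=0 request returns no body, client code has to treat "no audio" as a normal outcome rather than an error. The sketch below shows one way to do that. The function name `audio_from_response` is hypothetical, and it assumes the audio response uses status 200, which the source implies but does not state.

```python
# File extensions for the documented audio media types.
CONTENT_TYPE_EXTENSIONS = {
    "audio/mpeg": ".mp3",
    "audio/ogg": ".ogg",
    "audio/wav": ".wav",
}


def audio_from_response(status, content_type, body):
    """Return (audio_bytes, extension), or None when the server sent no audio."""
    if status == 204:
        # volume=0 was requested: synthesis was skipped and there is no body.
        return None
    if status != 200:  # assumption: successful audio responses use 200
        raise ValueError(f"unexpected status {status}")
    # Ignore any parameters after the media type, e.g. "; codecs=...".
    media_type = content_type.split(";")[0].strip().lower()
    extension = CONTENT_TYPE_EXTENSIONS.get(media_type)
    if extension is None:
        raise ValueError(f"unexpected content type {content_type!r}")
    return body, extension


print(audio_from_response(204, "", b""))
```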

Invalid synthesis request — missing required parameters or values out of range

Media type application/json

API error response

object
  error
  required
  object
    message
    required
    string
    code
    string
    statusCode
    required
    number
    errors
Example generated
{
  "error": {
    "message": "example",
    "code": "example",
    "statusCode": 1,
    "errors": "example"
  }
}
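The error body can be turned into a small typed value on the client. This is a minimal sketch: the `ApiError` class and `parse_api_error` function are hypothetical names, and since the schema gives no type for `errors`, it is passed through unchanged.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ApiError:
    message: str            # required in the schema
    status_code: float      # "statusCode", required, typed as number
    code: Optional[str] = None
    errors: Any = None      # no type is given in the schema


def parse_api_error(body: str) -> ApiError:
    """Parse the JSON error body returned with error responses."""
    error = json.loads(body)["error"]
    return ApiError(
        message=error["message"],
        status_code=error["statusCode"],
        code=error.get("code"),
        errors=error.get("errors"),
    )


sample = '{"error": {"message": "example", "code": "example", "statusCode": 1, "errors": "example"}}'
print(parse_api_error(sample))
```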

API key required

Media type application/json

API error response


No provider available for the requested engine code

Media type application/json

API error response


Rate limit exceeded. The body code is RATE_LIMIT_EXCEEDED (per-minute tier limit) or BURST_RATE_LIMIT_EXCEEDED (per API key + client IP burst).

Media type application/json

API error response

X-RateLimit-Limit
required
integer

Maximum requests permitted for the authenticated key per minute.

X-RateLimit-Remaining
required
integer

Requests remaining in the current per-minute window; 0 once the per-minute limit is reached. Reflects the per-minute limit, not the burst check — a burst 429 may report a value above 0.

Retry-After
required
integer

Seconds to wait before retrying. Dynamic — varies with how far over the limit the request went; honor this header rather than assuming a fixed delay.
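A client could honor Retry-After along the lines below. This is a sketch under assumptions: the helper names are hypothetical, and the 1-second fallback for a missing or malformed header is an illustrative default, not a documented value.

```python
def retry_delay_seconds(status, headers, default=1.0):
    """Return seconds to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered.get("retry-after")))
    except (TypeError, ValueError):
        # Header missing or not a number: fall back to the assumed default.
        return default


def classify_rate_limit(code):
    """Map the body's error code to the kind of limit that was hit."""
    kinds = {
        "RATE_LIMIT_EXCEEDED": "per-minute",
        "BURST_RATE_LIMIT_EXCEEDED": "burst",
    }
    return kinds.get(code, "unknown")


print(retry_delay_seconds(429, {"Retry-After": "7"}))
print(classify_rate_limit("BURST_RATE_LIMIT_EXCEEDED"))
```

Telling the two codes apart matters because X-RateLimit-Remaining can be above 0 on a burst 429, so the remaining count alone does not show whether a retry is safe.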

Upstream TTS provider returned an error during synthesis

Media type application/json

API error response
