Synthesize text to speech (GET)
curl "https://texttospeech.responsivevoice.org/v2/text/synthesize?text=Hello+world&voice=US+English+Female" -H 'X-API-Key: YOUR_API_KEY' -H 'X-API-Secret: YOUR_API_SECRET' --output speech.mp3
import { ResponsiveVoiceAPIClient } from '@responsivevoice/api-client';
const client = new ResponsiveVoiceAPIClient({ apiKey: 'YOUR_API_KEY', apiSecret: 'YOUR_API_SECRET' });
const audio = await client.synthesize({ text: 'Hello world', voice: 'US English Female' });
Converts text to speech audio via query parameters. This endpoint is idempotent and CDN-cacheable: responses include Cache-Control headers with long TTLs (1 day client, 7 days CDN). Prefer this over POST for deterministic requests where caching is beneficial. The same parameters always produce the same audio, making it safe for CDN edge caching.
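Because the endpoint is cacheable, it can help to build the request URL the same way every time. The sketch below is one way to do that, not part of the official client: it sorts the query parameters before encoding them so that logically identical requests produce an identical URL. The helper name `buildSynthesizeUrl` is hypothetical, and the idea that the cache keys on the exact URL string is an assumption, not something this page states.

```typescript
// Base URL taken from the curl example above.
const SYNTHESIZE_URL = "https://texttospeech.responsivevoice.org/v2/text/synthesize";

type SynthesisQuery = { [param: string]: string | number | undefined };

// Hypothetical helper: encodes parameters in sorted key order so the same
// inputs always yield the same URL string, regardless of object key order.
function buildSynthesizeUrl(query: SynthesisQuery): string {
  const pairs = Object.keys(query)
    .filter((key) => query[key] !== undefined)
    .sort()
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(query[key]))}`);
  return `${SYNTHESIZE_URL}?${pairs.join("&")}`;
}

// Key order in the object literal does not affect the result.
const exampleSynthesizeUrl = buildSynthesizeUrl({
  voice: "US English Female",
  text: "Hello world",
});
```

Spaces are encoded as `%20` here rather than `+` as in the curl example; both are standard encodings of a space in a query string.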
Authorizations
Parameters
Query Parameters
The text to convert to speech. Supports plain text and SSML for compatible engines.
ResponsiveVoice name (e.g. “UK English Male”). Resolved server-side to engine + lang. Either voice or lang must be provided.
BCP-47 language code (e.g. “en-US”, “pt-BR”, “ja-JP”). Required unless voice is provided.
TTS engine code: g1, g2, g3, g5 (standard engines), gwn (Google Cloud WaveNet, BYOK), msv (Azure Speech, BYOK), oai (OpenAI TTS, BYOK). Defaults to g1.
System voice name for the TTS engine (e.g. “rjs”). Use the “voice” parameter instead for friendly names like “UK English Male”.
Preferred voice gender: “male” or “female”. Used when no specific voice name is provided.
Voice pitch from 0.0 (lowest) to 2.0 (highest). 1.0 is normal.
Speech rate from 0.0 (slowest) to 2.0 (fastest). 1.0 is normal.
Audio volume from 0.0 (silent) to 1.0 (full volume). Default is 1.0.
Output audio format. Defaults to the engine’s native format (mp3 for most, ogg for g5). Supported: g1/g2/g3 → mp3 only; g5 → ogg only; gwn → mp3, ogg; msv/oai → mp3, ogg, wav.
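The constraints described above (voice or lang required, numeric ranges, per-engine formats) can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: the field names `engine` and `format` are guesses at the parameter names, which this page does not show, and `validateSynthesisParams` is a hypothetical helper, not part of the official client. The format check only runs when an engine is given explicitly, since a voice name is resolved to an engine on the server.

```typescript
// Field names other than text, voice, lang, pitch, rate and volume are assumptions.
interface SynthesisParams {
  text: string;
  voice?: string;
  lang?: string;
  engine?: string; // assumed name for the engine-code parameter
  pitch?: number;
  rate?: number;
  volume?: number;
  format?: string; // assumed name for the output-format parameter
}

// Formats each engine accepts, copied from the format description above.
const FORMATS_BY_ENGINE: { [engine: string]: string[] } = {
  g1: ["mp3"], g2: ["mp3"], g3: ["mp3"], g5: ["ogg"],
  gwn: ["mp3", "ogg"],
  msv: ["mp3", "ogg", "wav"], oai: ["mp3", "ogg", "wav"],
};

// True when a value is present but outside [min, max] (NaN counts as outside).
function outOfRange(value: number | undefined, min: number, max: number): boolean {
  return value !== undefined && !(value >= min && value <= max);
}

// Returns a list of human-readable problems; an empty list means no issue found.
function validateSynthesisParams(p: SynthesisParams): string[] {
  const problems: string[] = [];
  if (p.text.length === 0) problems.push("text is required");
  if (!p.voice && !p.lang) problems.push("either voice or lang must be provided");
  if (outOfRange(p.pitch, 0, 2)) problems.push("pitch must be between 0.0 and 2.0");
  if (outOfRange(p.rate, 0, 2)) problems.push("rate must be between 0.0 and 2.0");
  if (outOfRange(p.volume, 0, 1)) problems.push("volume must be between 0.0 and 1.0");
  if (p.engine !== undefined) {
    const formats = FORMATS_BY_ENGINE[p.engine];
    if (formats === undefined) {
      problems.push(`unknown engine code: ${p.engine}`);
    } else if (p.format !== undefined && formats.indexOf(p.format) === -1) {
      problems.push(`engine ${p.engine} does not support format ${p.format}`);
    }
  }
  return problems;
}
```

The server remains the authority on what is valid; a check like this only avoids round trips for obviously malformed requests.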
Responses
Responses
Audio binary data. Content-Type matches the requested format (audio/mpeg, audio/ogg, or audio/wav). Includes RV-Cached header indicating cache provenance and RV-Prosody-Applied listing the prosody knobs (pitch/rate/volume) the server applied upstream.
Headers
Comma-separated subset of {pitch, rate, volume} the server applied upstream for this call. Clients use this to decide whether to apply their own client-side fallback for the remaining knobs.
Maximum requests permitted for the authenticated key per minute.
Requests remaining in the current per-minute window; 0 once the per-minute limit is reached. Reflects the per-minute limit, not the burst check — a burst 429 may report a value above 0.
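To illustrate the client-side fallback decision described for RV-Prosody-Applied, the sketch below parses the header value and returns the knobs the client still has to handle itself. The function name `remainingProsodyKnobs` is hypothetical, and treating a missing or empty header as "nothing was applied" is an assumption.

```typescript
type ProsodyKnob = "pitch" | "rate" | "volume";

// Given the raw RV-Prosody-Applied value and the knobs the caller requested,
// returns the requested knobs the server did not apply upstream.
// Assumption: a missing or empty header means no knob was applied.
function remainingProsodyKnobs(
  headerValue: string | null,
  requested: ProsodyKnob[],
): ProsodyKnob[] {
  const applied = (headerValue === null ? "" : headerValue)
    .split(",")
    .map((part) => part.trim().toLowerCase());
  return requested.filter((knob) => applied.indexOf(knob) === -1);
}
```

In a fetch-based client the header value would typically come from `response.headers.get("RV-Prosody-Applied")`, and the returned knobs would then be applied during playback.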
No content — returned when volume=0 was requested; the server skips upstream synthesis entirely and emits no audio body.
Headers
Always “volume” for 204 responses; the server skipped synthesis.
Maximum requests permitted for the authenticated key per minute.
Requests remaining in the current per-minute window; 0 once the per-minute limit is reached. Reflects the per-minute limit, not the burst check — a burst 429 may report a value above 0.
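Since a successful call can return either an audio body or no body at all (volume=0), client code needs to branch on the status before trying to play anything. A minimal sketch, assuming only the two success statuses described above (200 with audio, 204 without); the type and function names are hypothetical.

```typescript
type SynthesisOutcome =
  | { kind: "audio"; mimeType: string | null } // body holds audio bytes
  | { kind: "silent" }                         // 204: nothing to play
  | { kind: "error"; status: number };         // anything else

// Maps an HTTP status and Content-Type header to what the caller should do next.
function classifySynthesisResponse(
  status: number,
  contentType: string | null,
): SynthesisOutcome {
  if (status === 200) return { kind: "audio", mimeType: contentType };
  if (status === 204) return { kind: "silent" };
  return { kind: "error", status };
}
```

Treating the 204 case as an ordinary, successful outcome avoids attempting to decode an empty body as audio.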
Invalid synthesis request — missing required parameters or values out of range
API error response
Example generated
{ "error": { "message": "example", "code": "example", "statusCode": 1, "errors": "example" }}
API key required
API error response
Example generated
{ "error": { "message": "example", "code": "example", "statusCode": 1, "errors": "example" }}
No provider available for the requested engine code
API error response
Example generated
{ "error": { "message": "example", "code": "example", "statusCode": 1, "errors": "example" }}
Rate limit exceeded. The body code is RATE_LIMIT_EXCEEDED (per-minute tier limit) or BURST_RATE_LIMIT_EXCEEDED (per API key + client IP burst).
API error response
Example generated
{ "error": { "message": "example", "code": "example", "statusCode": 1, "errors": "example" }}
Headers
Maximum requests permitted for the authenticated key per minute.
Requests remaining in the current per-minute window; 0 once the per-minute limit is reached. Reflects the per-minute limit, not the burst check — a burst 429 may report a value above 0.
Seconds to wait before retrying. Dynamic — varies with how far over the limit the request went; honor this header rather than assuming a fixed delay.
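One way to act on a 429 is to read the error code from the body and take the wait time from the retry header instead of a fixed delay. The sketch below assumes the error envelope shown in the generated examples. The helper names are hypothetical, and because this page does not show the header's name, the functions take the raw header value as an argument; a name such as `Retry-After` would be an assumption.

```typescript
// Shape of the error body, following the generated example on this page.
interface ApiErrorBody {
  error: { message: string; code: string; statusCode: number; errors?: unknown };
}

// Converts the retry header (seconds) to milliseconds.
// Falls back to fallbackMs when the header is missing or not a usable number.
function retryDelayMs(retryHeader: string | null, fallbackMs: number): number {
  if (retryHeader === null || retryHeader.trim() === "") return fallbackMs;
  const seconds = Number(retryHeader);
  if (!isFinite(seconds) || seconds < 0) return fallbackMs;
  return Math.ceil(seconds * 1000);
}

// Distinguishes the two documented 429 codes.
function rateLimitKind(body: ApiErrorBody): "per-minute" | "burst" | "unknown" {
  switch (body.error.code) {
    case "RATE_LIMIT_EXCEEDED":
      return "per-minute";
    case "BURST_RATE_LIMIT_EXCEEDED":
      return "burst";
    default:
      return "unknown";
  }
}
```

Separating the two codes matters for diagnostics: as noted above, a burst 429 may still report remaining requests above 0, so the remaining-requests header alone does not explain why a request was rejected.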
Upstream TTS provider returned an error during synthesis
API error response
Example generated
{ "error": { "message": "example", "code": "example", "statusCode": 1, "errors": "example" }}