Why ResponsiveVoice?
ResponsiveVoice is an open-source, TypeScript-first text-to-speech layer for the web. It is the integration and control layer between your site and one or more voice engines — browser-native, ResponsiveVoice's hosted voices, or premium providers you bring yourself.
What ResponsiveVoice gives you
- Open source, MIT-licensed. The client (`@responsivevoice/core`) is a drop-in replacement for the legacy `responsivevoice.js`: same `speak`/`cancel`/`pause`/`resume` API, now typed and tree-shakeable.
- Browser-native with automatic fallback. Uses the Web Speech API when the device has a matching voice, and falls back to hosted server voices when it doesn't, so one API gives consistent behavior across browsers.
- 100+ voices across 50+ languages, fetched and cached at runtime, so the catalog improves without a package upgrade.
- Bring Your Own Key (BYOK). Route premium voices from Google Cloud, Microsoft Azure, and OpenAI through ResponsiveVoice using your own provider key. You keep the provider relationship and the per-character billing, and gain ResponsiveVoice's browser integration, fallback, streaming, and player features on top.
- Streaming playback. HTTP audio or WebSocket, so speech starts before the full clip is ready.
- Predictable pricing. A free plan plus fixed-tier plans, instead of metering every character.
- REST + WebSocket API documented by an OpenAPI 3.1 specification, for server-side and non-browser use.
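The native-plus-fallback behavior described in the list above can be illustrated with a minimal sketch. This is not the `@responsivevoice/core` implementation or its API: the `VoiceInfo` and `Resolution` types and the `findNativeVoice` and `resolveEngine` functions are hypothetical names chosen for illustration, and the matching rule (exact language tag first, then primary language) is an assumption about what "a matching voice" might mean.

```typescript
// Hypothetical types for illustration only; not the @responsivevoice/core API.
interface VoiceInfo {
  name: string;
  lang: string; // BCP 47 language tag, e.g. "en-US"
}

type Engine = "native" | "hosted";

interface Resolution {
  engine: Engine;
  voice?: VoiceInfo; // set only when a device voice was matched
}

// Return the first device voice whose language matches the request.
// Assumed rule: an exact tag match wins; otherwise accept a voice that
// shares the primary language subtag ("en" for "en-GB").
function findNativeVoice(voices: VoiceInfo[], lang: string): VoiceInfo | undefined {
  const wanted = lang.toLowerCase();
  const primary = wanted.split("-")[0];
  return (
    voices.find((v) => v.lang.toLowerCase() === wanted) ??
    voices.find((v) => v.lang.toLowerCase().split("-")[0] === primary)
  );
}

// Prefer a device voice; fall back to hosted synthesis when none matches.
function resolveEngine(voices: VoiceInfo[], lang: string): Resolution {
  const voice = findNativeVoice(voices, lang);
  return voice ? { engine: "native", voice } : { engine: "hosted" };
}
```

In a browser, the `voices` argument could be populated from `window.speechSynthesis.getVoices()`, whose entries expose `name` and `lang`. Keeping the decision in a pure function like this makes it easy to test without a browser.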
How it compares
The premium providers below produce excellent neural audio. The distinction is that ResponsiveVoice is the integration layer, and with BYOK it can front those same providers rather than competing with them.
| Capability | ResponsiveVoice | ElevenLabs | Azure / Google Cloud |
|---|---|---|---|
| Open source (MIT) | Yes | No | No |
| Drop-in browser script | Yes | Via your own backend | Via your own backend |
| Browser-native + fallback | Yes | Cloud only | Cloud only |
| Premium neural voices | Via BYOK (Azure, OpenAI, Google) | Native | Native |
| Voice cloning | No | Yes | Limited |
| Pricing model | Free plan + fixed tiers | Per-character / subscription | Per-character |
| Streaming | HTTP + WebSocket | Yes | Yes |
When to choose what
- Choose ResponsiveVoice when you want a drop-in, open-source browser TTS with native-plus-fallback behavior, predictable pricing, and the option to bring premium provider voices via BYOK without re-architecting. Typical uses include read-aloud, accessibility, language learning, article narration, and announcements.
- Choose a provider directly when best-in-class voice cloning or maximum expressive realism is the product itself, and you're set up to manage the integration and per-character billing.
Many sites use both: ResponsiveVoice for the browser integration and player, with a premium provider supplied via BYOK for the voices.
Next steps
- Installation: add the script or install from npm.
- Voice Selection: filter and resolve voices.
- REST API Overview: server-side synthesis.