ElevenLabs capabilities
11 mapped capabilities, each graded and dated. This is the diagnosis — the migration guide is the cure.
Capabilities
AI Dubbing
provisional · verified 2 days ago
Translates and re-voices audio or video content into 90+ languages, preserving speaker identity. Offers automatic dubbing (Dubbing v2) and manual Dubbing Studio for fine-grained editing.
AI Music Generation
provisional · verified 2 days ago
Generates original songs with vocals and instrumentals from text prompts using the Music v2 model. Supports genre, mood, and structural customization including mid-track transitions.
AI Sound Effects Generation
provisional · verified 2 days ago
Generates custom royalty-free sound effects and ambient audio from text prompts using a dedicated AI model. Returns multiple distinct samples per generation.
Conversational AI Agents (ElevenAgents)
provisional · verified 2 days ago
Platform for building and deploying real-time voice and chat agents that combine speech recognition, configurable LLMs, and low-latency TTS. Formerly called Conversational AI.
Developer API and SDKs
provisional · verified 2 days ago
REST API exposing all ElevenLabs capabilities (TTS, STT, voice cloning, sound effects, music, dubbing, voice agents, speech-to-speech) with official SDKs for Python, TypeScript/JavaScript, Flutter, Swift, and Kotlin.
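As a rough illustration of the REST interface described above, the sketch below builds (but does not send) a text-to-speech request using only the Python standard library. The base URL, the endpoint path, the `xi-api-key` header, the default `model_id`, and the placeholder voice ID are all assumptions that this page does not state. Check the official API reference before relying on any of them.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"  # assumed base URL, not given on this page


def build_tts_request(api_key: str, voice_id: str, text: str,
                      model_id: str = "eleven_multilingual_v2") -> urllib.request.Request:
    """Build a POST request for text-to-speech without sending it.

    The path, header name, and body fields are assumptions for illustration.
    """
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_tts_request("YOUR_API_KEY", "YOUR_VOICE_ID", "Hello from the API.")
    print(req.get_method(), req.full_url)
    # Sending requires network access and a valid key, for example:
    # with urllib.request.urlopen(req) as resp:
    #     audio_bytes = resp.read()
```

Building the request separately from sending it keeps the example inspectable offline. In practice one of the official SDKs would likely replace this hand-rolled call.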
Pricing Plans and Commercial Rights
provisional · verified 2 days ago
Six self-serve subscription tiers plus Enterprise, governing monthly credit allowances, commercial use rights, voice clone slots, audio quality, and workspace seats.
Speech-to-Text (Scribe)
provisional · verified 2 days ago
AI transcription system (Scribe) converting audio and video to text with speaker diarization, word-level timestamps, and non-speech event tagging across 90+ languages.
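To make "speaker diarization with word-level timestamps" concrete, here is a minimal sketch of how such output might be folded into speaker turns. The `Word` and `Turn` types and their field names (`text`, `start`, `end`, `speaker_id`) are hypothetical and chosen for illustration. They are not taken from this page and may not match the actual Scribe response format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Word:
    text: str
    start: float      # seconds from the beginning of the audio
    end: float        # seconds from the beginning of the audio
    speaker_id: str   # hypothetical diarization label


@dataclass
class Turn:
    speaker_id: str
    start: float
    end: float
    text: str


def group_into_turns(words: Iterable[Word]) -> List[Turn]:
    """Merge consecutive words from the same speaker into a single turn."""
    turns: List[Turn] = []
    for w in words:
        if turns and turns[-1].speaker_id == w.speaker_id:
            # Same speaker keeps talking: extend the current turn.
            last = turns[-1]
            last.end = w.end
            last.text = f"{last.text} {w.text}"
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append(Turn(w.speaker_id, w.start, w.end, w.text))
    return turns
```

Given words labelled with two alternating speakers, the function returns one `Turn` per uninterrupted run, each carrying the start time of its first word and the end time of its last.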
Studio (Long-Form Audio/Video Editor)
provisional · verified 2 days ago
Timeline-based end-to-end production environment for creating audiobooks, narrated videos, and long-form audio with AI voice, music, sound effects, and captions.
Text-to-Speech
provisional · verified 2 days ago
Converts text into lifelike, emotionally expressive speech using multiple AI models. Supports multilingual synthesis, inline audio tags for emotion and delivery control, and real-time streaming.
Voice Cloning
provisional · verified 2 days ago
Replicates a speaker's voice from audio samples in two tiers: Instant Voice Cloning (IVC) for rapid prototyping from short samples, and Professional Voice Cloning (PVC) for near-indistinguishable results via fine-tuned model training.