AudioShake, a leader in AI sound separation technology, has launched Multi-Speaker.
The company says it’s a powerful new model designed to separate an unlimited number of speakers into individual audio tracks.
It’s the first model of its kind to achieve multi-speaker separation with high-resolution audio, opening up new creative uses for voice AI, film, podcasts, UGC, and TV content.
Wondercraft has integrated AudioShake’s Multi-Speaker into its audio studio so users can separate generated podcasts from NotebookLM into distinct speaker tracks, giving them more control over the conversation and the final edit.
Multi-Speaker leverages AudioShake’s proprietary AI technology to handle complex audio environments – including crowd dialogue, panel discussions, and fast-paced interviews – separating them into individual speaker streams.
This model allows users to easily isolate individual speakers to improve transcription and captioning accuracy, enable more precise editing workflows, extract clean voice for speech AI tasks, and untangle overlapping dialogue for dubbing and localisation.
“With the launch of Multi-Speaker, we’re pushing the boundaries of what’s possible in sound separation,” said Jessica Powell, CEO of AudioShake. “This model is designed for any professional dealing with complicated audio mixes – whether in broadcasting, film, or even transcription. Multi-Speaker makes it easier than ever to work with voices that were previously impossible to isolate.”
Fabian-Robert Stotter, AudioShake’s Head of Research, emphasised how the new model was designed to handle real-world scenarios: “Separating multiple voices in overlapping situations is one of the most difficult challenges in audio separation. Our team worked to create a solution that is not only robust but accurate, even in highly challenging environments.”
The Multi-Speaker model represents a significant advancement for professionals in the media and content industries. By providing a powerful tool for separating overlapping voices, it enhances both workflow efficiency and audio clarity for uses including:
- Media & Entertainment: Editors can achieve cleaner dialogue tracks, even in chaotic soundscapes, enhancing the listening experience for audiences
- Localisation & Dubbing: Translators and voice-over artists can work with precise, isolated speech tracks, enabling more accurate and natural dubbing, especially in fast-paced or overlapping dialogue scenarios
- Transcription & Captioning Services: Providers can deliver clearer and more accurate transcriptions of conversations for journalism, accessibility, and automated summarisation
- Live Broadcasting & Events: Broadcasters can extract distinct voices for clearer speech during interviews, sports commentary, and panel discussions, improving audience engagement and understanding
- AI Voice Synthesis & Research: Enhanced separation enables more realistic and natural-sounding AI-generated voices, improving user interactions in voice recognition and customer service applications
Multi-Speaker is now available through AudioShake’s web-based platform and API. For inquiries or to experience it firsthand, reach out to info@audioshake.ai.