AudioShake, a leader in AI sound separation technology, has launched Multi-Speaker.
The company says it's a powerful new model designed to separate an unlimited number of speakers into individual audio tracks.
It's the first model of its kind to achieve multi-speaker separation with high-resolution audio, opening up new creative uses for voice AI, film, podcasts, UGC, and TV content.
Wondercraft has integrated AudioShake's Multi-Speaker into its audio studio so users can separate generated podcasts from NotebookLM into distinct speaker tracks, giving them more control over the conversation and final edit.
Multi-Speaker leverages AudioShake's proprietary AI technology to handle complex audio environments – including crowd scenes, panel discussions, and fast-paced interviews – separating them into individual speaker streams.
The model lets users easily isolate individual speakers to improve transcription and captioning accuracy, enable more precise editing workflows, extract voices for speech AI tasks, and clean up overlapping dialogue for dubbing and localisation.
“With the launch of Multi-Speaker, we're pushing the boundaries of what's possible in sound separation,” said Jessica Powell, CEO of AudioShake. “This model is designed for any professional dealing with complicated audio mixes – whether in broadcasting, film, or even transcription. Multi-Speaker makes it easier than ever to work with voices that were previously impossible to isolate.”
Fabian-Robert Stöter, AudioShake's Head of Research, emphasised how the new model was designed to handle real-world scenarios: “Separating multiple voices in overlapping situations is one of the most difficult challenges in audio separation. Our team worked to create a solution that is not only robust but accurate, even in highly challenging environments.”
The Multi-Speaker model represents a significant advancement for professionals in the media and content industries. By providing a powerful tool for separating overlapping voices, it enhances both workflow efficiency and audio clarity for uses including:
- Media & Entertainment: Achieve cleaner dialogue tracks, even in chaotic soundscapes, enhancing the overall listening experience for audiences
- Localisation & Dubbing: Translators and voice-over artists can work with precise, isolated speech tracks, enabling more accurate and natural dubbing, especially in fast-paced or overlapping dialogue scenarios
- Transcription & Captioning Services: Provide clearer and more accurate transcriptions of conversations for journalism, accessibility, and automated summarisation purposes
- Live Broadcasting & Events: Broadcasters can extract distinct voices for clearer speech during interviews, sports commentary, and panel discussions, improving audience engagement and understanding
- AI Voice Synthesis & Research: Enhanced separation allows for more realistic and natural-sounding AI-generated voices, improving user interactions and applications in voice recognition and customer service
Multi-Speaker is now available through AudioShake's web-based platform and API. For inquiries or to experience it firsthand, reach out to info@audioshake.ai.
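For a sense of how an API-driven separation workflow might look, here is a minimal Python sketch of a client assembling a job request for a speaker-separation service. The endpoint path, parameter names, and payload schema below are hypothetical illustrations, not AudioShake's actual interface – consult AudioShake's API documentation for the real details.

```python
import json

def build_separation_request(audio_url, base_url="https://api.example.com"):
    """Assemble an HTTP request for a multi-speaker separation job.

    All field names here (input, model, output_format) are a
    hypothetical schema used purely for illustration.
    """
    return {
        "url": f"{base_url}/jobs",
        "headers": {
            "Authorization": "Bearer <YOUR_API_KEY>",  # placeholder credential
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "input": audio_url,        # source recording to separate
            "model": "multi-speaker",  # one stem per detected speaker
            "output_format": "wav",
        }),
    }

# Build (but do not send) a request for a panel-discussion recording.
req = build_separation_request("https://example.com/panel_discussion.wav")
print(req["url"])
```

The request could then be submitted with any HTTP client, with the service returning one audio stem per detected speaker for downstream editing, dubbing, or transcription.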