ByteDance releases multi-speaker audio generator SeedAudio 1.0

2049.news · 26.06.2026, 08:15:03

ByteDance has introduced SeedAudio 1.0, an audio generation model capable of producing speech, effects and music within the same scene.

Model capabilities

The system can synthesize multiple speakers in one mix and accepts up to 3 audio references for voice, emotion and character guidance. Users may provide a text prompt, an example recording or an image of a character to generate a voice that matches the supplied references.

Demonstrations and observed quality

One released clip shows a dubbing example derived from Seedance 2. Its author reported improved results but did not publish the original, so a direct comparison is not possible. Overall, environmental sounds such as a bottle on a table align well with the visuals and contribute to cohesive scenes.

Speech sounds generally natural for several characters, though one female speaker has a robotic tone on her first line and a more natural tone on her second, which points to inconsistent voice quality from line to line. If temporal stability and lip synchronization are refined and integrated, the system could compete more directly with established voice platforms.

Availability and pricing

SeedAudio 1.0 is currently distributed exclusively through Fal. The released tier is priced at $0.075 per minute.
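
For a rough sense of what that rate means in practice, the arithmetic can be sketched in a few lines of Python. This is an illustrative estimate only: the function name is hypothetical, and it assumes billing scales linearly with audio duration, which the announcement does not confirm (the provider may round up to whole minutes or bill on other terms).

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate quoted in the article for the released tier, in US dollars per minute.
PRICE_PER_MINUTE_USD = Decimal("0.075")


def estimate_cost_usd(duration_seconds: float) -> Decimal:
    """Estimate the cost of generating audio of the given length.

    Assumption: cost is prorated linearly per second. Actual billing
    rules are not described in the source.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    cost = minutes * PRICE_PER_MINUTE_USD
    # Round to a hundredth of a cent for display.
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for seconds in (20, 90, 600):
        print(f"{seconds:>4} s -> ${estimate_cost_usd(seconds)}")
```

Under that linear assumption, a ten-minute track would come to $0.75 and a 90-second clip to a little over 11 cents.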

