Nari-labs releases open-source Dia2 streaming speech generator

2049.news · 25.11.2025, 13:20:01

Nari-labs published an open-source speech generation model called Dia2 that supports streaming and per-speaker voice samples.

Model overview

Dia2 provides a streaming mode that can begin producing audio from the first words without waiting for full text preprocessing, enabling lower perceived latency for interactive scenarios.
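The difference between waiting for the full text and streaming can be illustrated with a toy sketch. Nothing here is Dia2's actual API: `fake_synthesize`, `batch_tts`, `streaming_tts`, and `counting_source` are hypothetical names, and the "audio" is just placeholder bytes. The sketch only counts how many words must be read before the first audio chunk is available.

```python
from typing import Iterable, Iterator, List


def fake_synthesize(word: str) -> bytes:
    """Hypothetical stand-in for a speech model: one zero byte per character."""
    return bytes(len(word))


def batch_tts(words: Iterable[str]) -> List[bytes]:
    """Read the entire text first, then return all audio chunks at once."""
    all_words = list(words)  # forces the whole input to be consumed up front
    return [fake_synthesize(w) for w in all_words]


def streaming_tts(words: Iterable[str]) -> Iterator[bytes]:
    """Yield an audio chunk as soon as each word becomes available."""
    for w in words:
        yield fake_synthesize(w)


def counting_source(words: List[str], log: List[str]) -> Iterator[str]:
    """Text source that records every word it has handed out so far."""
    for w in words:
        log.append(w)
        yield w


text = ["hello", "from", "a", "streaming", "model"]

# Streaming: ask for the first chunk only, then check how much text was read.
log_stream: List[str] = []
first_chunk = next(streaming_tts(counting_source(text, log_stream)))
words_read_streaming = len(log_stream)

# Batch: the first chunk only exists after the whole text has been read.
log_batch: List[str] = []
chunks = batch_tts(counting_source(text, log_batch))
words_read_batch = len(log_batch)

print(f"streaming: first audio after {words_read_streaming} word(s)")
print(f"batch:     first audio after {words_read_batch} word(s)")
```

In a real system the saving shows up as time to first audio rather than a word count, and it depends on how the model chunks its input.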

  • Variants: 1B and 2B model sizes.
  • Language: English only, with up to 2 minutes of audio per generation; Russian is not supported.
  • Hardware: designed to run on GPUs with around 8 GB VRAM or less.
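As a rough plausibility check on the hardware figure, the memory needed for the weights alone can be estimated as parameter count times bytes per parameter. This sketch assumes "1B" and "2B" mean one and two billion parameters and that weights are stored in 16-bit precision; neither is stated in the announcement. Activations, caches, and any audio codec are not counted, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate memory for model weights only, in GiB.

    bytes_per_param=2 assumes 16-bit weights (an assumption, not a
    documented property of Dia2).
    """
    return num_params * bytes_per_param / 2**30


# Parameter counts are inferred from the variant names, not confirmed.
variants = {"1B": 1e9, "2B": 2e9}

for name, params in variants.items():
    print(f"{name}: about {weight_memory_gib(params):.2f} GiB for weights")
```

Under these assumptions both variants leave headroom within 8 GB for runtime buffers, which is consistent with the stated target.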

Behavior and limitations

Early testing shows variable outputs: Dia2 may introduce unsolicited words, exhibit inconsistent loudness, and deliver speech with rapid pacing and reduced pausing.

Developers note that stable speaker rendering requires either supplying per-speaker audio samples as prefixes or fine-tuning the model on target voices.

License and ecosystem

The project is released under the Apache 2.0 license, a permissive license intended to facilitate commercial and community adoption of the code and models.

While Nari-labs positions Dia2 as less mature in quality than established commercial offerings, the maintainers expect community contributions to address current instability and improve naturalness over time.

