Google integrates Lyria 3 into Gemini for music from images and video
Google added Lyria 3, a DeepMind music-generation model, to Gemini, allowing users to create short tracks from images and videos.
Model capabilities
Lyria 3 produces short musical pieces of up to 30 seconds, combining melody, harmony and rhythm derived from visual input or text prompts.
The model supports custom lyrics and a broad range of genres, from jingles to ambient compositions.
- Image input: upload a photo and optionally add style descriptors to guide tempo, instrumentation, mood and lyrical themes.
- Video input: short clips inform rhythmic patterns and arrangement, enabling music that matches motion and scene dynamics.
- Text-only: users may request a track from a textual prompt alone; the model can generate both the musical structure and accompanying lyrics.
Access and content tracing
The feature is currently available in beta to users aged 18+ via the Gemini application on supported platforms.
Each generated track carries an inaudible watermark that allows the audio to be automatically identified as AI-generated in later analysis.
Google says the watermark enables attribution of the content without exposing user data or degrading audio quality.
Use cases and guidelines
The integration expands Gemini's multimodal capabilities, enabling creators to turn images and videos into short musical pieces for projects and prototypes.
Details on licensing, permitted use and content policies for generated audio are provided through Gemini's terms and usage documentation.