Open-source ACE‑Step expands models and audio interfaces
The open-source project ACE‑Step is expanding its model library, community LoRAs, and user interfaces for track generation and LoRA training. Interfaces now include visualizers, timeline editing, and WebGPU-accelerated stem extraction, while the developers maintain both open-source and commercial offerings.
ACE‑Step UI and core features
The ACE‑Step UI adopts a workflow similar to other music generation tools, offering model selection, reference track upload, and a cover mode. Users can adjust voice, BPM, and lyrics, enable an enhancer, and tune typical parameters such as duration, track count, and inference steps.
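As a rough illustration of how the controls above fit together, the following Python sketch groups them into one request object with basic sanity checks. Every name, default, and limit here is hypothetical and chosen for illustration; none of it is taken from the ACE‑Step code base, and the rule that cover mode requires a reference track is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationRequest:
    """Hypothetical container for the UI controls; not ACE-Step's actual API."""
    model: str                              # selected model name
    voice: Optional[str] = None             # voice setting, if any
    lyrics: str = ""
    bpm: Optional[int] = None               # None leaves tempo to the model
    duration_sec: float = 60.0              # illustrative default
    track_count: int = 1                    # tracks generated per request
    inference_steps: int = 30               # illustrative default
    enhancer: bool = False
    reference_track: Optional[str] = None   # path to an uploaded reference
    cover_mode: bool = False

    def validate(self) -> None:
        """Raise ValueError if a field is outside the illustrative limits."""
        if not self.model:
            raise ValueError("a model must be selected")
        if self.bpm is not None and not 30 <= self.bpm <= 300:
            raise ValueError("bpm out of range")
        if self.duration_sec <= 0:
            raise ValueError("duration must be positive")
        if self.track_count < 1:
            raise ValueError("track_count must be at least 1")
        if self.inference_steps < 1:
            raise ValueError("inference_steps must be at least 1")
        # Assumption: a cover is derived from an uploaded reference track.
        if self.cover_mode and self.reference_track is None:
            raise ValueError("cover mode needs a reference track")
```

Validating the whole request before generation starts means a bad value is reported immediately rather than after the models have been loaded.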
Alongside standard controls, the interface exposes advanced options including inference method selection, an LLM-assisted lyrics pipeline, a "Thinking" mode for iterative refinement, and audio inpainting for replacing sections with generated material.
Per-track capabilities
For each track, users can generate a simple video visualizer synchronized to the music, edit audio on a timeline using the open-source AudioMass editor, and extract stems with the open-source Demucs tool. Demucs in this setup runs through WebGPU, enabling browser-accelerated processing without separate native tooling.
Performance, resource use and installation
The package automates downloads and launches the required components in separate windows, simplifying setup for end users. Reported memory use is 4+ GB without an LLM and up to 12 GB when an LLM is enabled, with typical generation times under 6 seconds per track on an RTX 4090.
Batch generation loads multiple items into video memory at once and can exhaust VRAM; running through Comfy, which loads GPU memory dynamically, is one way to avoid such spikes. Installation is possible via an updated Pinokio installer or directly from the project repository.
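The VRAM concern with batch generation can be made concrete with a small planning sketch in Python. It assumes a simple additive memory model: a fixed base cost for the loaded models plus a fixed cost per concurrently generated track. The per-track figure and all names are hypothetical; the article reports only the overall 4+ GB and 12 GB numbers, so real usage should be measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VramBudget:
    """Hypothetical additive memory model; all figures in GB."""
    total_gb: float      # VRAM on the card
    base_gb: float       # resident models (e.g. 4 without an LLM, 12 with one)
    per_item_gb: float   # ASSUMED extra memory per concurrent track


def max_concurrent(budget: VramBudget) -> int:
    """Return how many tracks fit in VRAM at the same time."""
    if budget.per_item_gb <= 0:
        raise ValueError("per_item_gb must be positive")
    free = budget.total_gb - budget.base_gb
    count = int(free // budget.per_item_gb)
    if count < 1:
        raise ValueError("not enough free VRAM for even one track")
    return count


def plan_batches(n_tracks: int, budget: VramBudget) -> List[int]:
    """Split n_tracks into sequential batches that each fit in VRAM."""
    size = max_concurrent(budget)
    full, rest = divmod(n_tracks, size)
    return [size] * full + ([rest] if rest else [])


if __name__ == "__main__":
    # 24 GB card with the LLM enabled; 1.5 GB per track is a made-up figure.
    with_llm = VramBudget(total_gb=24, base_gb=12, per_item_gb=1.5)
    print(plan_batches(20, with_llm))
```

Capping the batch size up front trades some throughput for predictability; dynamic loading, as in the Comfy route, instead adapts at run time.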
Related interfaces and ecosystem
Other interfaces built on ACE‑Step include Side‑Step, which mimics a tape-recorder aesthetic and emphasizes LoRA training workflows, and AceJam, which converts descriptive prompts into audio using a quantized Qwen model together with ACE‑Step backends.
Comfy recently added workflows for ACE‑Step 1.5 XL models aimed at higher-quality outputs, although structural coherence is not always guaranteed. Sound quality currently trails leading tools such as Suno and Udio, but the project already provides a polished UI and a growing open-source ecosystem.
Commercial offerings and outlook
Alongside the open-source releases, the project team also offers commercial services: ACE Music for simple generation and ACE Studio for studio-oriented workflows. The open-source community and the developers continue to advance interfaces and model support across the stack.