ACE-Step 1.5 XL is a free, open-source music model that beats Suno on benchmarks
ModelScope has released ACE-Step 1.5 XL, an open-source music generation model that outscores Suno v5 on audio quality benchmarks and supports everything from text-to-music to style transfer and vocal extraction.
ACE-Step 1.5 XL pairs a 4B-parameter diffusion transformer decoder with a language model planner, and covers text-to-music generation, style transfer, vocal extraction, and track layering in a single package.
On SongEval, it scores 4.79 for overall quality against Suno v5’s 4.72, and leads all reported models on style alignment at 47.9. The team also claims generation that is 10 to 120x faster than some alternatives, depending on hardware and variant.
The release comes with three variants:
- XL Base: The most flexible option, covering all supported tasks including advanced editing and layer manipulation. Best suited for fine-tuning and LoRA training.
- XL SFT: Supervised fine-tuned for highest audio quality. Uses classifier-free guidance for tighter prompt adherence.
- XL Turbo: Distilled to 8 inference steps with no CFG required. Fastest of the three, built for quick iteration.
All three run on a minimum of 12GB VRAM, with 20 to 24GB recommended for full quality. Smaller non-XL variants are available for under 4GB VRAM.
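As a rough illustration of those VRAM figures, the sketch below maps a GPU's memory to a tier. The function name `recommend_tier` and the tier labels are hypothetical and not part of ACE-Step; only the 12GB and 20GB thresholds come from the numbers quoted above.

```python
# Hypothetical helper: not part of the ACE-Step codebase.
# Thresholds are the VRAM figures quoted in the article.
XL_MIN_GB = 12          # stated minimum for the XL variants
XL_RECOMMENDED_GB = 20  # lower end of the 20 to 24GB recommendation


def recommend_tier(vram_gb: float) -> str:
    """Return an illustrative tier label for a given amount of VRAM in GB."""
    if vram_gb < 0:
        raise ValueError("vram_gb must be non-negative")
    if vram_gb >= XL_RECOMMENDED_GB:
        return "xl-full-quality"
    if vram_gb >= XL_MIN_GB:
        return "xl-minimum"
    # Below the XL minimum, the smaller non-XL variants are the fallback.
    return "non-xl"


if __name__ == "__main__":
    for gb in (8, 12, 24):
        print(f"{gb}GB -> {recommend_tier(gb)}")
```

By this reading, a 24GB card such as an RTX 4090 lands in the full-quality tier, while an 8GB card falls back to the non-XL variants.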
The model supports LoRA fine-tuning from a small number of songs, which lets users train it toward a specific style or voice without starting from scratch. The weights are released under an MIT license, and the team says training data consists of licensed and royalty-free material.
Open-source music generation has been steadily closing the gap with commercial tools. A model that matches or beats Suno on benchmarks while running locally is a meaningful step in that direction.
Weights are on Hugging Face.