Fish Audio releases S2-Pro as an open-source alternative to ElevenLabs
Fish Audio has launched a new open-source text-to-speech model. The system runs locally on consumer hardware and offers fine-grained emotional control over generated voices.
Fish-themed AI projects are having a massive week on GitHub. Following the recent breakout of the MiroFish prediction engine, Fish Audio (not related to MiroFish) has officially released Fish Speech S2. This open-source text-to-speech model competes directly with paid services like ElevenLabs.
Key Takeaways:
- The hardware requirements: Developers can run this model locally. The base setup requires a consumer GPU with 8 to 12 GB of VRAM.
- The performance specs: The system is built for speed. It hits a real-time factor of 0.195 on an Nvidia H200 GPU, which means generating audio takes roughly a fifth of the time it takes to play it back. The time-to-first-audio drops below 100 milliseconds for conversational streaming.
- The architecture: The model uses a Dual-AR design. A 4-billion-parameter component processes high-level semantics. A smaller 400-million-parameter component then generates the acoustic details.
- The precise control: The model recognizes over 15,000 unique inline tags. Users type commands like [whisper] or [clears throat] directly into the text prompt. The system executes these cues with a 93% activation success rate.
- The voice cloning: Users can clone voices from a three-second audio sample. The system supports multi-speaker dialogue in a single pass. It handles natural interruptions and emotion changes seamlessly.
- The availability: The entire project is open-source. Developers can access the weights on Hugging Face and the code on GitHub. A live playground is available on the official website.
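
Two of the figures above are easier to weigh with a little arithmetic. The Python sketch below is illustrative only and does not use Fish Audio's code or API. It assumes the usual definition of real-time factor (generation time divided by audio duration), and it treats any text in square brackets as an inline tag, which is a guess based on the [whisper] and [clears throat] examples. The function names are hypothetical.

```python
import re

# Real-time factor reported in the article for an Nvidia H200.
RTF_H200 = 0.195

def synthesis_seconds(audio_seconds: float, rtf: float = RTF_H200) -> float:
    """Estimate compute time, assuming RTF = generation time / audio duration."""
    return audio_seconds * rtf

def speedup(rtf: float = RTF_H200) -> float:
    """How many times faster than playback the generation runs."""
    return 1.0 / rtf

# Assumption: an inline tag is any bracketed span without nested brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")

def extract_tags(prompt: str) -> list[str]:
    """Return the inline tags found in a prompt, in order of appearance."""
    return [tag.strip() for tag in TAG_PATTERN.findall(prompt)]

if __name__ == "__main__":
    print(f"60 s of audio: about {synthesis_seconds(60):.1f} s of compute")
    print(f"Speed relative to playback: about {speedup():.1f}x")
    print(extract_tags("[whisper] I have a secret. [clears throat] Anyway."))
```

Under those assumptions, a one-minute clip would take under 12 seconds to generate on an H200. Consumer GPUs will be slower, and the article does not give a figure for them.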
The Bottom Line: Developers can now run a production-grade text-to-speech engine on local hardware and bypass paid APIs entirely.
