Qwen-Voice-Enrollment
Overview
Qwen Voice-Enrollment is a series of voice replication models in the Qwen (Qianwen) speech model family. It can produce a close replica of a voice from as little as 5 seconds of reference audio. Used together with the qwen3-tts-vc-realtime model, it reproduces a speaker's voice with high fidelity and can output speech in 10 languages. The synthesized audio also adapts its tone to the text being read and handles complex text well.
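The 5-second minimum mentioned above can be checked locally before any audio is submitted. The following is a minimal sketch, assuming the reference recording is a PCM WAV file; it uses only Python's standard `wave` module and does not call any Qwen API. The names `MIN_SECONDS`, `wav_duration_seconds`, `is_long_enough`, and `make_silent_wav` are hypothetical helpers introduced for illustration.

```python
import io
import wave

# Assumption: threshold taken from the overview ("5 seconds or more").
MIN_SECONDS = 5.0


def wav_duration_seconds(source) -> float:
    """Return the duration of a PCM WAV file (a path or a binary file object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(source, min_seconds: float = MIN_SECONDS) -> bool:
    """True if the recording meets the minimum length for enrollment."""
    return wav_duration_seconds(source) >= min_seconds


def make_silent_wav(seconds: float, rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for seconds in (3.0, 6.0):
        ok = is_long_enough(make_silent_wav(seconds))
        print(f"{seconds:.1f} s clip accepted: {ok}")
```

Duration is only one property of a usable sample; this check says nothing about noise, sample rate requirements, or accepted file formats, which the page does not specify.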
Input
Audio
Output
Text
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- TTS: $0.01 per voice
Rate Limits
- RPM (Requests Per Minute): 180
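A client can stay under the 180 RPM limit by spacing its requests evenly, which works out to one request every 60 / 180 ≈ 0.33 seconds. The sketch below is one conservative way to do that, not the service's own mechanism: how the limit is actually enforced (fixed window, sliding window, burst allowance) is not stated on this page. `MinIntervalLimiter` is a hypothetical helper; the clock and sleep functions are injectable so the pacing logic can be exercised without real delays.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between requests (assumed policy: even spacing)."""

    def __init__(self, requests_per_minute: int = 180,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute  # seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


if __name__ == "__main__":
    limiter = MinIntervalLimiter(requests_per_minute=180)
    for i in range(3):
        waited = limiter.wait()
        # Placeholder for the real request; no Qwen API call is made here.
        print(f"request {i}: waited {waited:.3f} s")
```

Even spacing gives up burst capacity in exchange for never exceeding the limit under any windowing scheme; a token bucket would be the usual alternative if bursts matter.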