Qwen3-TTS-Flash
Text-to-Speech
Overview
The Qwen3-TTS-Flash model is Tongyi's latest offline speech synthesis model. It offers 51 highly expressive, human-like timbres and synthesizes audio with low latency and high stability. It supports multiple languages and dialects and can produce multilingual output with the same timbre. Trained on massive amounts of data, the model adapts its tone to the text and handles complex text effectively. This model is provided as a snapshot version.
Input
Text
Output
Audio
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- TTS: $0.1 per 10,000 characters
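As a rough illustration of the listed rate, the sketch below estimates cost from a character count. It assumes billing is linear and pro-rated per character, and that each character counts once; the page does not state how characters are counted or rounded, so treat the result as an approximation. The helper name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Listed rate: $0.1 per 10,000 characters.
PRICE_PER_BLOCK = Decimal("0.1")
BLOCK_SIZE = 10_000


def estimate_cost(num_chars: int) -> Decimal:
    """Hypothetical helper: linear cost estimate in USD.

    Assumes pro-rated billing with one unit per character; the actual
    counting and rounding rules are not given on this page.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return PRICE_PER_BLOCK * num_chars / BLOCK_SIZE


if __name__ == "__main__":
    script = "Welcome to the show. Today we talk about speech synthesis."
    print(f"{len(script)} characters: ${estimate_cost(len(script))}")
    print(f"250,000 characters: ${estimate_cost(250_000)}")
```

`Decimal` is used instead of `float` so that small per-request amounts accumulate without binary rounding drift.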
Rate Limits
- RPM (Requests Per Minute): 180
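A limit of 180 requests per minute works out to one request every 60 / 180 ≈ 0.33 seconds on average. The sketch below is one possible client-side way to stay under that figure by spacing calls evenly. The `Throttle` class is a hypothetical helper, not part of any SDK, and how the service actually enforces the limit (fixed window, sliding window, burst allowance) is not stated here.

```python
import time

RPM_LIMIT = 180  # listed rate limit


class Throttle:
    """Hypothetical helper that spaces calls evenly under an RPM limit."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm  # seconds between consecutive requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request may be sent; return seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

Calling `Throttle(RPM_LIMIT).wait()` immediately before each synthesis request keeps a single-threaded client at or below the listed rate; a multi-threaded client would also need a lock around `wait()`.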
API Reference