Qwen3-TTS-VC-Realtime

Overview

Qwen3-TTS-Flash is Tongyi's latest real-time speech synthesis model. It performs high-fidelity real-time speech synthesis with voices replicated by the qwen3-voice-enrollment service, and it supports speech output in 11 languages with the same voice timbre. Trained on massive amounts of data, the model adapts the tone of the synthesized audio to the text and handles complex text well. This model is provided as a snapshot version.

Input

Text

Output

Audio

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • TTS
    $0.13 per 10,000 characters
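
As a rough illustration of the rate above, the following Python sketch estimates the cost of a piece of text. It assumes billing is linear in the number of characters and that one character equals one Unicode code point as counted by len(); the page does not say how characters are counted, so treat the result as an estimate only. The function name estimate_cost is hypothetical.

```python
from decimal import Decimal

# Listed rate: $0.13 per 10,000 characters (from the Pricing section).
PRICE_PER_10K_CHARS = Decimal("0.13")


def estimate_cost(text: str) -> Decimal:
    """Estimate the synthesis cost in USD.

    Assumption: cost scales linearly with len(text), i.e. one Unicode
    code point counts as one billed character. The real counting rule
    may differ.
    """
    return PRICE_PER_10K_CHARS * len(text) / Decimal(10_000)


if __name__ == "__main__":
    sample = "Hello, this is a short sentence to synthesize."
    print(f"{len(sample)} characters, about ${estimate_cost(sample)}")
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift.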

Rate Limits

  • RPM (Requests Per Minute)
    180
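
One way to stay under the 180 RPM limit is to pace requests on the client. The Python sketch below spaces calls at least 60 / 180 seconds apart. It is a minimal sketch, not part of any official SDK: the Pacer class is hypothetical, and the page does not say how the service measures the limit (for example, fixed or sliding window), so even spacing is simply a conservative choice.

```python
import time

RPM_LIMIT = 180  # from the Rate Limits section


class Pacer:
    """Space calls evenly so that at most `rpm` start per minute."""

    def __init__(self, rpm=RPM_LIMIT, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / rpm  # seconds between request starts
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        """Block until the next request may start; return the delay used."""
        now = self._clock()
        start = max(now, self._next_allowed)
        delay = start - now
        if delay > 0:
            self._sleep(delay)
        self._next_allowed = start + self.interval
        return delay


if __name__ == "__main__":
    pacer = Pacer()
    for i in range(3):
        waited = pacer.wait()
        # Replace this print with the actual synthesis request.
        print(f"request {i} started after waiting {waited:.3f}s")
```

The clock and sleep functions are injectable so the pacing logic can be tested without real delays.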

API Reference
