Qwen3-TTS-Instruct-Flash


Text-to-Speech

Overview

Qwen3-TTS-Flash is Tongyi's latest real-time speech synthesis model. The Instruct variant lets you steer the synthesis with natural-language instructions, so that the emotion and expressiveness of the speech suit the context. It currently supports 25 timbres (voices) for instruction-based adjustment in both Chinese and English.

Input

Text

Output

Audio

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

TTS: $0.115 per 10,000 characters
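
As a rough illustration of the rate above, the sketch below estimates the cost of a synthesis request from its text length. The helper name `estimate_tts_cost` is hypothetical, and the code assumes that every Python character (code point) counts as one billed character; the service's actual counting rules, for example for Chinese characters or punctuation, may differ.

```python
from decimal import Decimal

# Price taken from the pricing line above: $0.115 per 10,000 characters.
PRICE_PER_10K_CHARS = Decimal("0.115")


def estimate_tts_cost(text: str) -> Decimal:
    """Estimate the USD cost of synthesizing `text`.

    Assumption: one Python character (code point) is one billed character.
    """
    return PRICE_PER_10K_CHARS * len(text) / Decimal(10000)


if __name__ == "__main__":
    sample = "Dear listeners, hello everyone. Welcome to the evening news."
    print(f"{len(sample)} characters, estimated cost ${estimate_tts_cost(sample)}")
```

Decimal arithmetic is used here so that small per-request amounts do not pick up binary floating-point rounding noise when summed over many requests.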

Rate Limits

RPM (requests per minute): 180
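
One simple way to stay under this limit on the client side is to space requests evenly. The sketch below is a minimal, hypothetical throttle (the class `MinIntervalThrottle` is not part of the DashScope SDK) that enforces at least 60/180 seconds between calls. It assumes a single-threaded client; even spacing is stricter than the limit requires, since the service may tolerate short bursts.

```python
import time


class MinIntervalThrottle:
    """Client-side throttle that spaces calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute=180, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock          # injectable for testing
        self._sleep = sleep          # injectable for testing
        self._next_allowed = None    # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

To use it, create one throttle and call `throttle.wait()` immediately before each synthesis request.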

API Reference


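import os
import dashscope

# Send requests to the international DashScope endpoint.
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# Text to synthesize.
text = "Dear listeners, hello everyone. Welcome to the evening news."

response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-instruct-flash",
    # Read the API key from the DASHSCOPE_API_KEY environment variable.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    # Natural-language description of the desired delivery.
    instructions='The speaking speed is fast and there is a distinct upward inflection, which is suitable for introducing fashionable products.',
    optimize_instructions=True,
    # Return the complete result in one response instead of streaming.
    stream=False
)
print(response)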