CosyVoice-v3-flash - Qwen Cloud

CosyVoice

Text-to-Speech

Overview

Text-to-Speech

Synthesis Capabilities: CosyVoice-v3-Flash is the latest high-performance speech synthesis model in the CosyVoice series from Tongyi Labs. Compared to previous versions, it offers improved naturalness, timbre, prosody, and emotional expressiveness. The model supports real-time streaming text-to-speech synthesis.

Cloning Capabilities: CosyVoice-v3-Flash is also the latest voice cloning model in the CosyVoice series. Compared to previous versions, it improves pronunciation accuracy and timbre similarity, and adds support for additional languages (German, Spanish, French, Italian, Russian, Japanese). It can quickly generate a highly similar, natural-sounding custom voice from just 5 to 20 seconds of reference audio.

Input

Text

Output

Audio

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

TTS
$0.13 per 10,000 characters
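
As a rough illustration of the pricing above, the following sketch estimates the cost of synthesizing a piece of text. It assumes billing is linear in the character count and that each Python string character counts as one billable character; the service's actual counting rules (for example, for CJK characters or markup) may differ. The function name `estimate_tts_cost` is hypothetical and not part of any SDK.

```python
from decimal import Decimal

# Price taken from the pricing listed above: USD 0.13 per 10,000 characters.
PRICE_PER_10K_CHARS = Decimal("0.13")


def estimate_tts_cost(text: str) -> Decimal:
    """Estimate the synthesis cost in USD for `text`.

    Assumption: one Python string character equals one billable character.
    """
    return PRICE_PER_10K_CHARS * len(text) / Decimal(10_000)


if __name__ == "__main__":
    sample = "a" * 25_000
    print(f"Estimated cost: ${estimate_tts_cost(sample)}")
```

Using `Decimal` rather than floats keeps small per-request amounts exact when they are summed over many requests.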

Rate Limits

RPM (Requests Per Minute)
180
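
One way a client might stay under the 180 RPM limit is to space requests at least 60 / 180 ≈ 0.33 seconds apart. The sketch below is a minimal client-side throttle under that assumption; how the service actually enforces the limit (for example, whether short bursts are allowed) is not stated here, so even spacing is a conservative choice. The class name `MinIntervalThrottle` is hypothetical and not part of the DashScope SDK.

```python
import time


class MinIntervalThrottle:
    """Block so that successive calls to wait() are at least min_interval apart."""

    def __init__(self, requests_per_minute=180, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / requests_per_minute
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._next_allowed = None

    def wait(self):
        """Sleep if needed, then reserve the next allowed request time."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self.min_interval


if __name__ == "__main__":
    throttle = MinIntervalThrottle(requests_per_minute=180)
    for i in range(3):
        throttle.wait()  # call this before each synthesis request
        print(f"request {i} sent at {time.monotonic():.2f}")
```

Calling `throttle.wait()` immediately before each synthesis request keeps a single-threaded client at or below the listed rate; a multi-threaded client would additionally need a lock around `wait()`.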

API Reference

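# coding=utf-8
# Minimal synchronous text-to-speech example using the DashScope SDK.

import dashscope
from dashscope.audio.tts_v2 import SpeechSynthesizer

# If the API key is not configured in an environment variable, uncomment the
# line below and replace your-api-key with your own API key.
# dashscope.api_key = "your-api-key"

model = "cosyvoice-v3-flash"
voice = "longanyang"

# Synthesize the whole text in one call; the audio is returned as bytes.
synthesizer = SpeechSynthesizer(model=model, voice=voice)
audio = synthesizer.call("今天天气怎么样？")  # "How is the weather today?"

# Write the returned audio to disk.
with open("output.mp3", "wb") as f:
    f.write(audio)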
