CosyVoice

Text-to-Speech

Overview

Cloning capability: CosyVoice-v3-plus is the latest large voice cloning model in the CosyVoice series from Tongyi Lab. It offers superior sound quality and cloning fidelity, ideal for professional scenarios. With just 5 to 20 seconds of reference audio, it can rapidly generate a highly similar and natural-sounding custom voice.

Synthesis capability: CosyVoice-v3-plus is the latest large speech synthesis model in the CosyVoice series from Tongyi Lab. It features enhanced sound quality and expressiveness, ideal for professional scenarios. The model supports real-time, streaming text-to-speech synthesis.

Input

Text

Output

Audio

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • TTS
    $0.26 per 10,000 characters
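
As a rough illustration of the rate above, the following sketch estimates the cost of synthesizing a given amount of text. The helper name `estimate_tts_cost` is hypothetical, and the sketch assumes that every character counts once toward billing; the actual character-counting rules (for example, how Chinese characters or markup are counted) may differ, so treat the result as an approximation.

```python
# Rate taken from the Pricing section: $0.26 per 10,000 characters.
PRICE_PER_10K_CHARS_USD = 0.26


def estimate_tts_cost(num_chars: int) -> float:
    """Return an approximate cost in USD for synthesizing num_chars characters.

    Assumption: each character is billed once. The service's real counting
    rules may differ, so this is only an estimate.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return num_chars / 10_000 * PRICE_PER_10K_CHARS_USD


if __name__ == "__main__":
    script = "How is the weather today?" * 100
    print(f"{len(script)} characters: about ${estimate_tts_cost(len(script)):.4f}")
```

Under this assumption, a 2,500-character script would come to roughly $0.065.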

Rate Limits

  • RPM (Requests Per Minute)
    180
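
One simple way to stay under this limit on the client side is to space requests evenly. The sketch below is a minimal pacer, not part of the DashScope SDK: the class name `RequestPacer` and its parameters are hypothetical, and it assumes that keeping at least 60 / 180 ≈ 0.33 seconds between requests is enough to satisfy the limit. How the service actually measures the limit is not stated here, so a production client should still handle rate-limit errors.

```python
import time

# Limit taken from the Rate Limits section: 180 requests per minute.
RPM_LIMIT = 180


class RequestPacer:
    """Spaces calls so that no more than `rpm` requests start per minute."""

    def __init__(self, rpm=RPM_LIMIT, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm  # seconds between request starts
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self) -> float:
        """Block until the next request may start; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


if __name__ == "__main__":
    pacer = RequestPacer()
    for i in range(3):
        waited = pacer.wait()
        # A real client would issue its synthesis request here.
        print(f"request {i}: waited {waited:.3f}s")
```

Calling `pacer.wait()` immediately before each synthesis request keeps a single process at or below 180 requests per minute; several processes sharing one API key would need a shared limiter instead.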

API Reference

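# coding=utf-8

import dashscope
from dashscope.audio.tts_v2 import SpeechSynthesizer

# If the API key is not configured in an environment variable, uncomment the
# next line and replace your-api-key with your own API key.
# dashscope.api_key = "your-api-key"

model = "cosyvoice-v3-plus"
voice = "longanyang"

# Synthesize the text in a single blocking call; the returned audio is
# written to disk below as binary data.
synthesizer = SpeechSynthesizer(model=model, voice=voice)
audio = synthesizer.call("今天天气怎么样?")  # "How is the weather today?"

# Save the synthesized audio to a file.
with open("output.mp3", "wb") as f:
    f.write(audio)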