Voice-Enrollment

Text-to-Speech

Overview

A large-model voice cloning service used together with CosyVoice-v3. It uses a large model for feature extraction, so it can replicate a voice without any training step. A very short audio clip is enough to quickly generate a natural-sounding custom voice that closely matches the original.

Input

Audio

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Rate Limits

RPM (Requests Per Minute): 600
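
As a rough illustration of what a limit of 600 requests per minute means for a client, the sketch below converts an RPM limit into a minimum spacing between requests. The function name `min_interval_ms` is hypothetical and not part of the service, and the sketch assumes the limit is applied evenly across the minute, which this page does not state.

```shell
#!/bin/sh
# Hypothetical helper: minimum spacing between requests, in milliseconds,
# for a given requests-per-minute limit. Rounds up so the client never
# sends slightly faster than the limit allows.
min_interval_ms() {
    rpm="$1"
    echo $(( (60000 + rpm - 1) / rpm ))
}

# For the 600 RPM limit listed above:
min_interval_ms 600   # prints 100
```

Under that assumption, a client that waits at least 100 ms between calls stays within the listed limit.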

API Reference

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "voice-enrollment",
    "input": {
        "action": "create_voice",
        "target_model": "cosyvoice-v3-flash",
        "prefix": "myvoice",
        "url": "https://yourAudioFileUrl"
    }
}'
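
To avoid hand-editing the JSON body, the request payload can be assembled from shell variables. This is a minimal sketch: the helper `build_create_voice_payload` is hypothetical, the field names are taken from the request above, and it assumes the values contain no double quotes or backslashes, since no JSON escaping is performed.

```shell
#!/bin/sh
# Hypothetical helper: builds the create_voice request body from three values.
# Assumes the arguments need no JSON escaping (no quotes or backslashes).
build_create_voice_payload() {
    target_model="$1"   # model the cloned voice will be used with
    prefix="$2"         # prefix for the new voice, as in the request above
    audio_url="$3"      # URL of the source audio clip
    printf '{"model": "voice-enrollment", "input": {"action": "create_voice", "target_model": "%s", "prefix": "%s", "url": "%s"}}\n' \
        "$target_model" "$prefix" "$audio_url"
}

payload=$(build_create_voice_payload "cosyvoice-v3-flash" "myvoice" "https://yourAudioFileUrl")
echo "$payload"
```

The resulting string can then be passed to the curl command with `-d "$payload"` in place of the inline JSON.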
