Qwen3-ASR-Flash-2025-09-08

Qwen3-ASR-Flash

Speech Recognition

Overview

Speech Recognition

Qwen3-ASR-Flash is a multilingual speech recognition model built on a large language model. It draws on a strong foundation model, large volumes of text and multimodal data, and tens of millions of hours of audio data. The model automatically determines the spoken language and recognizes speech in 11 languages, and it is designed to transcribe accurately even in complex audio environments. This version is a snapshot from September 8, 2025.

Input

Audio

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

Audio Duration
$0.000035 per second
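
As a rough illustration of the per-second price, the sketch below estimates the cost of a recording from its duration. It assumes billing is strictly linear in audio duration with no rounding, minimum charge, or free quota, none of which this page states; the function name `transcription_cost` is hypothetical.

```python
# Listed price on this page, in USD per second of audio.
PRICE_PER_SECOND_USD = 0.000035


def transcription_cost(duration_seconds: float) -> float:
    """Estimate the cost in USD, assuming purely linear per-second billing."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds * PRICE_PER_SECOND_USD


if __name__ == "__main__":
    # One hour of audio is 3600 seconds.
    print(f"1 hour of audio: ${transcription_cost(3600):.4f}")
```

Under these assumptions, one hour of audio comes to about $0.126.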

Rate Limits

RPM (Requests Per Minute)
100
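
To stay under the 100 RPM limit from the client side, one option is a small sliding-window throttle like the sketch below. This page does not say how the service counts requests (fixed or sliding window), so the sketch is a conservative assumption rather than a mirror of the server's behavior. `RequestThrottle` and `acquire` are hypothetical names, and the class is not thread-safe.

```python
import time
from collections import deque


class RequestThrottle:
    """Allow at most max_requests calls in any window_seconds interval."""

    def __init__(self, max_requests=100, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._sent = deque()     # timestamps of recent requests

    def acquire(self) -> float:
        """Block until a request may be sent; return the seconds waited."""
        waited = 0.0
        while True:
            now = self._clock()
            # Forget requests that have left the window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return waited
            # Wait until the oldest recorded request leaves the window.
            delay = self._window - (now - self._sent[0])
            self._sleep(delay)
            waited += delay
```

Calling `acquire()` immediately before each API request keeps a single process within the limit; several processes sharing one API key would need a shared counter instead.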

API Reference

import os
import dashscope

messages = [
    {
        "role": "system",
        "content": [
            # Configure the context for customized recognition
            {"text": ""},
        ]
    },
    {
        "role": "user",
        "content": [
            {"audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"},
        ]
    }
]
response = dashscope.MultiModalConversation.call(
    # If the environment variable is not set, replace it with your Model Studio API key: api_key = "sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen3-asr-flash-2025-09-08",
    messages=messages,
    result_format="message",
    asr_options={
        # "language": "zh",  # Optional. If you know the language of the audio, set this to improve recognition accuracy
        "enable_lid": True,   # language identification
        "enable_itn": False,  # inverse text normalization
    }
)
print(response)
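
The sample above prints the whole response object. A minimal sketch for pulling out only the transcript follows. It assumes the response can be indexed like a nested dictionary and that the recognized text sits under `output.choices[0].message.content[*].text`. That layout is an assumption not confirmed by this page, so check it against a real response first. The helper name `extract_transcript` is hypothetical.

```python
def extract_transcript(response) -> str:
    """Join the text parts of the first choice (assumed response layout)."""
    # Assumed layout: output -> choices -> [0] -> message -> content -> [{"text": ...}]
    content = response["output"]["choices"][0]["message"]["content"]
    return "".join(
        part.get("text", "") for part in content if isinstance(part, dict)
    )


if __name__ == "__main__":
    # Hand-written stand-in for a response, used only to exercise the helper.
    sample = {
        "output": {
            "choices": [
                {"message": {"role": "assistant", "content": [{"text": "hello world"}]}}
            ]
        }
    }
    print(extract_transcript(sample))
```

If the real response uses a different structure, only the indexing line inside the helper needs to change.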
