Qwen3-ASR-Flash

Speech Recognition

Overview

Qwen3-ASR-Flash is a multilingual speech recognition model built on a large language model. It draws on a powerful foundation model, large volumes of text and multimodal data, and tens of millions of hours of audio data to achieve high-precision speech recognition. It automatically detects the spoken language, recognizes speech in 11 languages, and transcribes accurately even in complex audio environments. This version is a snapshot from September 8, 2025.

Input

Audio

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • Audio Duration
    $0.000035 per second
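
As a rough illustration of the per-second rate above, the following sketch estimates the cost of transcribing audio of a given duration. The function name `estimate_cost` and the constant name are hypothetical; only the $0.000035 per second rate comes from this page, and the sketch assumes billing is strictly proportional to duration with no rounding or minimum charge.

```python
from decimal import Decimal

# Rate taken from the pricing listed on this page (USD per second of audio).
PRICE_PER_SECOND_USD = Decimal("0.000035")


def estimate_cost(duration_seconds):
    """Estimate the transcription cost in USD for a given audio duration.

    Assumes cost scales linearly with duration; actual billing rules
    (rounding, minimum charges) are not stated here and may differ.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Convert through str so float inputs such as 12.5 stay exact.
    return PRICE_PER_SECOND_USD * Decimal(str(duration_seconds))


if __name__ == "__main__":
    for label, seconds in [("1 minute", 60), ("1 hour", 3600)]:
        print(f"{label}: ${estimate_cost(seconds)}")
```

Under these assumptions, one hour of audio (3,600 seconds) comes to about $0.126.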

Rate Limits

  • RPM (Requests Per Minute)
    100
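
One way a client might stay under the 100 RPM limit is a sliding-window throttle, sketched below. The class `SlidingWindowLimiter` and its methods are hypothetical helpers, not part of any official SDK, and the sketch assumes the limit is counted over a rolling 60-second window, which this page does not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: at most max_requests per window_seconds.

    Hypothetical helper; the 100-requests-per-60-seconds default mirrors
    the RPM figure above, and the rolling window is an assumption.
    """

    def __init__(self, max_requests=100, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Return how many seconds to wait before the next request is allowed."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # Window is full: wait until the oldest request leaves it.
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self.clock())

    def acquire(self):
        """Block until a request is allowed, then record it."""
        delay = self.wait_time()
        while delay > 0:
            time.sleep(delay)
            delay = self.wait_time()
        self.record()
```

A caller would invoke `limiter.acquire()` immediately before each transcription request. The server may still reject requests if it counts the window differently, so retry handling remains advisable.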

API Reference
