Qwen3-ASR-Flash
Speech Recognition
Overview
Qwen3-ASR-Flash is a multilingual speech recognition model built on a large language model. It draws on a strong foundation model, large volumes of text and multimodal data, and tens of millions of hours of audio data to achieve high-accuracy recognition. It detects the spoken language automatically and recognizes speech in 11 languages, transcribing accurately even in complex acoustic environments. This version is a snapshot from September 8, 2025.
Input
Audio
Output
Text
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- Audio duration: $0.000035 per second
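To get a feel for what the per-second rate means for real workloads, the arithmetic can be sketched as below. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes billing is strictly linear in audio duration with no rounding, minimum charge, or free tier, none of which this page confirms.

```python
from decimal import Decimal

# Rate taken from the pricing line on this page.
PRICE_PER_SECOND_USD = Decimal("0.000035")


def estimate_cost_usd(duration_seconds):
    """Estimate the cost in USD of transcribing `duration_seconds` of audio.

    Assumption: cost is price-per-second times duration, with no rounding
    or minimum charge applied by the provider.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Convert through str() so float inputs such as 12.5 stay exact.
    return PRICE_PER_SECOND_USD * Decimal(str(duration_seconds))


if __name__ == "__main__":
    for label, seconds in [("1 minute", 60), ("1 hour", 3600), ("100 hours", 360000)]:
        print(f"{label}: ${estimate_cost_usd(seconds)}")
```

Under these assumptions, one hour of audio (3,600 seconds) comes to about $0.126.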
Rate Limits
- RPM (requests per minute): 100
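A client that sends many requests may want to throttle itself to stay under the limit above. The following is a minimal sketch of a sliding-window limiter; the class name `SlidingWindowLimiter` is hypothetical, and it assumes the limit is enforced over a rolling 60-second window, which this page does not specify. It is not thread-safe as written.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span.

    Assumption: the service counts requests over a rolling window.
    `clock` and `sleep` are injectable so the class can be tested
    without real waiting.
    """

    def __init__(self, max_requests=100, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of recent requests, oldest first

    def acquire(self):
        """Block until a request may be sent, then record it."""
        while True:
            now = self._clock()
            # Forget requests that have left the window.
            while self._sent and now - self._sent[0] >= self.window_seconds:
                self._sent.popleft()
            if len(self._sent) < self.max_requests:
                self._sent.append(now)
                return
            # Wait until the oldest recorded request leaves the window.
            self._sleep(self.window_seconds - (now - self._sent[0]))


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()
    for i in range(3):
        limiter.acquire()
        # Placeholder for the actual transcription request.
        print(f"request {i} may be sent now")
```

Calling `acquire()` before each request keeps the client at or below 100 requests per rolling minute; a server-side 429 response should still be handled with a retry, since the server's accounting may differ.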
API Reference
Get API Key