Qwen3-ASR-Flash-Filetrans
Speech Recognition
Overview
The large-file transcription version of Qwen3-ASR-Flash. Qwen3-ASR-Flash is a highly accurate, robust multilingual speech recognition model built on a large language model. Leveraging a powerful foundation model, massive amounts of text and multimodal data, and tens of millions of hours of audio data, it achieves high-precision speech recognition. It automatically identifies the spoken language and accurately recognizes speech in multiple languages, producing precise transcriptions even in complex audio environments. This version is a snapshot from November 17, 2025.
Input
Audio
Output
Text
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- Audio Duration: $0.000035 per second
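
To make the per-second price listed above concrete, here is a minimal sketch that estimates the cost of transcribing a file of a given length. It assumes billing is strictly linear in audio duration, with no minimum charge and no rounding up to whole seconds or minutes; the page does not state those details. The function name `transcription_cost_usd` is hypothetical and not part of any SDK.

```python
from decimal import Decimal

# Price taken from the pricing line above (USD per second of audio).
PRICE_PER_SECOND_USD = Decimal("0.000035")


def transcription_cost_usd(duration_seconds):
    """Estimate the cost in USD for audio of the given duration.

    Assumption: cost = duration * price, with no minimum charge or rounding.
    """
    duration = Decimal(str(duration_seconds))
    if duration < 0:
        raise ValueError("duration_seconds must be non-negative")
    return PRICE_PER_SECOND_USD * duration


if __name__ == "__main__":
    # Estimated cost of a one-hour recording (3600 seconds).
    print(transcription_cost_usd(3600))
```

`Decimal` is used rather than `float` so that the small per-second price does not accumulate binary rounding error over long recordings.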
Rate Limits
- RPM (Requests Per Minute): 100
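
One way to stay under the 100 requests-per-minute limit is to throttle on the client side. The sketch below is a generic sliding-window limiter, not part of any official SDK; the class name `RateLimiter` and its parameters are hypothetical. It assumes the limit is counted over a rolling 60-second window, which the page does not specify.

```python
import time
from collections import deque


class RateLimiter:
    """Client-side sliding-window limiter (illustrative, not an official API)."""

    def __init__(self, max_requests=100, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._sent = deque()     # timestamps of requests inside the window

    def acquire(self):
        """Block until one more request may be sent, then record it."""
        while True:
            now = self._clock()
            # Forget requests that have left the rolling window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest request expires.
            self._sleep(self._window - (now - self._sent[0]))
```

Call `acquire()` immediately before each request. A server-side limit can still reject requests (for example, if several clients share one key), so this complements rather than replaces error handling and retries.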
API Reference
Get API Key