Qwen3.5-LiveTranslate-Flash-Realtime
Real-time Speech Translation
Overview
This is the real-time version of Qwen3.5-LiveTranslate-Flash, a high-precision, highly responsive, and robust multilingual model for simultaneous interpretation of audio and video. Built on Qwen3.5-Omni's infrastructure, large-scale multimodal data, cross-language and cross-modal alignment, and visual enhancement technologies, Qwen3.5-LiveTranslate-Flash offers both offline and real-time audio and video translation. It understands 60 languages and speaks 29.
Input
Audio, Image
Output
Audio, Text
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- Input (Audio): $7.5 per 1M tokens
- Input (Image): $0.55 per 1M tokens
- Output (Text): $20 per 1M tokens
- Output (Audio): $30 per 1M tokens
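As a rough illustration of how the per-1M-token prices above combine, the following sketch estimates the cost of a session from token counts. The category names and the `estimate_cost` helper are hypothetical and exist only for this example; how the service actually meters audio, image, and text tokens is not stated here, so treat the result as an approximation.

```python
# Prices in USD per 1M tokens, copied from the pricing list above.
# The dictionary keys are illustrative names, not official billing identifiers.
PRICES_PER_MILLION = {
    "input_audio": 7.5,
    "input_image": 0.55,
    "output_text": 20.0,
    "output_audio": 30.0,
}


def estimate_cost(token_counts: dict[str, int]) -> float:
    """Return the estimated USD cost for the given token counts per category."""
    total = 0.0
    for category, tokens in token_counts.items():
        if category not in PRICES_PER_MILLION:
            raise KeyError(f"unknown pricing category: {category}")
        if tokens < 0:
            raise ValueError("token counts must be non-negative")
        total += tokens / 1_000_000 * PRICES_PER_MILLION[category]
    return total


if __name__ == "__main__":
    # Example: 2M audio input tokens and 1M audio output tokens.
    usage = {"input_audio": 2_000_000, "output_audio": 1_000_000}
    print(f"Estimated cost: ${estimate_cost(usage):.2f}")
```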
Context
- Context: 53.24K
- Max Input: 49.15K
- Max Output: 4.09K
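The context figures above are consistent with each other: 49.15K of input plus 4.09K of output gives the 53.24K total. A minimal sketch of a client-side budget check follows. It uses the rounded values exactly as displayed, which is an assumption; the exact limits enforced by the service may differ slightly, and `fits_in_context` is a hypothetical helper, not part of any SDK.

```python
# Rounded limits as displayed above (assumption: "K" means thousands of tokens).
MAX_INPUT_TOKENS = 49_150
MAX_OUTPUT_TOKENS = 4_090
CONTEXT_TOKENS = 53_240


def fits_in_context(input_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against the input, output, and total context limits."""
    return (
        0 <= input_tokens <= MAX_INPUT_TOKENS
        and 0 <= requested_output_tokens <= MAX_OUTPUT_TOKENS
        and input_tokens + requested_output_tokens <= CONTEXT_TOKENS
    )


if __name__ == "__main__":
    print(fits_in_context(40_000, 4_000))  # within all three limits
    print(fits_in_context(50_000, 100))    # input alone exceeds Max Input
```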
Rate Limits
- RPM (Requests Per Minute): 10
- TPM (Tokens Per Minute): 100K
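One way a client might stay under these limits is a sliding-window limiter that tracks both requests and tokens over the last 60 seconds. The sketch below is an assumption-laden illustration: the `SlidingWindowLimiter` class is hypothetical, and how the service itself defines a "request" or windows its counters is not documented here.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side limiter for a requests-per-minute and tokens-per-minute cap."""

    def __init__(self, max_requests=10, max_tokens=100_000,
                 window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self._events = deque()      # (timestamp, tokens) for admitted requests

    def _evict_expired(self, now):
        # Drop events that have fallen out of the sliding window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            self._events.popleft()

    def try_acquire(self, tokens):
        """Record the request and return True if it fits both limits, else False."""
        now = self.clock()
        self._evict_expired(now)
        used_tokens = sum(t for _, t in self._events)
        if len(self._events) >= self.max_requests:
            return False
        if used_tokens + tokens > self.max_tokens:
            return False
        self._events.append((now, tokens))
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()
    for i in range(12):
        print(i, limiter.try_acquire(5_000))
```

A caller that receives `False` would typically wait and retry; the limiter only decides admission and does not sleep on its own.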
API Reference
Get API Key