Qwen3.5-LiveTranslate-Flash-Realtime

Real-time Speech Translation

Overview

Qwen3.5-LiveTranslate-Flash-Realtime is the real-time version of Qwen3.5-LiveTranslate-Flash, a high-precision, highly responsive, and robust multilingual model for simultaneous audio and video interpretation. Building on Qwen3.5-Omni's infrastructure, large-scale multimodal data, cross-language and cross-modal alignment, and visual enhancement technologies, Qwen3.5-LiveTranslate-Flash offers both offline and real-time audio and video translation. It understands 60 languages and speaks 29.

Input

Audio, Image

Output

Audio, Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • Input: Audio
    $7.5 per 1M tokens
  • Input: Image
    $0.55 per 1M tokens
  • Output: Text
    $20 per 1M tokens
  • Output: Audio
    $30 per 1M tokens
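
To make the per-token prices concrete, the following is a minimal sketch of a cost estimate in Python. The rates are copied from the list above; the category keys (`input_audio`, `output_text`, and so on) and the `estimate_cost` helper are hypothetical names chosen for illustration, not part of any official SDK, and how the service actually counts audio or image tokens is not specified here.

```python
# Prices in USD per 1M tokens, copied from the pricing list.
# The dictionary keys are illustrative names, not official API fields.
PRICE_PER_MILLION = {
    "input_audio": 7.5,
    "input_image": 0.55,
    "output_text": 20.0,
    "output_audio": 30.0,
}


def estimate_cost(usage):
    """Return an estimated cost in USD for a mapping of category -> token count."""
    total = 0.0
    for category, tokens in usage.items():
        if category not in PRICE_PER_MILLION:
            raise KeyError(f"unknown pricing category: {category}")
        if tokens < 0:
            raise ValueError("token counts must be non-negative")
        total += tokens / 1_000_000 * PRICE_PER_MILLION[category]
    return total


if __name__ == "__main__":
    # Hypothetical session: 200K audio tokens in, 50K audio tokens out.
    session = {"input_audio": 200_000, "output_audio": 50_000}
    print(f"Estimated cost: ${estimate_cost(session):.2f}")
```

Under these assumptions, a session with 200K input audio tokens and 50K output audio tokens comes to roughly $3.00, split evenly between input and output.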

Context

  • Context
    53.24K
  • Max Input
    49.15K
  • Max Output
    4.09K
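
The limits above can be checked on the client before a request is sent. The sketch below assumes the figures are token counts and uses the rounded values shown on this page as conservative bounds (the exact limits may be slightly higher); `fits_context` is a hypothetical helper, not an official API.

```python
# Rounded limits as displayed on the page, assumed to be token counts.
MAX_INPUT_TOKENS = 49_150
MAX_OUTPUT_TOKENS = 4_090
CONTEXT_TOKENS = 53_240


def fits_context(input_tokens, requested_output_tokens):
    """Return True if a request stays within the input, output, and total limits."""
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
        and input_tokens + requested_output_tokens <= CONTEXT_TOKENS
    )


if __name__ == "__main__":
    print(fits_context(40_000, 4_000))  # within all three limits
    print(fits_context(50_000, 1_000))  # input exceeds the maximum input size
```

Note that the maximum input and maximum output add up to the context size (49.15K + 4.09K = 53.24K), so the total check only matters if the per-direction limits are changed.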

Rate Limits

  • RPM (Requests Per Minute)
    10
  • TPM (Tokens Per Minute)
    100K
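
A client can stay under these limits by tracking its own recent requests. The following is a minimal sketch of a sliding-window limiter, assuming both limits apply over a rolling 60-second window; how the service actually measures the window is not stated here, and `SlidingWindowLimiter` is a hypothetical class written for illustration.

```python
from collections import deque

RPM_LIMIT = 10          # requests per minute, from the list above
TPM_LIMIT = 100_000     # tokens per minute, from the list above
WINDOW_SECONDS = 60.0   # assumption: limits apply over a rolling 60 s window


class SlidingWindowLimiter:
    """Client-side tracker for request and token budgets in a rolling window."""

    def __init__(self, rpm=RPM_LIMIT, tpm=TPM_LIMIT, window=WINDOW_SECONDS):
        self.rpm = rpm
        self.tpm = tpm
        self.window = window
        self._events = deque()  # (timestamp, tokens) for each accepted request

    def _evict(self, now):
        # Drop requests that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window:
            self._events.popleft()

    def try_acquire(self, tokens, now):
        """Record the request and return True if it fits both budgets at time `now`."""
        self._evict(now)
        used_tokens = sum(t for _, t in self._events)
        if len(self._events) >= self.rpm or used_tokens + tokens > self.tpm:
            return False
        self._events.append((now, tokens))
        return True
```

In real use, `now` would typically come from `time.monotonic()`, and a caller that receives `False` would wait and retry. Passing the timestamp in explicitly keeps the sketch easy to test.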

API Reference
