Qwen3.5-Omni-Flash-2026-03-15

Qwen3.5-Omni-Flash

Multimodal

Overview

Qwen 3.5-Omni is the latest generation of Qwen's multimodal large model. It supports understanding of and interaction with text, image, audio, and audio-visual input. An evolution of Qwen 3-Omni, it can handle more than 10 hours of audio for understanding and more than 400 seconds of 720p (1 FPS) audio-visual content for understanding and dialogue. It also broadens language coverage, accepting audio input in 60+ languages and producing speech output in 30+ languages. The model offers structured audio-visual understanding and is used in text creation, voice assistants, multimedia analysis, and similar scenarios, where it provides natural, fluent multimodal interaction. This version is a snapshot from March 15, 2026.

Input

Text, Image, Video, Audio

Output

Text, Audio

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

Input: Audio
$3 per 1M tokens
Input: Text/Image/Video
$0.4 per 1M tokens
Output: Text & Audio (output text is not charged)
$11.9 per 1M tokens
Output: Text
$2.2 per 1M tokens
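
The rates above can be turned into a rough per-request estimate. The sketch below is illustrative only: the function name and its parameters are hypothetical, and it assumes that a request producing any audio is billed at the Text & Audio output rate for its audio tokens (with output text free), while a text-only request is billed at the Text output rate.

```python
# Prices in USD per 1M tokens, copied from the pricing list on this page.
PRICE_INPUT_AUDIO = 3.0
PRICE_INPUT_TEXT_IMAGE_VIDEO = 0.4
PRICE_OUTPUT_AUDIO = 11.9   # Text & Audio output; output text is not charged
PRICE_OUTPUT_TEXT_ONLY = 2.2


def estimate_cost_usd(
    input_audio_tokens: int = 0,
    input_other_tokens: int = 0,
    output_text_tokens: int = 0,
    output_audio_tokens: int = 0,
) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    input_cost = (
        input_audio_tokens * PRICE_INPUT_AUDIO
        + input_other_tokens * PRICE_INPUT_TEXT_IMAGE_VIDEO
    )
    if output_audio_tokens > 0:
        # Assumption: with audio output, only the audio tokens are billed.
        output_cost = output_audio_tokens * PRICE_OUTPUT_AUDIO
    else:
        output_cost = output_text_tokens * PRICE_OUTPUT_TEXT_ONLY
    return (input_cost + output_cost) / 1_000_000
```

For example, 1M text input tokens and 1M text-only output tokens come to about $2.60 under these assumptions.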

Context

262.14K

Max Input

196.60K

Max Output

65.53K
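
A request can be checked against these limits before it is sent. The following is a minimal sketch, not an official check: the helper name is hypothetical, and the constants read the displayed figures as decimal thousands of tokens, so the exact limits enforced by the service may differ slightly.

```python
# Limits as displayed on this page, read as decimal thousands of tokens.
# This reading is an assumption; the enforced values may differ slightly.
CONTEXT_TOKENS = 262_140
MAX_INPUT_TOKENS = 196_600
MAX_OUTPUT_TOKENS = 65_530


def fits_limits(input_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within all three displayed limits."""
    return (
        0 <= input_tokens <= MAX_INPUT_TOKENS
        and 0 <= max_output_tokens <= MAX_OUTPUT_TOKENS
        and input_tokens + max_output_tokens <= CONTEXT_TOKENS
    )
```

How input tokens are counted for audio, image, and video is not stated here, so `input_tokens` must come from whatever count the caller trusts, such as the usage reported by an earlier response.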

Rate Limits

RPM (Requests Per Minute)
60
TPM (Tokens Per Minute)
100K
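
To stay under these limits, a client can throttle itself before calling the API. The sketch below is one possible approach: a sliding 60-second window kept on the client side. The class name is hypothetical, "100K" is read as 100,000 tokens, and the page does not say how the service measures its own window, so this is only an approximation.

```python
import time
from collections import deque


class MinuteLimiter:
    """Client-side sliding-window limiter for requests and tokens per minute."""

    def __init__(self, rpm=60, tpm=100_000, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock          # injectable so the limiter can be tested
        self.events = deque()       # (timestamp, tokens) for recent requests

    def _prune(self, now):
        # Drop requests that are at least 60 seconds old.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def try_acquire(self, tokens):
        """Record the request and return True if it fits in both budgets."""
        now = self.clock()
        self._prune(now)
        used_tokens = sum(t for _, t in self.events)
        if len(self.events) + 1 > self.rpm or used_tokens + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        return True
```

A caller would invoke `try_acquire` with an estimated token count before each request and wait or retry later when it returns `False`.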

Built-in Tools

search_strategy: agent

Completions API

API Reference

import os
from openai import OpenAI

client = OpenAI(
    # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see: https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="qwen3.5-omni-flash-2026-03-15",
    messages=[{"role": "user", "content": "Who are you"}],
    # Output modalities. Supported values: ["text", "audio"] or ["text"]
    modalities=["text", "audio"],
    audio={"voice": "Ethan", "format": "wav"},
    # stream must be set to True; otherwise the request returns an error
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in completion:
    if chunk.choices:
        print(chunk.choices[0].delta)
    else:
        print(chunk.usage)
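
The loop above only prints each delta. To keep the audio, the streamed fragments have to be collected and decoded. The sketch below is a rough illustration: the helper name is hypothetical, and it assumes the fragments are base64 text that can be concatenated before decoding and that the decoded bytes are 16-bit mono PCM at 24 kHz. None of these details is confirmed on this page.

```python
import base64
import wave
from typing import Iterable


def save_audio_fragments(
    fragments: Iterable[str], path: str, sample_rate: int = 24000
) -> int:
    """Decode joined base64 fragments and write them as a WAV file.

    Assumes 16-bit mono PCM; returns the number of PCM bytes written.
    """
    pcm = base64.b64decode("".join(fragments))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)        # mono (assumption)
        wav_file.setsampwidth(2)        # 16-bit samples (assumption)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return len(pcm)
```

Inside the streaming loop, each fragment would be appended to a list whenever a delta carries audio data, and the list passed to `save_audio_fragments` once the stream ends. The exact field that holds the fragment (for example a `data` key under the delta's audio payload) is an assumption to verify against the API reference.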
