Qwen3.5-Open-Source

Reasoning, Visual Understanding, Text Generation

Overview

The 397B-A17B model in the Qwen3.5 series is a native vision-language model built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts design, which improves inference efficiency. It delivers state-of-the-art performance, comparable to that of leading models, across a wide range of tasks, including language understanding, logical reasoning, code generation, agent-based tasks, image understanding, video understanding, and graphical user interface (GUI) interaction. With its robust code-generation and agent capabilities, the model generalizes well across diverse agent scenarios.

Input

Text, Image, Video

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • Input: $0.6 per 1M tokens
  • Output: $3.6 per 1M tokens
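
As a rough illustration of how the per-token prices listed here translate into a per-request cost, the following Python sketch does the arithmetic. The helper name `estimate_cost_usd` and the token counts are hypothetical, and the sketch assumes the listed prices apply uniformly; it ignores any cache or batch discounts, which this page does not quantify.

```python
from decimal import Decimal

# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.6")
OUTPUT_PRICE_PER_M = Decimal("3.6")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not an official API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Decimal avoids binary floating-point rounding in the price arithmetic.
    input_cost = INPUT_PRICE_PER_M * input_tokens / TOKENS_PER_M
    output_cost = OUTPUT_PRICE_PER_M * output_tokens / TOKENS_PER_M
    return input_cost + output_cost
```

For example, a request with 10,000 input tokens and 2,000 output tokens would come to 0.006 + 0.0072 = 0.0132 USD under these assumptions.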

Context

Standard

  • Context: 262.14K tokens
  • Max Input: 260.09K tokens
  • Max Output: 65.53K tokens

Thinking

  • Context: 262.14K tokens
  • Max Input: 258.04K tokens
  • Max Output: 65.53K tokens
  • Max Reasoning: 81.92K tokens
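
A request can be checked against these limits before it is sent. The sketch below is one possible way to do that in Python; the names `ContextLimits`, `LIMITS`, and `fits` are hypothetical. It reads the rounded figures on this page as thousands of tokens (the exact limits may differ slightly), and it assumes that input and output together must fit within the context window. How reasoning tokens count against the window in Thinking mode is not stated here, so the sketch does not model that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextLimits:
    context: int
    max_input: int
    max_output: int
    max_reasoning: int = 0  # 0 means the mode lists no reasoning budget


# Rounded figures from this page, read as thousands of tokens.
# The exact limits may differ slightly from these rounded values.
LIMITS = {
    "standard": ContextLimits(262_140, 260_090, 65_530),
    "thinking": ContextLimits(262_140, 258_040, 65_530, 81_920),
}


def fits(mode: str, input_tokens: int, output_tokens: int) -> bool:
    """Return True if the request respects the listed limits for the mode.

    Assumption: input plus output must also fit in the context window.
    """
    limits = LIMITS[mode]
    return (
        input_tokens <= limits.max_input
        and output_tokens <= limits.max_output
        and input_tokens + output_tokens <= limits.context
    )
```

Under these assumptions, a 260,000-token prompt with a 2,000-token completion fits in Standard mode, but the same prompt exceeds the smaller Max Input of Thinking mode.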

Rate Limits

  • RPM (Requests Per Minute): 600
  • TPM (Tokens Per Minute): 1M
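
A client can stay under both limits by throttling itself before sending requests. The following Python sketch shows one way to do this with a 60-second sliding window. The class `SlidingWindowLimiter` and its methods are hypothetical, and the sliding-window model is an assumption: this page does not say how the service measures a minute.

```python
import time
from collections import deque

RPM_LIMIT = 600          # requests per minute, as listed on this page
TPM_LIMIT = 1_000_000    # tokens per minute, as listed on this page
WINDOW_SECONDS = 60.0    # assumption: limits apply over a sliding 60 s window


class SlidingWindowLimiter:
    """Client-side throttle (hypothetical helper; the service enforces its own limits)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._events = deque()  # (timestamp, tokens) for requests in the window
        self._tokens_in_window = 0

    def _evict_expired(self, now):
        # Drop requests that are at least one full window old.
        while self._events and now - self._events[0][0] >= WINDOW_SECONDS:
            _, tokens = self._events.popleft()
            self._tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Record the request and return True if it fits both limits, else False."""
        now = self._clock()
        self._evict_expired(now)
        if len(self._events) >= RPM_LIMIT:
            return False
        if self._tokens_in_window + tokens > TPM_LIMIT:
            return False
        self._events.append((now, tokens))
        self._tokens_in_window += tokens
        return True
```

A caller would check `try_acquire(estimated_tokens)` before each request and wait or queue the request when it returns False. Note that at the full 600 requests per minute, the 1M TPM limit leaves an average of roughly 1,666 tokens per request, so the token limit is likely to bind first for long prompts.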

Built-in Tools

web_search (Responses API)
web_extractor (Responses API)
code_interpreter (Responses API)
t2i_search (Responses API)
i2i_search (Responses API)
