Qwen3.5-Open-Source
Reasoning, Visual Understanding, Text Generation
Overview
Qwen3.5 122B-A10B is a native vision-language model built on a hybrid architecture that combines linear attention with a sparse mixture-of-experts (MoE) design for higher inference efficiency. In overall performance it is second only to Qwen3.5-397B-A17B. Its text capabilities significantly outperform those of Qwen3-235B-2507, and its visual capabilities surpass those of Qwen3-VL-235B.
Input
Text, Image, Video
Output
Text
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- Input: $0.4 per 1M tokens
- Output: $3.2 per 1M tokens
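As a rough illustration of the per-token pricing listed here, the sketch below estimates the cost of a single request. The function name `estimate_cost` is hypothetical, and the sketch assumes a flat rate with no tiering, caching, or batch discounts. It also does not say how reasoning tokens in Thinking mode are billed, since the page does not state that.

```python
# Prices copied from the pricing list (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.4
OUTPUT_PRICE_PER_M = 3.2


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the flat listed rates.

    Assumption: no cache, batch, or tier adjustments are applied.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 50,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(50_000, 2_000):.4f}")
```

Because output tokens are priced at eight times the input rate, long generations dominate the bill even when the prompt is large.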
Context
Standard
- Context: 262.14K
- Max Input: 260.09K
- Max Output: 65.53K
Thinking
- Context: 262.14K
- Max Input: 258.04K
- Max Output: 65.53K
- Max Reasoning: 81.92K
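The limits for the two modes can be checked before sending a request. The following is a minimal sketch, not an official client: the `Limits` class and `fits` helper are hypothetical names, the token counts are the rounded figures shown on this page (exact limits may differ slightly), and it assumes that input plus output must fit inside the context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    context: int
    max_input: int
    max_output: int


# Rounded values as displayed on the page (e.g. 262.14K -> 262_140).
LIMITS = {
    "standard": Limits(context=262_140, max_input=260_090, max_output=65_530),
    "thinking": Limits(context=262_140, max_input=258_040, max_output=65_530),
}


def fits(mode: str, input_tokens: int, output_tokens: int) -> bool:
    """Return True if the request stays within the listed limits.

    Assumption: input and output tokens share the context window.
    """
    lim = LIMITS[mode]
    return (
        input_tokens <= lim.max_input
        and output_tokens <= lim.max_output
        and input_tokens + output_tokens <= lim.context
    )


if __name__ == "__main__":
    print(fits("standard", 260_000, 2_000))  # within Standard limits
    print(fits("thinking", 260_000, 2_000))  # exceeds Thinking max input
```

Note that Thinking mode accepts slightly less input than Standard mode, so a prompt sized for one mode may not fit the other.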
Rate Limits
- RPM (Requests Per Minute): 600
- TPM (Tokens Per Minute): 1M
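To stay under the request limit on the client side, one option is a sliding-window throttle. The sketch below is an illustration only: `RequestThrottle` is a hypothetical helper, it tracks requests per minute but not tokens per minute, and how the service itself measures its window is not stated on this page.

```python
import time
from collections import deque


class RequestThrottle:
    """Sliding-window limiter: at most max_requests per window_seconds."""

    def __init__(self, max_requests=600, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.stamps = deque()  # timestamps of accepted requests

    def try_acquire(self) -> bool:
        """Record a request and return True if it is allowed right now."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.max_requests:
            self.stamps.append(now)
            return True
        return False


if __name__ == "__main__":
    throttle = RequestThrottle()  # defaults match the listed 600 RPM
    if throttle.try_acquire():
        print("ok to send request")
    else:
        print("wait before sending")
```

A caller that receives `False` would typically sleep briefly and retry rather than send the request and risk a rate-limit error.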
Built-in Tools
- web_search (Responses API)
- web_extractor (Responses API)
- code_interpreter (Responses API)
- t2i_search (Responses API)
- i2i_search (Responses API)
API Reference