Qwen3.5-Flash
Visual Understanding, Text Generation, Reasoning
Overview
The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts design to improve inference efficiency. Compared with the Qwen3 series, they perform markedly better on both pure-text and multimodal tasks while keeping response times fast. This version is a snapshot as of February 23, 2026.
Input
Text, Image, Video
Output
Text
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- Input: $0.1 per 1M tokens
- Output: $0.4 per 1M tokens
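The per-token rates above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming the listed prices apply uniformly to every token; it ignores any cache or batch discounts, which this page does not quantify. The function name `estimate_cost` is hypothetical.

```python
# Listed prices in USD per 1M tokens (taken from the Pricing section).
INPUT_PRICE_PER_M = 0.1
OUTPUT_PRICE_PER_M = 0.4


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated cost in USD for one request.

    Assumption: flat per-token pricing with no cache or batch discount.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 200K-token prompt with a 4K-token reply.
    print(f"${estimate_cost(200_000, 4_000):.4f}")
```

Because output tokens cost four times as much as input tokens, long generations can dominate the bill even when the prompt is large.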
Context
- Context: 1M tokens
- Max Input: 991.80K tokens
- Max Output: 65.53K tokens
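A request can be checked against these limits before it is sent. The sketch below uses the rounded figures displayed on this page; the exact token limits, and the assumption that input plus requested output must fit within the context window, are not confirmed here. The name `check_request` is hypothetical.

```python
# Limits as displayed on this page. These are rounded values, so the
# exact integers used here are an assumption.
CONTEXT_WINDOW = 1_000_000
MAX_INPUT_TOKENS = 991_800
MAX_OUTPUT_TOKENS = 65_530


def check_request(input_tokens: int, max_output_tokens: int) -> list[str]:
    """Return a list of limit violations; an empty list means the request fits."""
    problems = []
    if input_tokens > MAX_INPUT_TOKENS:
        problems.append("input exceeds Max Input")
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append("requested output exceeds Max Output")
    # Assumption: input and output share the same context window.
    if input_tokens + max_output_tokens > CONTEXT_WINDOW:
        problems.append("input plus output exceeds Context")
    return problems


if __name__ == "__main__":
    print(check_request(950_000, 60_000))
```

Token counts depend on the model's tokenizer, so the inputs should come from an actual tokenizer count rather than a character-based guess.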
Rate Limits
- RPM (Requests Per Minute): 60
- TPM (Tokens Per Minute): 1M
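Clients can stay under both limits by tracking their own usage over a sliding one-minute window. The following is a minimal client-side sketch, assuming the server counts requests and tokens over a rolling 60-second window; the actual enforcement scheme is not described on this page, and the class name `RateBudget` is hypothetical.

```python
import time
from collections import deque


class RateBudget:
    """Client-side sliding-window tracker for RPM and TPM limits."""

    def __init__(self, rpm=60, tpm=1_000_000, window=60.0, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.window = window
        self.clock = clock          # injectable so the logic can be tested
        self._events = deque()      # (timestamp, tokens) per admitted request
        self._tokens = 0            # tokens used inside the current window

    def _evict(self, now):
        # Drop requests that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._tokens -= tokens

    def try_acquire(self, tokens):
        """Record the request and return True if it fits both limits."""
        now = self.clock()
        self._evict(now)
        if len(self._events) >= self.rpm or self._tokens + tokens > self.tpm:
            return False
        self._events.append((now, tokens))
        self._tokens += tokens
        return True


if __name__ == "__main__":
    budget = RateBudget()
    print(budget.try_acquire(12_000))
```

When `try_acquire` returns False, the caller can wait briefly and retry instead of sending a request that is likely to be rejected.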
Built-in Tools
- web_search (Responses API)
- web_extractor (Responses API)
- code_interpreter (Responses API)
- t2i_search (Responses API)
- i2i_search (Responses API)
API Reference