Qwen3-Open-Source
Visual Understanding
Overview
The "Thinking" edition of Qwen3-VL 8B Dense is a multimodal understanding and reasoning model with a reduced memory footprint. It supports ultra-long contexts (e.g., long videos and documents) and 2D/3D visual localization, and it improves image and video comprehension, spatial perception, and object recognition.
Input
Text, Image, Video
Output
Text
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- Input (Thinking): $0.18 per 1M tokens
- Output (Thinking): $2.1 per 1M tokens
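To make the per-token pricing concrete, here is a minimal cost-estimation sketch. The function name `estimate_cost` is hypothetical, and the sketch assumes that reasoning ("thinking") tokens are billed as output tokens, which this page does not state.

```python
# Prices copied from the Pricing list (USD per 1M tokens, Thinking mode).
INPUT_PRICE_PER_M = 0.18
OUTPUT_PRICE_PER_M = 2.1


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: all generated tokens, including any reasoning tokens,
    are charged at the output rate.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A request with 10,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost(10_000, 2_000):.4f}")
```

Because the output rate is more than ten times the input rate, long reasoning traces are likely to dominate the bill for most requests.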
Context
- Context: 131.07K tokens
- Max Input: 126.97K tokens
- Max Output: 32.76K tokens
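A request can be checked against these limits before it is sent. The sketch below uses the rounded figures exactly as displayed on this page, so the true limits may differ slightly. It also assumes that input and requested output together must fit in the context window, which the page does not confirm. The name `fits_context` is hypothetical.

```python
# Rounded values as shown on the page; the exact limits are not given here.
CONTEXT_TOKENS = 131_070
MAX_INPUT_TOKENS = 126_970
MAX_OUTPUT_TOKENS = 32_760


def fits_context(input_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within all three limits.

    Assumption: input tokens plus requested output tokens share
    the same context window.
    """
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and max_output_tokens <= MAX_OUTPUT_TOKENS
        and input_tokens + max_output_tokens <= CONTEXT_TOKENS
    )


if __name__ == "__main__":
    print(fits_context(100_000, 30_000))   # within every limit
    print(fits_context(126_970, 32_760))   # each part fits, the sum does not
```

Under that assumption, the maximum input and maximum output cannot both be used in a single request, since their sum exceeds the context size.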
Rate Limits
- RPM (Requests Per Minute): 60
- TPM (Tokens Per Minute): 100K
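One way to stay under both limits is a client-side sliding-window check. The sketch below is illustrative only: the `RateLimiter` class is hypothetical, and it assumes the service counts requests and tokens over a rolling 60-second window, which this page does not specify.

```python
import time
from collections import deque


class RateLimiter:
    """Client-side guard for the RPM and TPM limits listed above."""

    def __init__(self, rpm=60, tpm=100_000, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock          # injectable so the limiter can be tested
        self.events = deque()       # (timestamp, tokens) for recent requests

    def _prune(self, now):
        # Drop requests that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def try_acquire(self, tokens):
        """Record the request and return True if it fits both limits."""
        now = self.clock()
        self._prune(now)
        used = sum(t for _, t in self.events)
        if len(self.events) >= self.rpm or used + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        return True


if __name__ == "__main__":
    limiter = RateLimiter()
    if limiter.try_acquire(tokens=5_000):
        print("OK to send request")
    else:
        print("Wait before sending")
```

A caller would typically sleep and retry when `try_acquire` returns False. The server's own accounting remains authoritative, so rate-limit error responses should still be handled.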
API Reference