Qwen3-Open-Source

Copied!
Try AIAdd to Compare
Visual Understanding

Overview

Visual Understanding

The "Thinking" edition of Qwen3-VL 8B Dense has a reduced memory footprint, enabling multimodal understanding and reasoning. It supports ultra-long contexts (e.g., long videos and documents), 2D/3D visual localization, and enhances image/video comprehension, spatial perception, and object recognition.

Input

TextImageVideo

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • Input(Thinking)
    $0.18Per 1M tokens
  • Output(Thinking)
    $2.1Per 1M tokens

Context

Context
131.07K
Max Input
126.97K
Max Output
32.76K

Rate Limits

  • RPMRequests Per Minute
    60
  • TPMTokens Per Minute
    100K

API Reference

Get API Key
Copied!
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263