Qwen-VL-Plus

Copied!
Try AIAdd to Compare
Visual Understanding

Overview

Visual Understanding

Qwen's Enhanced Large Visual Language Model. Significantly upgraded for detailed recognition capabilities and text recognition abilities, supporting ultra-high pixel resolutions up to millions of pixels and extreme aspect ratios for image input. It delivers significant performance across a broad range of visual tasks.

Input

TextImageVideo

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • Input
    $0.21Per 1M tokens
  • Output
    $0.63Per 1M tokens
  • Input(Implicit Cache)
    $0.042Per 1M tokens
  • Input(Batch File)
    $0.105Per 1M tokens
  • Output(Batch File)
    $0.315Per 1M tokens

Context

Context
131.07K
Max Input
129.02K
Max Output
8.19K

Rate Limits

  • RPMRequests Per Minute
    1.20K
  • TPMTokens Per Minute
    1M

API Reference

Get API Key
Copied!
1234567891011121314151617