Qwen-VL-OCR

Copied!
Try AIAdd to Compare
Visual Understanding

Overview

Visual Understanding

Qwen-VL-OCR is a large OCR recognition model built upon Qwen-VL. It unifies a wide range of image-text recognition, parsing, and processing tasks within a single model, delivering robust visual-text comprehension.

Input

Image

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • Input
    $0.07Per 1M tokens
  • Output
    $0.16Per 1M tokens

Context

Context
38.19K
Max Input
30K
Max Output
8.19K

Rate Limits

  • RPMRequests Per Minute
    600
  • TPMTokens Per Minute
    6M

API Reference

Get API Key
Copied!
1234567891011121314151617