Qwen-VL-OCR-2025-11-20

Qwen-VL-OCR

Visual Understanding

Overview

This model is the November 20, 2025 snapshot of Qwen-VL-OCR. It is built on the latest Qwen3-VL architecture and has been comprehensively upgraded: document parsing and text localization are significantly improved, and end-to-end latency and hallucinations are substantially reduced.

Input

Image

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

Input
$0.07 per 1M tokens
Output
$0.16 per 1M tokens
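
As a rough illustration of how these rates translate into a per-request cost, the sketch below multiplies token counts by the listed prices. The helper name `estimate_cost_usd` is hypothetical, and the calculation assumes billing is strictly linear in tokens with no minimums, discounts, or cache pricing.

```python
from decimal import Decimal

# Prices copied from the Pricing section (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("0.07")
OUTPUT_PRICE_PER_M = Decimal("0.16")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one request.

    Assumes purely linear per-token billing at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: an image that consumes 2,000 input tokens and yields 500 output tokens.
    print(estimate_cost_usd(2_000, 500))
```

Decimal arithmetic is used here only to avoid floating-point rounding noise in very small amounts; a float-based version would work equally well for rough budgeting.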

Context

38.19K

Max Input

30K

Max Output

8.19K

Rate Limits

RPM (Requests Per Minute)
1.20K
TPM (Tokens Per Minute)
6M
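
A client that wants to stay under these limits can throttle itself before sending. The following is a minimal sketch of a sliding-window limiter, not the service's actual accounting: the class name `SlidingWindowLimiter`, the 60-second window, and the reading of "1.20K" as 1,200 and "6M" as 6,000,000 are all assumptions made for illustration.

```python
import time
from collections import deque

RPM_LIMIT = 1_200        # "1.20K" requests per minute (assumed to mean 1,200)
TPM_LIMIT = 6_000_000    # "6M" tokens per minute (assumed to mean 6,000,000)


class SlidingWindowLimiter:
    """Hypothetical client-side limiter over a sliding time window."""

    def __init__(self, rpm=RPM_LIMIT, tpm=TPM_LIMIT, window=60.0,
                 clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.window = window
        self.clock = clock            # injectable so the limiter can be tested
        self.events = deque()         # (timestamp, tokens) per admitted request
        self.tokens_in_window = 0

    def _evict(self, now):
        # Drop requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Return True and record the request if both limits allow it."""
        now = self.clock()
        self._evict(now)
        over_rpm = len(self.events) >= self.rpm
        over_tpm = self.tokens_in_window + tokens > self.tpm
        if over_rpm or over_tpm:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would check `try_acquire(estimated_tokens)` before each request and wait briefly when it returns `False`. The server may still reject requests, so retry handling remains necessary.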

API Reference

import os

import dashscope

# Route requests to the international endpoint.
dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"

# A single user turn containing the image to read and the OCR instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/ctdzex/biaozhun.jpg"},
            {"text": "Output the text in the image only."},
        ],
    }
]

response = dashscope.MultiModalConversation.call(
    # If the environment variable is not set, pass your Model Studio API key directly: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen-vl-ocr-2025-11-20",
    messages=messages,
)

# Print the recognized text from the first content part of the first choice.
print(response.output.choices[0].message.content[0]["text"])
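
The sample above indexes straight into the response, which raises an unhelpful error if the call failed. One possible guard is sketched below. The helper name `extract_text` is hypothetical, and the sketch assumes that the response exposes `status_code`, `code`, and `message` attributes alongside the `output.choices[0].message.content` path used in the sample; check the SDK documentation before relying on those fields.

```python
from http import HTTPStatus


def extract_text(response) -> str:
    """Hypothetical helper: return the recognized text or raise on failure."""
    # Assumption: a failed call carries a non-200 status_code plus
    # code/message fields that describe the error.
    status = getattr(response, "status_code", HTTPStatus.OK)
    if status != HTTPStatus.OK:
        code = getattr(response, "code", None)
        message = getattr(response, "message", None)
        raise RuntimeError(f"request failed ({status}): {code} {message}")

    # Same access path as the sample: first choice, list of content parts.
    parts = response.output.choices[0].message.content
    # Join every part that carries a "text" field, skipping any others.
    return "".join(
        part["text"] for part in parts
        if isinstance(part, dict) and "text" in part
    )
```

With this helper, the final line of the sample could become `print(extract_text(response))`.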
