Qwen-VL-Max

Visual Understanding

Overview

Qwen's most capable large vision-language model. Compared with the enhanced version, it further improves visual reasoning and instruction following, with stronger visual perception and cognitive understanding. It delivers the best performance across a broader range of complex tasks.

Input

Text, Image, Video

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

Input: $0.8 per 1M tokens
Output: $3.2 per 1M tokens
Input (Implicit Cache): $0.16 per 1M tokens
Input (Batch File): $0.4 per 1M tokens
Output (Batch File): $1.6 per 1M tokens
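
As a rough illustration of how the per-token prices above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` and its parameters are hypothetical and not part of any SDK. It assumes cached input tokens are counted separately from fresh input tokens and that the cache rate is not combined with batch pricing; neither assumption is confirmed on this page.

```python
# USD per 1M tokens, copied from the pricing list above.
PRICE_PER_MILLION = {
    "input": 0.8,
    "output": 3.2,
    "input_cached": 0.16,
    "batch_input": 0.4,
    "batch_output": 1.6,
}


def estimate_cost_usd(input_tokens, output_tokens, cached_input_tokens=0, batch=False):
    """Estimate the cost of one request in USD.

    Assumptions (not stated on this page): cached input tokens are counted
    separately from input_tokens, and the cache rate is not combined with
    batch pricing.
    """
    if batch:
        if cached_input_tokens:
            raise ValueError("this sketch does not model cache hits in batch mode")
        total = (input_tokens * PRICE_PER_MILLION["batch_input"]
                 + output_tokens * PRICE_PER_MILLION["batch_output"])
    else:
        total = (input_tokens * PRICE_PER_MILLION["input"]
                 + cached_input_tokens * PRICE_PER_MILLION["input_cached"]
                 + output_tokens * PRICE_PER_MILLION["output"])
    return total / 1_000_000


# Example: 100K fresh input tokens and 20K output tokens at standard rates.
print(f"${estimate_cost_usd(100_000, 20_000):.4f}")  # prints $0.1440
```

Actual billing may differ, for example in how image and video inputs are converted to tokens, so treat the result as an estimate only.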

Context

131.07K

Max Input

129.02K

Max Output

32.76K

Rate Limits

RPM (Requests Per Minute): 1.20K
TPM (Tokens Per Minute): 1M
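
One way to stay under the request limit on the client side is a sliding-window limiter. The sketch below is an illustration only: the class `SlidingWindowLimiter` is hypothetical, it models only the RPM figure listed above (not TPM), and the service may count requests differently from this window logic.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests=1200, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the limiter can be tested
        self._stamps = deque()       # timestamps of accepted requests

    def try_acquire(self):
        """Return True and record the request if it fits in the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False


limiter = SlidingWindowLimiter()  # defaults mirror the 1.20K RPM figure
if limiter.try_acquire():
    pass  # safe to send one request here
```

A caller that gets `False` can sleep briefly and retry. Handling a server-side throttling response is still needed, since this check is purely local.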

API Reference

import os

import dashscope

# Route requests to the international endpoint.
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# A single user turn containing one image and one text instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"image": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20241108/ctdzex/biaozhun.jpg"},
            {"text": "Output the text in the image only."},
        ],
    }
]

response = dashscope.MultiModalConversation.call(
    # If the environment variable is not set, replace this with your
    # Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv('DASHSCOPE_API_KEY'),
    model='qwen-vl-max',
    messages=messages,
)

# Print the text part of the first choice.
print(response.output.choices[0].message.content[0]["text"])
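
The sample above indexes straight into the response, which raises an unhelpful error if the call fails. A more defensive version might look like the sketch below. The helper name `extract_text` is hypothetical. It assumes the response object exposes `status_code`, `code`, and `message` attributes and that `content` is a list of dicts, as in the sample; check the SDK documentation before relying on this.

```python
from http import HTTPStatus


def extract_text(response):
    """Return the concatenated text parts of the first choice.

    Assumes response.status_code, response.code, and response.message exist
    (an assumption about the SDK's response object) and that message.content
    is a list of dicts, some of which carry a "text" key.
    """
    if response.status_code != HTTPStatus.OK:
        code = getattr(response, "code", "")
        message = getattr(response, "message", "")
        raise RuntimeError(f"request failed: {code} {message}")
    parts = response.output.choices[0].message.content
    return "".join(part["text"] for part in parts if "text" in part)
```

With the call shown above, usage would be `print(extract_text(response))`.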
