Qwen-VL-Plus
Visual Understanding
Overview
Qwen's enhanced large vision-language model. It has been significantly upgraded for fine-detail recognition and text recognition, and it supports image input at ultra-high resolutions of up to millions of pixels and at extreme aspect ratios. It delivers strong performance across a broad range of visual tasks.
Input
Text, Image, Video
Output
Text
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- Input: $0.21 per 1M tokens
- Output: $0.63 per 1M tokens
- Input (Implicit Cache): $0.042 per 1M tokens
- Input (Batch File): $0.105 per 1M tokens
- Output (Batch File): $0.315 per 1M tokens
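
As a rough illustration of how the per-token prices above combine, here is a minimal Python sketch that estimates the cost of one request. The rates are copied from the list; the function name `estimate_cost_usd` and its parameters are hypothetical, and the billing logic is an assumption (cached tokens are treated as a subset of the input tokens, and the cache discount is ignored in batch mode), not a statement of how billing actually works.

```python
# USD per 1M tokens, copied from the pricing list.
PRICES_PER_MILLION = {
    "input": 0.21,
    "output": 0.63,
    "input_cached": 0.042,
    "input_batch": 0.105,
    "output_batch": 0.315,
}


def estimate_cost_usd(input_tokens, output_tokens, cached_input_tokens=0, batch=False):
    """Estimate the cost of one request in USD (illustrative only).

    Assumptions: cached_input_tokens is part of input_tokens, and
    batch requests receive no additional cache discount.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")

    if batch:
        input_cost = input_tokens * PRICES_PER_MILLION["input_batch"]
        output_rate = PRICES_PER_MILLION["output_batch"]
    else:
        uncached = input_tokens - cached_input_tokens
        input_cost = (
            uncached * PRICES_PER_MILLION["input"]
            + cached_input_tokens * PRICES_PER_MILLION["input_cached"]
        )
        output_rate = PRICES_PER_MILLION["output"]

    return (input_cost + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # 20K input tokens (for example, a few images plus a prompt), 1K output tokens.
    print(f"standard: ${estimate_cost_usd(20_000, 1_000):.6f}")
    print(f"batch:    ${estimate_cost_usd(20_000, 1_000, batch=True):.6f}")
```

At the listed rates, the batch prices are half the standard prices, and cached input is one fifth of the standard input price.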
Context
131.07K
Max Input
129.02K
Max Output
8.19K
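
A quick pre-flight check against these limits might look like the sketch below. The constants are the rounded figures shown on this page (exact limits may differ slightly), the function name `fits_limits` is hypothetical, and the rule that input plus output must fit within the context window is an assumption rather than something this page states.

```python
# Rounded figures from the page, expressed in tokens.
CONTEXT_WINDOW = 131_070
MAX_INPUT = 129_020
MAX_OUTPUT = 8_190


def fits_limits(input_tokens, max_output_tokens):
    """Return True if a request stays within all three listed limits.

    Assumption: input and requested output together must also fit
    inside the context window.
    """
    return (
        input_tokens <= MAX_INPUT
        and max_output_tokens <= MAX_OUTPUT
        and input_tokens + max_output_tokens <= CONTEXT_WINDOW
    )


if __name__ == "__main__":
    print(fits_limits(100_000, 8_000))   # well inside every limit
    print(fits_limits(129_020, 8_190))   # both maxima at once
```

Under that assumption, the maximum input and maximum output cannot both be used in the same request, because their sum (137.21K) exceeds the 131.07K context.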
Rate Limits
- RPM (Requests Per Minute): 1.20K
- TPM (Tokens Per Minute): 1M
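
To see which of the two limits is likely to bind for a given workload, the arithmetic can be sketched as follows. The limits are taken from the list above; the function names are hypothetical, and the sketch assumes that the token limit counts all tokens in a request (input plus output), which this page does not specify.

```python
RPM_LIMIT = 1_200        # requests per minute (1.20K)
TPM_LIMIT = 1_000_000    # tokens per minute (1M)


def max_requests_per_minute(avg_tokens_per_request):
    """Sustained request rate allowed by both limits for a given average request size."""
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    return min(RPM_LIMIT, TPM_LIMIT // avg_tokens_per_request)


def min_interval_seconds(avg_tokens_per_request):
    """Minimum spacing between evenly paced requests that respects both limits."""
    rpm = max_requests_per_minute(avg_tokens_per_request)
    if rpm == 0:
        raise ValueError("a single request of this size exceeds the per-minute token limit")
    return 60.0 / rpm


if __name__ == "__main__":
    for size in (500, 5_000, 50_000):
        print(size, max_requests_per_minute(size), min_interval_seconds(size))
```

With these figures, the request limit binds for requests averaging up to about 833 tokens (1,000,000 / 1,200); above that, the token limit becomes the constraint.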
API Reference