Fun-ASR
Overview
Fun-ASR is a next-generation, end-to-end speech recognition model from Tongyi Labs. Built on the lab's proprietary speech technology, it offers strong contextual awareness and high-precision transcription. Its end-to-end architecture integrates RAG (retrieval-augmented generation) and supports large-scale hotword customization, automatic filtering of sensitive words and modal particles (filler words), ITN (inverse text normalization), and punctuation prediction, which together improve overall recognition accuracy and contextual relevance. Fun-ASR also supports flexible switching between Chinese and English, covers multiple regional dialects, and has enhanced noise robustness for diverse and complex environments. This version is a snapshot from August 25, 2025.
Input
Output
Features
Prefix Completion
Function Calling
Cache
Structured Outputs
Batches
Web Search
Pricing
- Audio Duration: $0.000035 per second
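As a rough illustration of the per-second rate above, the following sketch estimates the cost of transcribing a clip of a given length. The function name estimate_cost_usd is hypothetical, and the calculation assumes billing is strictly linear in audio duration, with no minimum charge and no rounding up to whole seconds or minutes; the source does not state how partial seconds are billed.

```python
from decimal import Decimal

# Rate taken from the Pricing entry: $0.000035 per second of audio.
PRICE_PER_SECOND_USD = Decimal("0.000035")


def estimate_cost_usd(duration_seconds):
    """Estimate the USD cost for `duration_seconds` of audio.

    Assumption: cost scales linearly with duration (no minimum fee,
    no rounding of partial seconds). Actual billing rules may differ.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Convert via str() so float inputs do not carry binary rounding noise.
    return PRICE_PER_SECOND_USD * Decimal(str(duration_seconds))


if __name__ == "__main__":
    one_hour = estimate_cost_usd(3600)
    print(f"One hour of audio: ${one_hour:.4f}")  # One hour of audio: $0.1260
```

Under these assumptions, one hour of audio (3,600 seconds) comes to about $0.126.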
Rate Limits
- RPM (Requests Per Minute): 600
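A limit of 600 requests per minute works out to one request every 0.1 seconds when requests are evenly spaced. The sketch below shows one possible client-side pacer that enforces that spacing before each call. RequestPacer is a hypothetical helper, not part of any Fun-ASR SDK, and even spacing is a conservative assumption: the source does not say how the service counts its window or whether short bursts are allowed.

```python
import time

# Limit taken from the Rate Limits entry: 600 requests per minute.
RPM_LIMIT = 600


class RequestPacer:
    """Spaces request starts evenly so at most `rpm` begin per minute.

    Assumption: even spacing is stricter than the server requires,
    since the server-side counting window is not documented here.
    """

    def __init__(self, rpm=RPM_LIMIT, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm  # 0.1 s at 600 RPM
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request may start; return seconds waited."""
        now = self._clock()
        if self._next_allowed is None or now >= self._next_allowed:
            delay = 0.0
        else:
            delay = self._next_allowed - now
            self._sleep(delay)
        # Schedule the following slot one interval after this request starts.
        self._next_allowed = now + delay + self.min_interval
        return delay
```

A caller would create one pacer, then call pacer.wait() immediately before sending each transcription request. The clock and sleep parameters exist only so the pacing logic can be exercised without real delays.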