Fun-ASR

Speech Recognition

Overview

Fun-ASR is a next-generation, end-to-end speech recognition model from Tongyi Labs. Built on the lab's proprietary speech technology, it offers strong contextual awareness and high-precision transcription. On top of its end-to-end architecture, Fun-ASR integrates retrieval-augmented generation (RAG) and supports large-scale hotword customization, automatic filtering of sensitive words and modal particles (filler words), inverse text normalization (ITN), and punctuation prediction, which together improve recognition accuracy and contextual relevance. Fun-ASR also supports flexible switching between Chinese and English, covers multiple regional dialects, and has improved noise robustness for diverse and complex acoustic environments. This version is a snapshot from August 25, 2025.

Input

Audio

Output

Text

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Pricing

  • Audio Duration
    $0.000035 per second
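
To make the per-second rate concrete, the following sketch estimates the cost of transcribing a recording of a given length. Only the rate itself comes from the pricing above. The function name `estimate_cost` is hypothetical, and the sketch assumes billing is strictly proportional to duration, with no rounding to whole seconds and no minimum charge, which this page does not confirm.

```python
from decimal import Decimal

# Rate taken from the pricing list above (USD per second of audio).
PRICE_PER_SECOND = Decimal("0.000035")


def estimate_cost(duration_seconds):
    """Estimate the transcription cost in USD for audio of the given length.

    Assumption: cost scales linearly with duration. Any rounding or
    minimum-charge rule applied by the provider is not modeled here.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Convert via str() so float inputs such as 12.5 do not carry
    # binary floating-point noise into the Decimal arithmetic.
    return PRICE_PER_SECOND * Decimal(str(duration_seconds))


if __name__ == "__main__":
    for label, seconds in [("1 minute", 60), ("1 hour", 3600), ("10 hours", 36000)]:
        print(f"{label}: ${estimate_cost(seconds)}")
```

Under this assumption, one hour of audio (3,600 seconds) comes to about $0.126.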

Rate Limits

  • RPM (Requests Per Minute)
    600
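
A limit of 600 requests per minute averages out to one request every 0.1 seconds. One simple way to stay under it on the client side is to space requests evenly, as in the sketch below. The class name `RequestPacer` is hypothetical, and the sketch assumes the limit can be treated as a steady rate. How the service actually counts requests (fixed window, sliding window, or burst allowance) is not stated on this page.

```python
import time

RPM_LIMIT = 600  # from the rate limit above


class RequestPacer:
    """Space calls evenly so that at most `rpm` calls start per minute.

    Assumption: even spacing is a conservative stand-in for whatever
    windowing scheme the service really uses.
    """

    def __init__(self, rpm=RPM_LIMIT, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / rpm   # seconds between request starts
        self._clock = clock          # injectable for testing
        self._sleep = sleep          # injectable for testing
        self._next_allowed = 0.0     # earliest time the next call may start

    def wait(self):
        """Block until the next request may start; return seconds waited."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now = self._clock()
        # Schedule the following slot one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return max(delay, 0.0)


if __name__ == "__main__":
    pacer = RequestPacer()
    for i in range(3):
        waited = pacer.wait()
        # A real client would send its transcription request here.
        print(f"request {i}: waited {waited:.3f}s")
```

Calling `wait()` immediately before each request keeps a single-threaded client at or below the stated rate. A client with several concurrent workers would need one shared pacer protected by a lock.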

API Reference
