Fun-ASR-Realtime

Real-time Speech Recognition

Overview

Fun-ASR-Realtime is the real-time version of Tongyi Lab's next-generation end-to-end speech recognition model. Built on proprietary speech technology, it offers strong contextual awareness and high-precision transcription. Fun-ASR integrates RAG (retrieval-augmented generation) technology into its end-to-end architecture and supports large-scale hotword customization, automatic filtering of sensitive words and modal particles, inverse text normalization (ITN), and punctuation prediction, which together improve recognition accuracy and contextual relevance. Fun-ASR also supports switching between Chinese and English, covers multiple regional dialects, and has enhanced noise robustness for diverse and complex environments.

Input

Output

Features

Prefix Completion

Function Calling

Cache

Structured Outputs

Batches

Web Search

Rate Limits