Speech

Speech technology covers the methods by which computer systems process human speech, including speech recognition, synthesis, and understanding. Its goal is to build systems that can interact with users efficiently through spoken language. It is widely applied in virtual assistants, customer service systems, voice translation, and other fields, where it makes human-computer interaction more natural and convenient.

Speech Recognition

Speech Separation

Speaker Diarization

Speech Emotion Recognition

Speech Enhancement

Dialogue Generation

Spoken language identification

Speaker Verification

Keyword Spotting

Automatic Speech Recognition (ASR)

Multimodal Emotion Recognition

Bandwidth Extension

Text-To-Speech Synthesis

Automatic Phoneme Recognition

Speech Dereverberation

Spoken Language Understanding

Speech Synthesis

Story Generation

Automatic Lyrics Transcription

Audio-Visual Speech Recognition

Speaker Identification

Accented Speech Recognition

Voice Conversion

Speech-to-Speech Translation

Distant Speech Recognition

Visual Speech Recognition

Noisy Speech Recognition

Speech Denoising

Arabic Text Diacritization

Speech Synthesis - Gujarati

Speech Extraction

Cultural Vocal Bursts Intensity Prediction

Acoustic Unit Discovery

Vocal Bursts Type Prediction

Speaker Recognition

Lip to Speech Synthesis

Audio Deepfake Detection

Spoken Command Recognition

Phone-level pronunciation scoring

Word-level pronunciation scoring

A-VB High

Utterance-level pronunciation scoring

Voice Query Recognition

A-VB Culture

A-VB Two

Speech Synthesis - Assamese

Speech Synthesis - Bengali

Speech Synthesis - Bodo

Speech Synthesis - Hindi

Speech Synthesis - Kannada

Speech Synthesis - Malayalam

Speech Synthesis - Manipuri

Speech Synthesis - Marathi

Speech Synthesis - Rajasthani

Speech Synthesis - Tamil

Speech Synthesis - Telugu