HyperAI

Speech Recognition

Speech recognition is the task of converting spoken language into text: identifying the words in an audio recording and transcribing them into written form. The goal is to transcribe audio accurately, whether it is streamed in real time or pre-recorded, while remaining robust to factors such as accents, speaking rate, and background noise. The technology is widely applied in human-computer interaction, automatic subtitle generation, and voice assistants.
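Transcription accuracy on benchmarks like those listed below is commonly reported as word error rate (WER): the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The following is a minimal sketch, assuming whitespace tokenization and no text normalization. The function name `word_error_rate` is illustrative and does not come from any particular benchmark's scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from output
            insertion = curr[j - 1] + 1   # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One inserted word against a two-word reference.
    print(word_error_rate("hello world", "hello there world"))
```

Real evaluation pipelines usually normalize case and punctuation before scoring, and some languages are scored on characters rather than words, so figures computed this way may not match published leaderboard numbers exactly.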

LibriSpeech test-clean
HuBERT with Libri-Light
LibriSpeech test-other
wav2vec 2.0 with Libri-Light
Switchboard + Hub500
TIMIT
wav2vec 2.0
AISHELL-1
Qwen-Audio
WSJ eval92
Common Voice German
wav2vec 2.0 XLS-R 1B + TEVR (5-gram)
swb_hub_500 WER fullSWBCH
TUDA
QuartzNet15x5DE (D37)
Common Voice French
ConformerCTC-L (5-gram)
Common Voice Spanish
ConformerCTC-L (4-gram)
MediaSpeech
Quartznet
SLUE
W2V2-B-VP100K
VietMed
WenetSpeech
Paraformer-large
EasyCom
GigaSpeech DEV
SAMBA ASR
GigaSpeech TEST
Zipformer+pruned transducer w/ CR-CTC (no external language model)
Hub5'00 SwitchBoard
LAS + SpecAugment (with LM, Switchboard mild policy)
Libri-Light test-clean
CPC unlab-60k
Libri-Light test-other
CPC unlab-60k
CHiME-6 dev_gss12
LRS3-TED
Whisper
Tedlium
WSJ dev93
CTC-CRF ST-NAS
CHiME-6 eval
Common Voice vi
Vietnamese end-to-end speech recognition using wav2vec 2.0 by VietAI
Europarl-ASR EN Guest-test
Fongbe audio
Triphone (39 features) + LDA and MLLT + SGMM
Speech Commands
Centaurus
SPGISpeech
VIVOS
khanhld/chunkformer-large-vie
WSJ eval93
Deep Speech 2
AISHELL-2
AMI IMH
AMI SDM1
Common Voice
Common Voice English
Whisper (Large v2)
Common Voice Italian
Whisper (Large v2)
Europarl-ASR EN MEP-test
LibriCSS
TS-SEP
TED-LIUM
Whisper-LLaMa-7b
AISHELL-2 Test Android
Qwen-Audio
AISHELL-2 Test IOS
AISHELL-2 Test Mic
CALLHOME En
WavLM Large & EEND-vector clustering
CALLHOME Spanish Speech
CAS-VSR-S101
Common Voice Frisian
Common Voice Japanese
Common Voice Portuguese
XLSR53 Wav2Vec2 Portuguese by Orlem Santos
Common Voice Russian
Whisper (Large v2)
facebook/multilingual_librispeech german
TDT 0-4
GigaSpeech
Conformer/Transformer-AED
Google Speech Commands - Musan
Hub5'00 FISHER-SWBD
CTC-CRF
Hub5'00 CallHome
Espresso
LibriSpeech 100h test-clean
LibriSpeech 100h test-other
Branchformer + GFSA
LibriSpeech train-clean-100 test-clean
wav2vec_wav2letter
LibriSpeech train-clean-100 test-other
wav2vec_wav2letter
LRS2
RAVEn Large
Switchboard (300hr)
Switchboard CallHome
Switchboard SWBD
AISHELL-2 Android
AISHELL-2 Mic
ATCOSIM corpus (Air Traffic Control Communications)
ATCOSIM dataset (Air Traffic Control Communications)
Common Voice 7.0 Abkhaz
Common Voice 7.0 Arabic
Common Voice 7.0 Bashkir
Common Voice 7.0 German
Common Voice 7.0 Hindi
Common Voice 7.0 Odia
Common Voice 7.0 Portuguese
Common Voice 7.0 Votic
Common Voice 8.0 Assamese
Common Voice 8.0 Basaa
Common Voice 8.0 Breton
Common Voice 8.0 Bulgarian
Common Voice 8.0 Central Kurdish
Common Voice 8.0 Dutch
Common Voice 8.0 Erzya
Common Voice 8.0 French
Common Voice 8.0 Galician
Common Voice 8.0 German
Common Voice 8.0 Guarani
Common Voice 8.0 Hausa
Common Voice 8.0 Hindi
Common Voice 8.0 Hungarian
Common Voice 8.0 Japanese
Common Voice 8.0 Kabyle
Common Voice 8.0 Kazakh
Common Voice 8.0 Kurmanji Kurdish
Common Voice 8.0 Maltese
Common Voice 8.0 Marathi
Common Voice 8.0 Odia
Common Voice 8.0 Portuguese
Common Voice 8.0 Punjabi
Common Voice 8.0 Romansh Sursilvan
Common Voice 8.0 Romansh Vallader
Common Voice 8.0 Russian
Common Voice 8.0 Santali (Ol Chiki)
Common Voice 8.0 Serbian
Common Voice 8.0 Slovenian
Common Voice 8.0 Sorbian, Upper
Common Voice 8.0 Swahili
Common Voice 8.0 Tatar
Common Voice 8.0 Uzbek
Common Voice 8.0 Votic
Common Voice Arabic
Common Voice Breton
Common Voice Catalan
Common Voice Chinese (China)
Common Voice Czech
Common Voice Dutch
Common Voice Georgian
Common Voice Hindi
Common Voice Indonesian
Common Voice Lithuanian
Common Voice Maltese
Common Voice Odia
Common Voice Persian
Common Voice Polish
Common Voice Swedish
Common Voice Tamil
Common Voice Turkish
Common Voice Vietnamese
Common Voice Welsh
CORAA
FLEURS
fon
German ASR Data-Mix
Kazakh Speech Corpus v1.1
MLS
Mozilla Common Voice 15.0 Persian
Mozilla Common Voice 16.1
Mozilla Common Voice 9.0
Podlodka.io
projecte-aina/parlament_parla ca
Reazonspeech
Robust Speech Event - Catalan Dev Data
Robust Speech Event - Dev Data
Russian LibriSpeech
SPGI Speech
tedlium-v3
UWB-ATCC dataset (Air Traffic Control Communications)