Spoken language identification

Spoken language identification is a speech-processing task: automatically determining which language is being spoken in an audio recording. Systems make this decision by analyzing the acoustic features of the speech signal, and the result supports speech recognition, translation, and voice interaction in multilingual environments. Its practical value is in making cross-language communication more efficient and accurate.
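As a rough illustration of the "acoustic features in, language label out" idea described above, the following is a minimal sketch, not a method used by any of the systems listed on this page. Everything in it is an assumption made for illustration: the two "languages" are synthetic tones plus noise rather than real speech, the band frequencies, frame length, and sample rate are arbitrary, and the nearest-centroid rule stands in for a real classifier.

```python
import math
import random

SAMPLE_RATE = 8000                      # Hz, assumed for this toy example
FRAME_LEN = 200                         # 25 ms frames at 8 kHz
BANDS_HZ = [200, 400, 800, 1600, 3200]  # hypothetical analysis frequencies


def goertzel_power(frame, freq_hz):
    """Power of a single frequency component in one frame (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def utterance_features(signal):
    """Average log band power over all non-overlapping frames of a signal."""
    starts = range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN)
    sums = [0.0] * len(BANDS_HZ)
    for start in starts:
        frame = signal[start:start + FRAME_LEN]
        for b, freq in enumerate(BANDS_HZ):
            sums[b] += math.log(goertzel_power(frame, freq) + 1e-9)
    return [total / len(starts) for total in sums]


def toy_utterance(dominant_hz, rng, n_samples=1600):
    """Synthetic stand-in for speech: one dominant tone plus Gaussian noise."""
    return [
        math.sin(2.0 * math.pi * dominant_hz * t / SAMPLE_RATE) + 0.3 * rng.gauss(0, 1)
        for t in range(n_samples)
    ]


def train_centroids(labelled_signals):
    """Compute one mean feature vector per language label."""
    grouped = {}
    for label, signal in labelled_signals:
        grouped.setdefault(label, []).append(utterance_features(signal))
    return {
        label: [sum(column) / len(column) for column in zip(*feats)]
        for label, feats in grouped.items()
    }


def identify(signal, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    feats = utterance_features(signal)

    def sq_dist(label):
        return sum((f - c) ** 2 for f, c in zip(feats, centroids[label]))

    return min(centroids, key=sq_dist)


rng = random.Random(0)
toy_languages = {"lang_a": 400, "lang_b": 1600}  # hypothetical labels and tones
train = [
    (name, toy_utterance(hz, rng))
    for name, hz in toy_languages.items()
    for _ in range(5)
]
centroids = train_centroids(train)
predictions = {
    name: identify(toy_utterance(hz, rng), centroids)
    for name, hz in toy_languages.items()
}
print(predictions)
```

Real systems replace each piece with something far stronger, for instance learned spectrogram features and a neural or SVM classifier, but the overall shape (frame the audio, extract features, pool over the utterance, classify) is broadly the same.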

LRE07

VoxForge European

YouTube News dataset (No Noise)

Inception-v3 CRNN

YouTube News dataset (White Noise)

Inception-v3 CRNN

Untranscribed mixed-speech dataset

SVM

VoxForge Commonwealth

YouTube News dataset (Crackling Noise)

Inception-v3 CRNN

YouTube News dataset (Background Music)