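Speech Separation
Speech separation is the task of extracting all overlapping speech sources from a mixed speech signal. As a special case of the source separation problem, it focuses on separating multiple simultaneously active speech signals rather than other interfering signals such as music or noise. The technique has important applications in speech recognition in multi-speaker environments, hearing-assistive devices, and audio editing.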
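The task definition above can be illustrated with a toy sketch. It assumes the common additive mixing model (the mixture is the sample-wise sum of the sources) and matches separated outputs to references in whichever order fits best, since a separator has no inherent speaker ordering. Neither assumption is stated in the text, and the helper names `mix`, `squared_error`, and `best_permutation` are hypothetical, chosen only for this illustration. Sinusoids stand in for speech.

```python
import itertools
import math


def mix(sources):
    """Sum equal-length source signals sample by sample into one mixture."""
    return [sum(samples) for samples in zip(*sources)]


def squared_error(a, b):
    """Total squared difference between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_permutation(estimates, references):
    """Find the estimate-to-reference assignment with the lowest total error.

    Returns (perm, error), where estimates[i] is matched to references[perm[i]].
    Exhaustive search: only practical for a small number of sources.
    """
    best = None
    for perm in itertools.permutations(range(len(references))):
        err = sum(
            squared_error(estimates[i], references[p]) for i, p in enumerate(perm)
        )
        if best is None or err < best[1]:
            best = (perm, err)
    return best


# Two synthetic "speakers": sinusoids at different frequencies (toy stand-ins).
n = 160
s1 = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
s2 = [0.5 * math.sin(2 * math.pi * 13 * t / n) for t in range(n)]
mixture = mix([s1, s2])  # what a separation system would receive as input

# A hypothetical perfect separator that emits the sources in swapped order.
estimates = [s2, s1]
perm, err = best_permutation(estimates, [s1, s2])
print(perm, err)
```

The exhaustive search grows factorially with the number of sources, so this form is only a sketch for two or three speakers.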
WSJ0-2mix: SepReformer-L
WHAMR!: TF-Locoformer (M)
Libri2Mix: MossFormer2 (w speed perturb)
WSJ0-3mix: Gated DualPathRNN
LRS2: TDFNet-small
WHAM!: MossFormer2
WSJ0-5mix: Gated DualPathRNN
LRS3: IIANet
VoxCeleb2: RTFS-Net-4
WSJ0-4mix
Libri5Mix: Hungarian PIT
Libri10Mix
GRID corpus (mixed-speech)
Libri20Mix
LibriCSS: Conformer (large)
iKala: U-Net
Libri15Mix: Hungarian PIT
TCD-TIMIT corpus (mixed-speech)
WSJ0-2mix-16k: MossFormer2