
Vocal Separation

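Vocal separation is the task of isolating the singing voice from a mixed music recording while preserving the instrumental accompaniment. The task was formally introduced by Han et al. (2012) and is central to music editing, remixing, karaoke generation, and music analysis. Modern deep learning methods (Défossez et al., 2021) achieve high separation quality and have enabled practical use in music production and streaming services. Vocal separation is a special case of source separation in which the goal is to isolate the perceptually most salient source.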
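To make the task definition concrete, the following is a minimal sketch of the problem setup, not the method of any cited paper. It assumes an additive model (mixture = vocals + accompaniment) and applies an oracle ratio mask in the time-frequency domain. The mask is built from the ground-truth sources, so it only illustrates what a separator tries to estimate; a real system has to predict the mask (or the waveform) from the mixture alone. The sine tones standing in for "vocals" and "accompaniment", the helper `stft`, and all parameter values are illustrative assumptions.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

# Toy stand-ins (assumption): one second of a 440 Hz "vocal" and a 110 Hz "accompaniment".
sr = 8000
t = np.arange(sr) / sr
vocals = 0.5 * np.sin(2 * np.pi * 440.0 * t)
accompaniment = 0.5 * np.sin(2 * np.pi * 110.0 * t)
mixture = vocals + accompaniment  # additive mixing model

vocal_spec = stft(vocals)
accomp_spec = stft(accompaniment)
mixture_spec = stft(mixture)

# Oracle ratio mask: share of each time-frequency bin's magnitude owned by the vocals.
# It needs the true sources, so it is an illustration, not a usable separator.
eps = 1e-12
mask = np.abs(vocal_spec) / (np.abs(vocal_spec) + np.abs(accomp_spec) + eps)

vocal_estimate = mask * mixture_spec           # keep vocal-dominated bins
accomp_estimate = (1.0 - mask) * mixture_spec  # the complement is the accompaniment

# Compare against the true vocal spectrogram, with and without masking.
err_mixture = np.linalg.norm(mixture_spec - vocal_spec)
err_masked = np.linalg.norm(vocal_estimate - vocal_spec)
print(f"error using the raw mixture: {err_mixture:.3f}")
print(f"error using the oracle mask: {err_masked:.3f}")
```

Because the two toy sources occupy different frequency bins, the masked estimate lands far closer to the true vocal spectrogram than the unprocessed mixture does. Real vocals and instruments overlap heavily in time and frequency, which is what makes the task hard.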


Related methods

Automatic music transcription, beat tracking, melody extraction, music segmentation, pitch detection algorithms, timbre analysis

Sources

Han, Y., Qin, Z., & Kang, Z. (2012). Singing voice separation using spectral floor filtered spectrograms. In Proceedings of the International Society for Music Information Retrieval Conference.
Huang, P. S., Kim, M., Hasegawa-Johnson, M., & Smaragdis, P. (2015). Joint optimization of masks and deep recurrent neural networks for monaural source separation. IEEE Transactions on Audio, Speech, and Language Processing, 23(12), 2136-2147. DOI: 10.1109/taslp.2015.2468583
Défossez, A., Usunier, N., Bottou, L., & Bach, F. (2021). Music source separation in the waveform domain. In International Conference on Learning Representations.

How to cite this page

ScholarGate. (2026, June 3). Vocal Separation and Source Separation Algorithm. ScholarGate. https://scholargate.app/zh/music-information-retrieval/vocal-separation