
deeplearningsourceseparation

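License: View license

Development language: Python

Categories: Neural networks/AI, Machine learning/Deep learning

Software type: Open source

Region: Unknown

Submitted by: 曹伟泽

Operating system: Cross-platform

Open-source organization: None

Target audience: Unknown

Software overview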

Deep Learning For Monaural Source Separation

Demo

Webpage: https://sites.google.com/site/deeplearningsourceseparation/

Experiments

MIR-1K experiment (singing voice separation)

Training code: codes/mir1k/train_mir1k_demo.m
Demo

Download a trained model: http://www.ifp.illinois.edu/~huang146/DNN_separation/model_400.mat
Put the model in codes/mir1k/demo and change to that folder
Run: codes/mir1k/demo/run_test_single_model.m

TIMIT experiment (speech separation)

Training code: codes/timit/train_timit_demo.m and codes/timit/train_timit_demo_mini_clip.m
Demo

Download a trained model: http://www.ifp.illinois.edu/~huang146/DNN_separation/timit_model_70.mat
Put the model in codes/timit/demo and change to that folder
Run: codes/timit/demo/run_test_single_model.m

TSP experiment (speech separation)

Training code: codes/TSP/train_TSP_demo_mini_clip.m
Demo

Download a trained model: http://www.ifp.illinois.edu/~huang146/DNN_separation/TSP_model_RNN1_win1_h300_l2_r0_64ms_1000000_softabs_linearout_RELU_logmel_trn0_c1e-10_c0.001_bsz100000_miter10_bf50_c0_d0_7650.mat
Put the model in codes/TSP/demo and change to that folder
Run: codes/TSP/demo/run_test_single_model.m

Denoising experiment

Put the original TIMIT data for speakers FCJF0, FDAW0, FDML0, FECD0, FETB0, FJSP0, FKFB0, FMEM0, FSAH0, FSJK1, FSMA0, FTBR0, FVFB0, and FVMH0 under codes/denoising/Data/timit/
Training code: codes/denoising/train_denoising_demo.m
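
Gathering the listed TIMIT speakers can be scripted. The following is a minimal Python sketch, not part of the package. It assumes a TIMIT tree in which each speaker has its own directory (for example TRAIN/DR1/FCJF0), and it assumes the training script expects those speaker directories directly under codes/denoising/Data/timit/. Neither assumption is confirmed by the package, and the name collect_speakers is hypothetical.

```python
import shutil
from pathlib import Path

# Speaker IDs listed for the denoising experiment.
SPEAKERS = [
    "FCJF0", "FDAW0", "FDML0", "FECD0", "FETB0", "FJSP0", "FKFB0",
    "FMEM0", "FSAH0", "FSJK1", "FSMA0", "FTBR0", "FVFB0", "FVMH0",
]


def collect_speakers(timit_root, dest, speakers=SPEAKERS):
    """Copy each wanted speaker directory found under timit_root into dest.

    Assumption: a flat layout (dest/FCJF0, dest/FDAW0, ...) is what the
    training script expects. Returns the IDs copied by this call.
    """
    timit_root = Path(timit_root)
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    wanted = {s.upper() for s in speakers}

    # Build the candidate list before copying so the walk is not
    # affected by directories created during the copy.
    candidates = [
        p for p in sorted(timit_root.rglob("*"))
        if p.is_dir() and p.name.upper() in wanted
    ]

    copied = []
    for src in candidates:
        target = dest / src.name.upper()
        if target.exists():
            continue  # already in place, do not overwrite
        shutil.copytree(src, target)
        copied.append(src.name.upper())
    return copied
```

A call such as collect_speakers("/path/to/TIMIT", "codes/denoising/Data/timit") would be run once before training; the TIMIT path here is a placeholder.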
Demo

Download a trained model: http://www.ifp.illinois.edu/~huang146/DNN_separation/denoising_model_870.mat
Put the model in codes/denoising/demo and change to that folder
Run: codes/denoising/demo/run_test_single_model.m
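
The download-and-place steps of the four demos above can be scripted. The following is a minimal Python sketch, not part of the package. It uses only the model URLs and demo folders listed above. The helper names (model_url, model_destination, fetch_model) are hypothetical, and whether the URLs still resolve is not guaranteed.

```python
from pathlib import Path
from urllib.request import urlretrieve

BASE_URL = "http://www.ifp.illinois.edu/~huang146/DNN_separation/"

# experiment -> (demo folder inside the repository, model file name)
MODELS = {
    "mir1k": ("codes/mir1k/demo", "model_400.mat"),
    "timit": ("codes/timit/demo", "timit_model_70.mat"),
    "TSP": (
        "codes/TSP/demo",
        "TSP_model_RNN1_win1_h300_l2_r0_64ms_1000000_softabs_linearout_"
        "RELU_logmel_trn0_c1e-10_c0.001_bsz100000_miter10_bf50_c0_d0_7650.mat",
    ),
    "denoising": ("codes/denoising/demo", "denoising_model_870.mat"),
}


def model_url(experiment):
    """Return the download URL of the trained model for an experiment."""
    return BASE_URL + MODELS[experiment][1]


def model_destination(repo_root, experiment):
    """Return the path where the demo expects the model file."""
    folder, filename = MODELS[experiment]
    return Path(repo_root) / folder / filename


def fetch_model(repo_root, experiment):
    """Download the model into its demo folder unless it is already there."""
    dest = model_destination(repo_root, experiment)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urlretrieve(model_url(experiment), str(dest))
    return dest
```

After, for example, fetch_model(".", "mir1k") from the repository root, the matching run_test_single_model.m is still run from MATLAB inside the demo folder.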

Dependencies

The package is adapted from rnn-speech-denoising.
The software depends on Mark Schmidt's minFunc package for numerical optimization.
Additionally, we have included Mark Hasegawa-Johnson's HTK write and read functions, which are used to handle the MFCC files.
We use HTK (HCopy) to compute features (MFCC, logmel).
We use signal processing functions from LabROSA.
We use the BSS Eval toolbox, versions 2.0 and 3.0, for evaluation.
We use MIR-1K for the singing voice separation task.
We use TSP for the speech separation task.
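
For a quick sanity check of a separated signal outside MATLAB, a plain signal-to-distortion ratio can be computed as below. This is a simplified sketch in Python and is not the BSS Eval computation: BSS Eval decomposes the estimate into target, interference, and artifact components before forming its ratios, whereas this function treats the whole difference from the reference as distortion. The function name simple_sdr_db is hypothetical.

```python
import math


def simple_sdr_db(reference, estimate):
    """Return 10*log10(||reference||^2 / ||reference - estimate||^2) in dB.

    Simplified ratio for sanity checks only; it does not reproduce
    the BSS Eval SDR, SIR, or SAR values.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0:
        raise ValueError("reference signal has zero energy")
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)
```

Numbers reported for the experiments should still come from the BSS Eval toolbox, since values from this simplified ratio are not comparable to it.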

Work on your data:

To try the code on your own data, follow the mir1k or TSP settings: put your data into codes/mir1k/Wavfile or codes/TSP/Data/ accordingly.
Look at the unit test parameters in codes/mir1k/train_mir1k_demo.m and codes/TSP/train_TSP_demo_mini_clip.m (with minibatch L-BFGS and gradient clipping).
Tune the parameters on the dev set and check the results.

Reference

P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2136–2147, Dec. 2015.
P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks," in International Society for Music Information Retrieval Conference (ISMIR), 2014.
P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Deep Learning for Monaural Speech Separation," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.

Notes

The code was tested with MATLAB R2015a.

Related Implementations

source_separaton_ml_jeju
