Webpage: https://sites.google.com/site/deeplearningsourceseparation/
Training code: codes/mir1k/train_mir1k_demo.m
Demo: download the trained model from http://www.ifp.illinois.edu/~huang146/DNN_separation/model_400.mat, put it in codes/mir1k/demo, then go to that folder and run codes/mir1k/demo/run_test_single_model.m
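The MIR-1K demo steps above can be sketched in MATLAB roughly as follows. This is a minimal sketch, not part of the package: it assumes the current folder is the repository root, that `websave` is available (MATLAB R2014b or later), and that `run_test_single_model.m` is a script that can be invoked without arguments. The variable names are illustrative only.

```matlab
% Sketch: fetch the pretrained MIR-1K model and run the single-model demo.
% Assumption: the current folder is the repository root.
modelUrl  = 'http://www.ifp.illinois.edu/~huang146/DNN_separation/model_400.mat';
demoDir   = fullfile('codes', 'mir1k', 'demo');
modelFile = fullfile(demoDir, 'model_400.mat');
demoFile  = fullfile(demoDir, 'run_test_single_model.m');

if ~exist(demoDir, 'dir')
    % Nothing is downloaded or run unless the demo folder is present.
    fprintf('Demo folder %s not found; start from the repository root.\n', demoDir);
else
    if ~exist(modelFile, 'file')
        websave(modelFile, modelUrl);   % download only if the model is missing
    end
    % run() changes to the script's folder, executes it, then changes back.
    % Assumption: the demo is a script that takes no arguments.
    run(demoFile);
end
```

The same pattern should apply to the other demos by swapping in their model URL and demo folder.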
Training code: codes/timit/train_timit_demo.m
and codes/timit/train_timit_demo_mini_clip.m
Demo: download the trained model from http://www.ifp.illinois.edu/~huang146/DNN_separation/timit_model_70.mat, put it in codes/timit/demo, then go to that folder and run codes/timit/demo/run_test_single_model.m
Training code: codes/TSP/train_TSP_demo_mini_clip.m
Demo: download the trained model from http://www.ifp.illinois.edu/~huang146/DNN_separation/TSP_model_RNN1_win1_h300_l2_r0_64ms_1000000_softabs_linearout_RELU_logmel_trn0_c1e-10_c0.001_bsz100000_miter10_bf50_c0_d0_7650.mat, put it in codes/TSP/demo, then go to that folder and run codes/TSP/demo/run_test_single_model.m
Put the original TIMIT data for FCJF0, FDAW0, FDML0, FECD0, FETB0, FJSP0, FKFB0, FMEM0, FSAH0, FSJK1, FSMA0, FTBR0, FVFB0, and FVMH0 under codes/denoising/Data/timit/
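Before training, it may help to confirm that the TIMIT data listed above is in place. The following is a small sanity-check sketch, not part of the package. It assumes each listed ID corresponds to a folder of the same name directly under codes/denoising/Data/timit/; the README does not specify that layout, so adjust the check if your copy is organized differently.

```matlab
% Sketch: report which of the listed TIMIT folders are missing.
% Assumption: one folder per ID, directly under timitDir.
timitDir = fullfile('codes', 'denoising', 'Data', 'timit');
speakers = {'FCJF0', 'FDAW0', 'FDML0', 'FECD0', 'FETB0', 'FJSP0', 'FKFB0', ...
            'FMEM0', 'FSAH0', 'FSJK1', 'FSMA0', 'FTBR0', 'FVFB0', 'FVMH0'};

% exist(..., 'dir') returns 7 when the folder is present.
isPresent = cellfun(@(s) exist(fullfile(timitDir, s), 'dir') == 7, speakers);
missing   = speakers(~isPresent);

if isempty(missing)
    fprintf('All %d folders found under %s\n', numel(speakers), timitDir);
else
    fprintf('Missing under %s: %s\n', timitDir, strjoin(missing, ', '));
end
```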
Training code: codes/denoising/train_denoising_demo.m
Demo: download the trained model from http://www.ifp.illinois.edu/~huang146/DNN_separation/denoising_model_870.mat, put it in codes/denoising/demo, then go to that folder and run codes/denoising/demo/run_test_single_model.m
This package is adapted from rnn-speech-denoising.
The software depends on Mark Schmidt's minFunc package for unconstrained optimization.
Additionally, we have included Mark Hasegawa-Johnson's HTK read and write functions, which are used to handle the MFCC files.
We use HTK (HCopy) to compute the features (MFCC, logmel).
We use signal processing functions from LabROSA.
We use the BSS Eval toolbox, versions 2.0 and 3.0, for evaluation.
We use MIR-1K for the singing voice separation task.
We use TSP for the speech separation task.
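As an illustration of the BSS Eval evaluation mentioned above, here is a minimal sketch that scores two estimated sources against their references with `bss_eval_sources`. The signals are synthetic placeholders invented for this example, not data from the experiments, and the call is skipped when the toolbox is not on the MATLAB path.

```matlab
% Sketch: compute SDR/SIR/SAR for two estimated sources with BSS Eval.
% The signals below are synthetic placeholders, not experiment data.
fs = 16000;                                  % sample rate in Hz (illustrative)
t  = (0:fs-1) / fs;                          % one second of samples
s  = [sin(2*pi*440*t); sin(2*pi*660*t)];     % reference sources, one per row

rng(0);                                      % fixed seed for repeatability
se = s + 0.01 * randn(size(s));              % estimates: references plus noise

if exist('bss_eval_sources', 'file') == 2
    % se and s are both (number of sources) x (number of samples).
    [SDR, SIR, SAR, perm] = bss_eval_sources(se, s);
    fprintf('SDR (dB): %s\n', mat2str(SDR', 4));
    fprintf('SIR (dB): %s\n', mat2str(SIR', 4));
    fprintf('SAR (dB): %s\n', mat2str(SAR', 4));
else
    disp('bss_eval_sources is not on the path; add the BSS Eval toolbox first.');
end
```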
To try the code on your own data, follow the mir1k or TSP settings: put your data into codes/mir1k/Wavfile or codes/TSP/Data/ accordingly.
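After copying your data, a quick check like the one below can confirm that MATLAB sees the files. This is a sketch under assumptions: it looks only in codes/mir1k/Wavfile, assumes the data are `.wav` files, and assumes the current folder is the repository root.

```matlab
% Sketch: list the wav files placed in the MIR-1K data folder.
% Assumption: files use the .wav extension and sit directly in wavDir.
wavDir = fullfile('codes', 'mir1k', 'Wavfile');
files  = dir(fullfile(wavDir, '*.wav'));

fprintf('%d wav file(s) found in %s\n', numel(files), wavDir);
if ~isempty(files)
    % Inspect the first file to confirm sample rate and channel count.
    info = audioinfo(fullfile(wavDir, files(1).name));
    fprintf('%s: %d Hz, %d channel(s)\n', ...
            files(1).name, info.SampleRate, info.NumChannels);
end
```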
Look at the unit test parameters in codes/mir1k/train_mir1k_demo.m and codes/TSP/train_TSP_demo_mini_clip.m (the latter with minibatch L-BFGS and gradient clipping).
Tune the parameters on the dev set and check the results.
P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2136–2147, Dec. 2015.
P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks," in International Society for Music Information Retrieval Conference (ISMIR), 2014.
P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Deep Learning for Monaural Speech Separation," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
The code was tested with MATLAB R2015a.