This section follows the official Kaldi guide.
Kaldi does not implement GOP-GMM, because GOP-NN performs much better than GOP-GMM.
where $s$ is the senone label and $\{s \mid s \in p\}$ is the set of states belonging to those triphones whose current phone is $p$.
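The role of the set $\{s \mid s \in p\}$ can be illustrated with a minimal Python sketch. It assumes the phone posterior of a frame is the sum of the senone posteriors over the states belonging to $p$, and that the log phone posterior of a segment is the average of the per-frame logs. The names `senone_to_phone`, `phone_posterior`, and `log_phone_posterior`, as well as the toy numbers, are hypothetical and are not part of Kaldi.

```python
import math

def phone_posterior(senone_post, senone_to_phone, phone):
    """Sum one frame's senone posteriors over all senones mapped to `phone`."""
    return sum(p for s, p in enumerate(senone_post) if senone_to_phone[s] == phone)

def log_phone_posterior(frames, senone_to_phone, phone):
    """Average the per-frame log phone posterior over the frames of a segment."""
    logs = [math.log(phone_posterior(f, senone_to_phone, phone)) for f in frames]
    return sum(logs) / len(logs)

# Toy setup (assumption): 4 senones, the first two belong to phone 1, the rest to phone 2.
senone_to_phone = [1, 1, 2, 2]
# Two frames of senone posteriors, each summing to 1.
frames = [
    [0.5, 0.25, 0.125, 0.125],
    [0.25, 0.25, 0.25, 0.25],
]

lpp_1 = log_phone_posterior(frames, senone_to_phone, 1)
lpp_2 = log_phone_posterior(frames, senone_to_phone, 2)
print(f"LPP(phone 1) = {lpp_1:.4f}, LPP(phone 2) = {lpp_2:.4f}")
```

In this toy example phone 1 collects more posterior mass in the first frame, so its averaged log posterior is higher than that of phone 2.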
Mind the Python version: Python 2.7 is used by default, but a commonly used Python 3 version can be specified instead.
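The system's default gcc version may be unsupported. In that case, build with an explicitly chosen gcc version, for example: `CXX=g++-6 extras/check_dependencies.sh`, then `make depend CC=gcc-6 CPP=g++-6 CXX=g++-6 LD=g++-6`, then `make CC=gcc-6 CPP=g++-6 CXX=g++-6 LD=g++-6 -j 8`.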
To check the installation, run compute-gop --help. If you see the following output, Kaldi was installed successfully. Congratulations!
$ compute-gop --help
Compute Goodness Of Pronunciation (GOP) from a matrix of probabilities (e.g. from nnet3-compute).
Usage: compute-gop [options] <model> <alignments-rspecifier> <prob-matrix-rspecifier> <gop-wspecifier> [<phone-feature-wspecifier>]
e.g.:
nnet3-compute [args] | compute-gop 1.mdl ark:ali-phone.1 ark:- ark:gop.1 ark:phone-feat.1
Options:
--log-applied : If true, assume the input probabilities have been applied log. (bool, default = true)
--phone-map : File name containing old->new phone mapping (each line is: old-integer-id new-integer-id) (string, default = "")
--skip_phones_string : Do not write features and gops for those phones (string, default = "0")
Standard options:
--config : Configuration file to read (this option may be repeated) (string, default = "")
--help : Print out usage message (bool, default = false)
--print-args : Print the command line arguments (to stderr) (bool, default = true)
--verbose : Verbose level (higher->more logging) (int, default = 0)
pip install kaldi-io==0.9.4 kaldiio==2.17.2 imblearn -i https://pypi.tuna.tsinghua.edu.cn/simple
```
- Test GOP score computation with the open-source speechocean762 dataset
```bash
# Enter the working directory
$ cd $KALDI_ROOT/egs/gop_speechocean762/s5/
# Replace the stand-in commands with the real ones
$ rm -f steps
$ cp -r ../../wsj/s5/steps .
$ rm -f utils
$ cp -r ../../wsj/s5/utils .
$ rm -f local/feat_to_score_train.py
$ cp local/tuning/feat_to_score_train_1c.py local/feat_to_score_train.py
$ rm -f utils/run.pl
$ cp utils/parallel/run.pl utils/
# Run the GOP computation
$ chmod 777 run.sh
$ ./run.sh
...
steps/align_mapped.sh: done aligning data.
The features are visualized and saved in exp/gop_train/feats.png
MSE: 0.70
Corr: 0.23
              precision    recall  f1-score   support
           0       0.24      0.31      0.27      1412
           1       0.05      0.78      0.09      1860
           2       0.99      0.36      0.53     44097
    accuracy                           0.37     47369
   macro avg       0.43      0.48      0.30     47369
weighted avg       0.93      0.37      0.50     47369
# If you see this output, congratulations: you have completed the experiment.
```
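To make the classification report above easier to read, the following Python sketch recomputes the f1-score column and the two averaged f1 values from the per-class precision, recall, and support. It is only an illustration: the inputs are the rounded figures copied from the report, so the results agree with the report only to two decimals, and the names `rows` and `f1` are hypothetical.

```python
# (precision, recall, support) per class, copied from the report above.
rows = {
    0: (0.24, 0.31, 1412),
    1: (0.05, 0.78, 1860),
    2: (0.99, 0.36, 44097),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1_scores = {label: f1(p, r) for label, (p, r, _) in rows.items()}
total = sum(n for _, _, n in rows.values())

# Macro average: every class counts equally.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
# Weighted average: every class counts in proportion to its support.
weighted_f1 = sum(f1_scores[label] * n for label, (_, _, n) in rows.items()) / total

for label, score in f1_scores.items():
    print(f"class {label}: f1 = {score:.2f}")
print(f"macro avg f1 = {macro_f1:.2f}, weighted avg f1 = {weighted_f1:.2f}")
```

Class 2 accounts for 44097 of the 47369 samples, which is why the weighted average stays close to the class 2 score while the macro average is pulled down by the two small classes.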