
Installing pyrouge and rouge on Linux, and comparing their results

卢俊发
2023-12-01

诸神缄默不语: personal CSDN blog index

The pyrouge installed here is this one: pyrouge · PyPI, that is, this project: bheinzerling/pyrouge: A Python wrapper for the ROUGE summarization evaluation package
I will fill in the details later; for now, here is the main content.

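In short, pyrouge is a hassle both to install and to run. The rouge package (pltrdy/rouge: A full Python Implementation of the ROUGE Metric (not a wrapper)) is the easier choice.
Both ways of installing the rouge package are simple. From source: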

git clone https://github.com/pltrdy/rouge
cd rouge
python setup.py install

Or install it directly with pip: pip install rouge

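The rest of this part covers installing pyrouge: first install ROUGE-1.5.5, then install the pyrouge package.
(Absolute paths are recommended for every path below.)
The ROUGE-1.5.5.tgz file comes from https://pan.baidu.com/s/1qXQpBp6 (via the post "ROUGE installation and configuration: finally installed ROUGE on two Linux machines and one Mac, with every problem solved", qingjuanzhao's blog, CSDN), because the files in andersjo/pyrouge: An interface to and, in time, a Python reimplementation of the ROUGE package for evaluating summarization are incomplete. Alternatively, use the resources/ROUGE/RELEASE-1.5.5 folder from fastSum (a text summarization project containing multiple models and datasets).
Administrator (sudo) privileges are required; without them you will have to find a workaround.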

cpan -v
sudo cpan install XML::DOM
perl runROUGE-test.pl    # run inside the RELEASE-1.5.5 folder; the script ships with the files above
pip install pyrouge
pyrouge_set_rouge_path /absolute/path/to/RELEASE-1.5.5
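Before running the ROUGE test script, it can help to confirm that Perl can actually load XML::DOM. The sketch below is only an illustration: perl_module_available is a hypothetical helper (not part of pyrouge or ROUGE-1.5.5), and it simply asks Perl to load the module and checks the exit code.

```python
import shutil
import subprocess


def perl_module_available(module):
    """Return True if `perl -M<module> -e 1` exits successfully.

    Hypothetical helper for illustration; returns False when perl itself
    is not on PATH.
    """
    perl = shutil.which("perl")
    if perl is None:
        return False
    result = subprocess.run([perl, f"-M{module}", "-e", "1"], capture_output=True)
    return result.returncode == 0


print("XML::DOM available:", perl_module_available("XML::DOM"))
```

If this prints False after the cpan step, the module was probably installed for a different Perl than the one on PATH.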

python -m pyrouge.test reports errors, and still does after applying the fixes suggested online. The evaluation code itself runs, though.

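Below is a simple example that compares the results of the two packages.
Any toy input will do (note that Chinese characters will cause an error with pyrouge; this is expected):
Under trys/pyrouge_models (gold summaries):
001_candidate.txt: 0 1 2 3 4
002_candidate.txt: 0 1 2 3 4 5 6 7
Under trys/pyrouge_systems (predicted summaries):
001_reference.txt: 0 1 2 3 4 5
002_reference.txt: 0 1 2 3 4 5
(The file names are the reverse of the usual terminology: the gold files are named _candidate and the predictions _reference. The filename patterns in the code follow these names.)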
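The example files can also be created with a short script. This is a minimal sketch: write_example is a hypothetical helper, and it writes into a temporary directory so that nothing in the working directory is touched; pass 'trys' instead to get the layout used in this post.

```python
import tempfile
from pathlib import Path

# File names and contents copied from the listing above.
EXAMPLE = {
    "pyrouge_models": {   # gold summaries
        "001_candidate.txt": "0 1 2 3 4",
        "002_candidate.txt": "0 1 2 3 4 5 6 7",
    },
    "pyrouge_systems": {  # predicted summaries
        "001_reference.txt": "0 1 2 3 4 5",
        "002_reference.txt": "0 1 2 3 4 5",
    },
}


def write_example(base_dir):
    """Create both folders under base_dir and return the paths written."""
    base = Path(base_dir)
    written = []
    for folder, files in EXAMPLE.items():
        (base / folder).mkdir(parents=True, exist_ok=True)
        for name, text in files.items():
            path = base / folder / name
            path.write_text(text + "\n", encoding="utf-8")
            written.append(path)
    return written


demo_dir = Path(tempfile.mkdtemp()) / "trys"
for path in write_example(demo_dir):
    print(path)
```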
Then run the code. The same data is also scored with the rouge package (pltrdy/rouge: A full Python Implementation of the ROUGE Metric (not a wrapper)) for comparison:

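from pyrouge import Rouge155
from rouge import Rouge

# pyrouge: wrapper around ROUGE-1.5.5
r = Rouge155()
r.system_dir = 'trys/pyrouge_systems'  # predicted summaries
r.model_dir = 'trys/pyrouge_models'    # gold summaries
# Same patterns as in the run that produced the output below.
# Caution: pyrouge expects an #ID# placeholder in the model pattern; without it,
# this regex matches every model file for every system file.
r.model_filename_pattern = r'(\d+)_candidate.txt'
r.system_filename_pattern = r'(\d+)_reference.txt'

output = r.convert_and_evaluate()
print(output)
output_dict = r.output_to_dict(output)

# rouge: pure-Python implementation
rouge = Rouge()
# Gold summaries. rouge also handles Chinese: refs=['我 不 是 黄 蓉','红 橙 黄 绿 青 蓝 紫 。'] works with the same code.
refs = ['0 1 2 3 4', '0 1 2 3 4 5 6 7']
# Predictions. Likewise: hyps=['我 不 是 黄 蓉 啊','红 橙 黄 绿 青 蓝']
hyps = ['0 1 2 3 4 5', '0 1 2 3 4 5']
scores = rouge.get_scores(hyps, refs, avg=True)
print(scores)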

Output:

2022-05-14 09:41:13,118 [MainThread  ] [INFO ]  Writing summaries.
2022-05-14 09:41:13,118 [MainThread  ] [INFO ]  Processing summaries. Saving system files to /tmp/tmpq7ugz254/system and model files to /tmp/tmpq7ugz254/model.
2022-05-14 09:41:13,118 [MainThread  ] [INFO ]  Processing files in trys/pyrouge_systems.
2022-05-14 09:41:13,118 [MainThread  ] [INFO ]  Processing 001_reference.txt.
2022-05-14 09:41:13,118 [MainThread  ] [INFO ]  Processing 002_reference.txt.
2022-05-14 09:41:13,119 [MainThread  ] [INFO ]  Saved processed files to /tmp/tmpq7ugz254/system.
2022-05-14 09:41:13,119 [MainThread  ] [INFO ]  Processing files in trys/pyrouge_models.
2022-05-14 09:41:13,119 [MainThread  ] [INFO ]  Processing 001_candidate.txt.
2022-05-14 09:41:13,119 [MainThread  ] [INFO ]  Processing 002_candidate.txt.
2022-05-14 09:41:13,119 [MainThread  ] [INFO ]  Saved processed files to /tmp/tmpq7ugz254/model.
2022-05-14 09:41:13,119 [MainThread  ] [INFO ]  Written ROUGE configuration to /tmp/tmp49jc1wnw/rouge_conf.xml
2022-05-14 09:41:13,119 [MainThread  ] [INFO ]  Running ROUGE with command fastSum/fastSum/resources/ROUGE/RELEASE-1.5.5/ROUGE-1.5.5.pl -e astSum/fastSum/resources/ROUGE/RELEASE-1.5.5/data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -a -m /tmp/tmp49jc1wnw/rouge_conf.xml
---------------------------------------------
1 ROUGE-1 Average_R: 0.84615 (95%-conf.int. 0.84615 - 0.84615)
1 ROUGE-1 Average_P: 0.91667 (95%-conf.int. 0.91667 - 0.91667)
1 ROUGE-1 Average_F: 0.88000 (95%-conf.int. 0.88000 - 0.88000)
---------------------------------------------
1 ROUGE-2 Average_R: 0.81818 (95%-conf.int. 0.81818 - 0.81818)
1 ROUGE-2 Average_P: 0.90000 (95%-conf.int. 0.90000 - 0.90000)
1 ROUGE-2 Average_F: 0.85714 (95%-conf.int. 0.85714 - 0.85714)
---------------------------------------------
1 ROUGE-3 Average_R: 0.77778 (95%-conf.int. 0.77778 - 0.77778)
1 ROUGE-3 Average_P: 0.87500 (95%-conf.int. 0.87500 - 0.87500)
1 ROUGE-3 Average_F: 0.82353 (95%-conf.int. 0.82353 - 0.82353)
---------------------------------------------
1 ROUGE-4 Average_R: 0.71429 (95%-conf.int. 0.71429 - 0.71429)
1 ROUGE-4 Average_P: 0.83333 (95%-conf.int. 0.83333 - 0.83333)
1 ROUGE-4 Average_F: 0.76923 (95%-conf.int. 0.76923 - 0.76923)
---------------------------------------------
1 ROUGE-L Average_R: 0.84615 (95%-conf.int. 0.84615 - 0.84615)
1 ROUGE-L Average_P: 0.91667 (95%-conf.int. 0.91667 - 0.91667)
1 ROUGE-L Average_F: 0.88000 (95%-conf.int. 0.88000 - 0.88000)
---------------------------------------------
1 ROUGE-W-1.2 Average_R: 0.57431 (95%-conf.int. 0.57431 - 0.57431)
1 ROUGE-W-1.2 Average_P: 0.91742 (95%-conf.int. 0.91742 - 0.91742)
1 ROUGE-W-1.2 Average_F: 0.70641 (95%-conf.int. 0.70641 - 0.70641)
---------------------------------------------
1 ROUGE-S* Average_R: 0.65789 (95%-conf.int. 0.65789 - 0.65789)
1 ROUGE-S* Average_P: 0.83333 (95%-conf.int. 0.83333 - 0.83333)
1 ROUGE-S* Average_F: 0.73529 (95%-conf.int. 0.73529 - 0.73529)
---------------------------------------------
1 ROUGE-SU* Average_R: 0.69388 (95%-conf.int. 0.69388 - 0.69388)
1 ROUGE-SU* Average_P: 0.85000 (95%-conf.int. 0.85000 - 0.85000)
1 ROUGE-SU* Average_F: 0.76405 (95%-conf.int. 0.76405 - 0.76405)

{'rouge-1': {'r': 0.875, 'p': 0.9166666666666667, 'f': 0.8831168781885648}, 'rouge-2': {'r': 0.8571428571428572, 'p': 0.9, 'f': 0.8611111062114198}, 'rouge-l': {'r': 0.875, 'p': 0.9166666666666667, 'f': 0.8831168781885648}}
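To put the two outputs side by side, the ROUGE-1.5.5 text report can be parsed into a dict shaped like the one the rouge package prints. The sketch below is hand-rolled for illustration: parse_rouge155 is a hypothetical helper, not a pyrouge function (pyrouge has its own output_to_dict), and the sample lines are copied from the log above.

```python
import re

# Matches lines such as:
# "1 ROUGE-1 Average_R: 0.84615 (95%-conf.int. 0.84615 - 0.84615)"
LINE = re.compile(r'^\S+\s+(ROUGE-\S+)\s+Average_([RPF]):\s+([0-9.]+)')


def parse_rouge155(text):
    """Return {'rouge-1': {'r': ..., 'p': ..., 'f': ...}, ...} from a text report."""
    scores = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            metric, kind, value = match.groups()
            scores.setdefault(metric.lower(), {})[kind.lower()] = float(value)
    return scores


sample = """
---------------------------------------------
1 ROUGE-1 Average_R: 0.84615 (95%-conf.int. 0.84615 - 0.84615)
1 ROUGE-1 Average_P: 0.91667 (95%-conf.int. 0.91667 - 0.91667)
1 ROUGE-1 Average_F: 0.88000 (95%-conf.int. 0.88000 - 0.88000)
---------------------------------------------
1 ROUGE-2 Average_R: 0.81818 (95%-conf.int. 0.81818 - 0.81818)
1 ROUGE-2 Average_P: 0.90000 (95%-conf.int. 0.90000 - 0.90000)
1 ROUGE-2 Average_F: 0.85714 (95%-conf.int. 0.85714 - 0.85714)
"""

# Values printed by the rouge package in the run above.
rouge_pkg = {'rouge-1': {'r': 0.875, 'p': 0.9166666666666667, 'f': 0.8831168781885648}}

parsed = parse_rouge155(sample)
for key in ('r', 'p', 'f'):
    diff = rouge_pkg['rouge-1'][key] - parsed['rouge-1'][key]
    print('rouge-1', key, 'rouge minus pyrouge:', round(diff, 5))
```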

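The two results differ noticeably, which puzzled me at first. Scoring each prediction against its own gold summary and averaging over the two pairs, rouge-1-r should be (5/5+6/8)/2=(1+0.75)/2=0.875, rouge-1-p should be (5/6+6/6)/2=(0.833+1)/2=0.917, and rouge-1-f is then (2*1*0.833/(1+0.833)+2*0.75*1/(0.75+1))/2=0.883, so the rouge package agrees with the hand calculation.
The others: rouge-2-r=(1+5/7)/2=0.857, rouge-2-p=(4/5+1)/2=(0.8+1)/2=0.9, rouge-2-f=(2*1*0.8/(1+0.8)+(2*5/7*1)/(5/7+1))/2=0.861
rouge-L-r=(1+6/8)/2=(1+0.75)/2=0.875, rouge-L-p=(5/6+1)/2=0.917, rouge-L-f=(2*1*5/6/(1+5/6)+2*0.75*1/(0.75+1))/2=0.883
So the rouge package is correct for this one-to-one pairing. The pyrouge numbers are most likely not an algorithm error in ROUGE-1.5.5 but a consequence of the filename patterns in the script: model_filename_pattern contains no #ID# placeholder, so it matches both gold files for every prediction, and each prediction is scored against the two gold summaries pooled together. That reproduces the log exactly: ROUGE-1 recall is (5+6)/(5+8)=11/13=0.84615 and precision is (5+6)/(6+6)=11/12=0.91667; ROUGE-2 recall is (4+5)/(4+7)=9/11=0.81818 and precision is 9/10=0.9. Setting r.model_filename_pattern = '#ID#_candidate.txt' should pair the files one-to-one, though the log above was not re-run with that change. Either way, given how painful ROUGE-1.5.5 and its Perl dependencies are to set up, I still recommend the rouge package.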
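The arithmetic can be checked with a few lines of Python. This is a self-contained sketch of plain n-gram overlap, not the code of either package. paired_average scores each prediction against its own gold summary and averages over pairs; pooled_over_refs sums the counts over both gold summaries for one prediction, which is an assumption about what happened in the pyrouge run, inferred from the numbers rather than taken from the ROUGE-1.5.5 source.

```python
from collections import Counter


def ngrams(tokens, n):
    """Counter of n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(hyp, ref, n):
    """Return (clipped hits, n-grams in hyp, n-grams in ref)."""
    h, r = ngrams(hyp, n), ngrams(ref, n)
    return sum((h & r).values()), sum(h.values()), sum(r.values())


def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0


def paired_average(hyps, refs, n):
    """Score hyps[i] against refs[i] only; return averaged (recall, precision, f)."""
    rows = []
    for hyp, ref in zip(hyps, refs):
        hit, hyp_total, ref_total = overlap(hyp, ref, n)
        p, r = hit / hyp_total, hit / ref_total
        rows.append((r, p, f1(p, r)))
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def pooled_over_refs(hyp, refs, n):
    """Score one hypothesis against every reference, summing counts before dividing."""
    hits = hyp_total = ref_total = 0
    for ref in refs:
        hit, h, r = overlap(hyp, ref, n)
        hits, hyp_total, ref_total = hits + hit, hyp_total + h, ref_total + r
    p, r = hits / hyp_total, hits / ref_total
    return r, p, f1(p, r)


refs = ['0 1 2 3 4'.split(), '0 1 2 3 4 5 6 7'.split()]  # gold summaries
hyps = ['0 1 2 3 4 5'.split(), '0 1 2 3 4 5'.split()]    # predictions

for n in (1, 2):
    print(f'rouge-{n} paired (r, p, f):', [round(x, 5) for x in paired_average(hyps, refs, n)])
    print(f'rouge-{n} pooled (r, p, f):', [round(x, 5) for x in pooled_over_refs(hyps[0], refs, n)])
```

The paired numbers line up with the rouge package's output and the pooled numbers with the ROUGE-1.5.5 log, for both unigrams and bigrams.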

Method adapted from: Installing Rouge1.5.5 - Zzxn’s Blog
