WordNet similarity in Java: JAWS, JWNL or Java WN::Similarity?

罗昱

2023-03-14

Question:

I need to use WordNet in a Java-based application. I want to:

search for synsets
find the similarity/relatedness between synsets

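My application uses RDF graphs. I know there are SPARQL endpoints for WordNet, but I think it is better to keep a local copy of the dataset, since it is not too big.

I found the following libraries: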

General library - JAWS http://lyle.smu.edu/~tspell/jaws/index.html
General library - JWNL http://sourceforge.net/projects/jwordnet
Similarity library (Perl) - WordNet::Similarity http://wn-similarity.sourceforge.net/
Java version of WordNet::Similarity http://www.cogs.susx.ac.uk/users/drh21/ (beta)

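What would you suggest for my application?

Is it possible to use the Perl library from a Java application through some kind of binding?

Thanks! 木兰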

Answer:

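I use JAWS for ordinary WordNet tasks because it is easy to use. For similarity metrics, though, I use the library located here. You will also need to download this folder, which contains pre-processed WordNet and corpus data, for it to work. Assuming you place that folder inside another folder named "lib" in your project folder, you can use the code like this: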

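import java.util.Map.Entry;
import java.util.TreeMap;

// word1, word2 and partOfSpeech are Strings defined elsewhere, e.g. "hobby", "gardening", "n"
JWS ws = new JWS("./lib", "3.0"); // data folder and WordNet version
Resnik res = ws.getResnik();
// one Resnik score for every pair of senses of the two words
TreeMap<String, Double> scores1 = res.res(word1, word2, partOfSpeech);
for (Entry<String, Double> e : scores1.entrySet())
    System.out.println(e.getKey() + "\t" + e.getValue());
System.out.println("\nhighest score\t=\t" + res.max(word1, word2, partOfSpeech) + "\n\n\n");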

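This prints something like the following, showing the similarity score for every possible combination of the synsets represented by the two words being compared: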

hobby#n#1,gardening#n#1 2.6043996588901104
hobby#n#2,gardening#n#1 -0.0
hobby#n#3,gardening#n#1 -0.0
highest score   =   2.6043996588901104
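
If you want to know which sense pair produced the highest score, and not just the value, one option is to scan the returned map yourself. The following is only a minimal sketch that does not need JWS on the classpath: the class name BestSensePair and the helper best are made up for illustration, and the map is filled by hand with the scores from the sample output instead of coming from res.res(...).

```java
import java.util.Map;
import java.util.TreeMap;

public class BestSensePair {

    /** Returns the entry with the largest score, or null if the map is empty. */
    static Map.Entry<String, Double> best(Map<String, Double> scores) {
        Map.Entry<String, Double> best = null;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            if (best == null || e.getValue() > best.getValue()) {
                best = e;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hand-filled stand-in for the TreeMap that res.res(...) returns;
        // the values are copied from the sample output.
        Map<String, Double> scores = new TreeMap<>();
        scores.put("hobby#n#1,gardening#n#1", 2.6043996588901104);
        scores.put("hobby#n#2,gardening#n#1", -0.0);
        scores.put("hobby#n#3,gardening#n#1", -0.0);

        Map.Entry<String, Double> top = best(scores);
        System.out.println(top.getKey() + "\t" + top.getValue());
    }
}
```

The key of the winning entry uses the same word#pos#sense format as the printed output, so it can be split on "," and "#" if the sense numbers are needed separately.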

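There are also methods that let you specify the sense of either or both words: res(String word1, int senseNum1, String word2, partOfSpeech), and so on. Unfortunately, the source documentation is not JavaDoc, so you will need to inspect it manually. The source code can be downloaded here.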

The available algorithms are:

JWSRandom(ws.getDictionary(), true, 16.0);//random number for baseline
Resnik res = ws.getResnik();
LeacockAndChodorow lch = ws.getLeacockAndChodorow();
AdaptedLesk adLesk = ws.getAdaptedLesk();
AdaptedLeskTanimoto alt = ws.getAdaptedLeskTanimoto();
AdaptedLeskTanimotoNoHyponyms altnh = ws.getAdaptedLeskTanimotoNoHyponyms();
HirstAndStOnge hso = ws.getHirstAndStOnge();
JiangAndConrath jcn = ws.getJiangAndConrath();
Lin lin = ws.getLin();
WuAndPalmer wup = ws.getWuAndPalmer();
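
To compare several of these measures on the same word pair, it can help to put them behind one small interface and loop over them. This is a rough sketch only: the Measure interface, the compare helper and the stub lambdas with constant scores are all hypothetical and are not part of JWS. With the library available, the stubs might be replaced by method references such as res::max, assuming the other measures expose a max(word1, word2, partOfSpeech) method like the Resnik example does, which you would need to check in the source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MeasureComparison {

    /** Hypothetical adapter: anything that returns one best score for two words. */
    interface Measure {
        double max(String word1, String word2, String partOfSpeech);
    }

    /** Runs every measure on the same word pair and keeps the results in insertion order. */
    static Map<String, Double> compare(Map<String, Measure> measures,
                                       String word1, String word2, String partOfSpeech) {
        Map<String, Double> results = new LinkedHashMap<>();
        for (Map.Entry<String, Measure> e : measures.entrySet()) {
            results.put(e.getKey(), e.getValue().max(word1, word2, partOfSpeech));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Measure> measures = new LinkedHashMap<>();
        // Stubs returning made-up constants so the sketch runs without JWS.
        measures.put("stubA", (w1, w2, pos) -> 1.0);
        measures.put("stubB", (w1, w2, pos) -> 0.25);

        compare(measures, "hobby", "gardening", "n")
                .forEach((name, score) -> System.out.println(name + "\t" + score));
    }
}
```

Keep in mind that the measures use different scales, so the numbers are useful for ranking word pairs within one measure, not for comparing measures against each other directly.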

It also requires the jar file for MIT's JWI.
