Question:

German CoreNLP model defaults to the English models

郜昊苍
2023-03-14

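I start the CoreNLP server for the German language models with the following command. The models are downloaded as a jar on the classpath, but the server does not output German tags or parses; it loads only the English models: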

 java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer   -props ./german.prop

Contents of german.prop:

annotators = tokenize, ssplit, pos, depparse, parse

tokenize.language = de

pos.model = edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger

ner.model = edu/stanford/nlp/models/ner/german.hgc_175m_600.crf.ser.gz
ner.applyNumericClassifiers = false
ner.useSUTime = false

parse.model = edu/stanford/nlp/models/lexparser/germanFactored.ser.gz
depparse.model = edu/stanford/nlp/models/parser/nndep/UD_German.gz

The request, followed by the response and the server log:

wget --post-data ' Meine Mutter ist aus Wuppertal' 'localhost:9000/?properties"="{"tokenize.whitespace":"true","annotators":"tokenize, ssplit, pos, depparse, parse","outputFormat":"text","tokenize.language" :"de" ,
 "pos.model":" edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger",
"depparse.model" : "edu/stanford/nlp/models/parser/nndep/UD_German.gz",
"parse.model" : "edu/stanford/nlp/models/lexparser/germanFactored.ser.gz"

 }' -O -
 {"dep":"dep","governor":4,"governorGloss":"aus","dependent":5,"dependentGloss":"Wuppertal"}],"openie":[{"subject":"Wuppertal","subjectSpan":[4,5],"relation":"is ist aus of","relationSpan":[2,4],"object":"Meine Mutter","objectSpan":[0,2]}],"tokens":[{"index":1,"word":"Meine","originalText":"Meine","lemma":"Meine","characterOffsetBegin":1,"characterOffsetEnd":6,"pos":"NNP","ner":"PERSON","speaker":"PER0","before":" ","after":" "},{"index":2,"word":"Mutter","originalText":"Mutter","lemma":"Mutter","characterOffsetBegin":7,"characterOffsetEnd":13,"pos":"NNP","ner":"PERSON","speaker":"PER0","before":" ","after":" "},{"index":3,"word":"ist","originalText":"ist","lemma":"ist","characterOffsetBegin":14,"characterOffsetEnd":17,"pos":"NN","ner":"O","speaker":"PER0","before":" ","after":" "},{"index":4,"word":"aus","originalText":"aus","lemma":"aus","characterOffsetBegin":18,"characterOffsetEnd":21,"pos":"NN","ner":"O","speaker":"PER0","before":" ","after":" "},{"index":5,"word":"Wuppertal","originalText":"Wuppertal","lemma":"Wuppertal","characterOffsetBegin":22,"characterOffsetEnd":31,"pos":"NNP","ner":"LOCATI100%[==========================================================================>] 2,
pos.model=edu/stanford/nlp/models/pos-tagger/ge...
parse.model=edu/stanford/nlp/models/lexparser/ger...
tokenize.language=de
depparse.model=edu/stanford/nlp/models/parser/nndep/...
annotators=tokenize, ssplit, pos, depparse, parse
Starting server on port 9000 with timeout of 5000 milliseconds.
StanfordCoreNLPServer listening at /0:0:0:0:0:0:0:0:9000
[/203.:61563] API call w/annotators tokenize,ssplit,pos,depparse
Die Katze liegt auf der Matte.
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [1.5 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse
Loading depparse model file: edu/stanford/nlp/models/parser/nndep/english_UD.gz ...
PreComputed 100000, Elapsed Time: 1.396 (s)

1 answer

秦凯旋
2023-03-14

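There were some issues with getting foreign-language support to work on the server.

If you use the latest version available on our GitHub site, it should work.

The GitHub site is here: https://GitHub.com/stanfordnlp/corenlp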

wget --post-data '<sample german text>' 'localhost:9000/?properties={"pipelineLanguage":"german","annotators":"tokenize,ssplit,pos,ner,parse", "parse.model":"edu/stanford/nlp/models/lexparser/germanFactored.ser.gz","tokenize.language":"de","pos.model":"edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger", "ner.model":"edu/stanford/nlp/models/ner/german.hgc_175m_600.crf.ser.gz", "ner.applyNumericClassifiers":"false", "ner.useSUTime":"false"}' -O -
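
For anyone calling the server from a script rather than from wget, here is a minimal Python sketch of the same request. The property values are copied from the command above. The helper names `build_url` and `annotate` and the base URL `http://localhost:9000/` are assumptions made for illustration and are not part of CoreNLP. The sketch serializes the properties as compact JSON and percent-encodes them into the `properties` query parameter, which avoids the shell-quoting problems that hand-written URLs tend to run into.

```python
import json
import urllib.parse
import urllib.request

# Property values copied from the wget command in this answer.
GERMAN_PROPS = {
    "pipelineLanguage": "german",
    "annotators": "tokenize,ssplit,pos,ner,parse",
    "tokenize.language": "de",
    "pos.model": "edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger",
    "ner.model": "edu/stanford/nlp/models/ner/german.hgc_175m_600.crf.ser.gz",
    "ner.applyNumericClassifiers": "false",
    "ner.useSUTime": "false",
    "parse.model": "edu/stanford/nlp/models/lexparser/germanFactored.ser.gz",
}


def build_url(props, base="http://localhost:9000/"):
    """Return the server URL with the properties JSON percent-encoded.

    `base` is an assumed address; adjust it to wherever the server listens.
    """
    # Compact separators keep the JSON free of spaces before encoding.
    payload = json.dumps(props, separators=(",", ":"))
    query = urllib.parse.urlencode({"properties": payload})
    return f"{base}?{query}"


def annotate(text, props=GERMAN_PROPS, base="http://localhost:9000/"):
    """POST the raw text as UTF-8 and return the response body as a string."""
    request = urllib.request.Request(
        build_url(props, base),
        data=text.encode("utf-8"),  # supplying data makes this a POST request
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `annotate("Meine Mutter ist aus Wuppertal")` sends the sample sentence from the question. Checking the server log for German model paths, rather than `english-left3words` or `english_UD`, is a quick way to confirm that the properties were picked up.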

For more information about the server, see: http://stanfordnlp.github.io/corenlp/corenlp-server.html
