Question:

OWLAPI wrongly uses the OBO parser for N-Triples files

严易安
2023-03-14

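We have a wrapper that uses OWLAPI to parse ontologies.

We are trying to parse it here: https://github.com/ncbo/owlapiwrapper/blob/master/src/main/java/org/stanford/ncbo/oapiwrapper/ontologyparser.java#L637

We are facing two cases:

  • When running mvn test: parsing works fine.

    I tried several approaches:

  • Banning the OBO parser. I tried several syntaxes, but none of them worked; the wrapper keeps using the OBO parser: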

    conf.setBannedParsers("org.obolibrary.oboformat.parser.OBOFormatParser");
    conf.setBannedParsers("o.o.oboformat.parser.OBOFormatParser");
    conf.setBannedParsers("OBOFormatParser");
    

  • Avoiding a mix of different owlapi dependencies. As documented in "owlapi: Parser not found if run from Jar", I tried using only owlapi-distribution to avoid any conflicts:

    <dependency>
      <groupId>net.sourceforge.owlapi</groupId>
      <artifactId>owlapi-distribution</artifactId>
      <version>4.3.1</version>
    </dependency>
    
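    • When the application is packaged as a jar (with dependencies included) and the jar is run to parse an N-Triples file, it fails with an org.semanticweb.owlapi.rdf.turtle.parser.ParseException reporting that it encountered _:genid1.

    The triple causing the problem is: _:genid1.

    • When the exact same parser is run on the exact same file through the Maven tests (the parser is called from a JUnit test, and we run the tests with mvn test), parsing goes well and the information carried by the _:genid1 node is extracted successfully.

    In the first case, OWLAPI seems unable to parse blank nodes. Before calling loadOntologyFromOntologyDocument, I printed the OWLAPI version using VersionInfo.getVersionInfo():

    • For the jar version (the one causing the problem): OWL API (version 4.3.1)
    • For the test version (working): OWL API (version 4.3.1.2017-03-27T22:32:37Z)

    Update 2:

    It seems the problem comes from the jar build.

    When the jar is built, some dependencies are overwritten, so the configuration file does not list all the parsers: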

    org.semanticweb.owlapi.rio.RioFunctionalSyntaxParserFactory
    org.semanticweb.owlapi.rio.RioManchesterSyntaxParserFactory
    org.semanticweb.owlapi.rio.RioOWLXMLParserFactory
    org.semanticweb.owlapi.rio.RioFunctionalSyntaxParserFactory
    org.semanticweb.owlapi.rio.RioManchesterSyntaxParserFactory
    org.semanticweb.owlapi.rio.RioOWLXMLParserFactory
    

    For the N-Triples file with blank nodes, we get:

    The following parsers were tried:
    1) org.semanticweb.owlapi.rdf.rdfxml.parser.RDFXMLParser@a4add54
    2) org.semanticweb.owlapi.owlxml.parser.OWLXMLParser@71454b9d
    3) org.semanticweb.owlapi.functional.parser.OWLFunctionalSyntaxOWLParser@67304a40
    4) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory@95fd655c
    5) org.semanticweb.owlapi.manchestersyntax.parser.ManchesterOWLSyntaxOntologyParser@61c9c3fd
    6) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.NQuadsDocumentFormatFactory@6f9c39ad
    7) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFJsonDocumentFormatFactory@cd748dc3
    8) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.NTriplesDocumentFormatFactory@937ecd36
    9) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.TrigDocumentFormatFactory@27e81c
    10) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.BinaryRDFDocumentFormatFactory@3bf24493
    11) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFJsonLDDocumentFormatFactory@dcacc47d
    12) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.N3DocumentFormatFactory@9a5
    13) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioRDFXMLDocumentFormatFactory@69b9a3bc
    14) org.semanticweb.owlapi.rio.RioTrixParserFactory$TrixParserImpl : org.semanticweb.owlapi.formats.TrixDocumentFormatFactory@27e82d
    15) org.semanticweb.owlapi.rdf.turtle.parser.TurtleOntologyParser@463b4ac8
    16) org.semanticweb.owlapi.krss2.parser.KRSS2OWLParser@11981797
    17) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFaDocumentFormatFactory@264e8d
    

    But for RioTurtleDocumentFormat, it says:

    Parser: org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory@95fd655c
        Stack trace:
    org.openrdf.rio.UnsupportedRDFormatException: No parser factory available for RDF format Turtle (mimeTypes=text/turtle, application/x-turtle; ext=ttl)        
    org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:207)
    

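    So it seems that RioTurtleDocumentFormatFactory is not properly included in the JAR.

    I also tried packaging the jar with maven-shade-plugin and got the same error.

    After banning the OBO parsers, the log says it tried to parse the file with these parsers: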

    The following parsers were tried:
    1) org.semanticweb.owlapi.rdf.rdfxml.parser.RDFXMLParser@2bb3058
    2) org.semanticweb.owlapi.owlxml.parser.OWLXMLParser@6bbe2511
    3) org.semanticweb.owlapi.functional.parser.OWLFunctionalSyntaxOWLParser@93cf163
    4) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory@95fd655c
    5) org.semanticweb.owlapi.manchestersyntax.parser.ManchesterOWLSyntaxOntologyParser@3d97a632
    6) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.NQuadsDocumentFormatFactory@6f9c39ad
    7) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFJsonDocumentFormatFactory@cd748dc3
    8) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.NTriplesDocumentFormatFactory@937ecd36
    9) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.TrigDocumentFormatFactory@27e81c
    10) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.BinaryRDFDocumentFormatFactory@3bf24493
    11) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFJsonLDDocumentFormatFactory@dcacc47d
    12) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.N3DocumentFormatFactory@9a5
    13) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioRDFXMLDocumentFormatFactory@69b9a3bc
    14) org.semanticweb.owlapi.rio.RioTrixParserFactory$TrixParserImpl : org.semanticweb.owlapi.formats.TrixDocumentFormatFactory@27e82d
    15) org.semanticweb.owlapi.rdf.turtle.parser.TurtleOntologyParser@784b990c
    16) org.semanticweb.owlapi.krss2.parser.KRSS2OWLParser@13f17eb4
    17) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFaDocumentFormatFactory@264e8d
    

    Here is the error log for RioTurtleDocumentFormatFactory:

    --------------------------------------------------------------------------------
    Parser: org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory@95fd655c
        Stack trace:
    org.openrdf.rio.UnsupportedRDFormatException: No parser factory available for RDF format Turtle (mimeTypes=text/turtle, application/x-turtle; ext=ttl)
            org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:207)
            uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:197)
            uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.actualParse(OWLOntologyManagerImpl.java:1156)
            uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1112)
            uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:1068)
            org.stanford.ncbo.oapiwrapper.OntologyParser.findMasterFile(OntologyParser.java:708)
            org.stanford.ncbo.oapiwrapper.OntologyParser.internalParse(OntologyParser.java:651)
            org.stanford.ncbo.oapiwrapper.OntologyParser.parse(OntologyParser.java:630)
            org.stanford.ncbo.oapiwrapper.OntologyParserCommand.main(OntologyParserCommand.java:51)
    No parser factory available for RDF format Turtle (mimeTypes=text/turtle, application/x-turtle; ext=ttl)
            org.openrdf.rio.Rio.createParser(Rio.java:198)
            org.semanticweb.owlapi.rio.RioParserImpl.parseDocumentSource(RioParserImpl.java:241)
            org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:191)
            uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:197)
            uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.actualParse(OWLOntologyManagerImpl.java:1156)
            uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1112)
            uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:1068)
            org.stanford.ncbo.oapiwrapper.OntologyParser.findMasterFile(OntologyParser.java:708)
            org.stanford.ncbo.oapiwrapper.OntologyParser.internalParse(OntologyParser.java:651)
            org.stanford.ncbo.oapiwrapper.OntologyParser.parse(OntologyParser.java:630)
    
    RioTurtleDocumentFormat.class
    RioTurtleDocumentFormatFactory.class
    RioTurtleParserFactory.class
    RioTurtleStorerFactory.class
    
    META-INF/services/org.openrdf.rio.RDFParserFactory
    META-INF/services/org.semanticweb.owlapi.io.LegacyOWLParserFactory
    META-INF/services/org.semanticweb.owlapi.model.OWLOntologyManagerFactory
    META-INF/services/org.semanticweb.owlapi.io.OWLParserFactory
    META-INF/services/org.semanticweb.owlapi.model.OWLStorerFactory
    META-INF/services/org.semanticweb.owlapi.model.OWLDocumentFormatFactory
    META-INF/services/org.openrdf.rio.LanguageHandler
    META-INF/services/org.openrdf.rio.DatatypeHandler
    META-INF/services/org.openrdf.rio.RDFWriterFactory
    META-INF/services/com.fasterxml.jackson.core.JsonFactory
    META-INF/services/com.fasterxml.jackson.core.ObjectCodec
    META-INF/services/org.apache.commons.logging.LogFactory
    META-INF/services/javax.servlet.ServletContainerInitializer
    
    org.semanticweb.owlapi.rio.RioFunctionalSyntaxParserFactory
    org.semanticweb.owlapi.rio.RioManchesterSyntaxParserFactory
    org.semanticweb.owlapi.rio.RioOWLXMLParserFactory
    org.semanticweb.owlapi.rio.RioFunctionalSyntaxParserFactory
    org.semanticweb.owlapi.rio.RioManchesterSyntaxParserFactory
    org.semanticweb.owlapi.rio.RioOWLXMLParserFactory
    
    org.semanticweb.owlapi.formats.BinaryRDFDocumentFormatFactory
    org.semanticweb.owlapi.formats.N3DocumentFormatFactory
    org.semanticweb.owlapi.formats.NQuadsDocumentFormatFactory
    org.semanticweb.owlapi.formats.NTriplesDocumentFormatFactory
    org.semanticweb.owlapi.formats.RDFaDocumentFormatFactory
    org.semanticweb.owlapi.formats.RDFJsonLDDocumentFormatFactory
    org.semanticweb.owlapi.formats.RDFJsonDocumentFormatFactory
    org.semanticweb.owlapi.formats.RioRDFXMLDocumentFormatFactory
    org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory
    org.semanticweb.owlapi.formats.TrigDocumentFormatFactory
    org.semanticweb.owlapi.formats.TrixDocumentFormatFactory
    org.semanticweb.owlapi.formats.BinaryRDFDocumentFormatFactory
    org.semanticweb.owlapi.formats.N3DocumentFormatFactory
    org.semanticweb.owlapi.formats.NQuadsDocumentFormatFactory
    org.semanticweb.owlapi.formats.NTriplesDocumentFormatFactory
    org.semanticweb.owlapi.formats.RDFaDocumentFormatFactory
    org.semanticweb.owlapi.formats.RDFJsonLDDocumentFormatFactory
    org.semanticweb.owlapi.formats.RDFJsonDocumentFormatFactory
    org.semanticweb.owlapi.formats.RioRDFXMLDocumentFormatFactory
    org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory
    org.semanticweb.owlapi.formats.TrigDocumentFormatFactory
    org.semanticweb.owlapi.formats.TrixDocumentFormatFactory
    

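    So org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory is actually listed in some of the META-INF/services files, and the class is included in the JAR. But it looks like the jar still cannot find it.

    I really don't understand how OWLAPI decides which parsers to use and where it finds them.

    Update 4: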

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.3</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <artifactSet>
              <includes>
                <include>net.sourceforge.owlapi:owlapi-api</include>
                <include>net.sourceforge.owlapi:owlapi-apibinding</include>
                <include>net.sourceforge.owlapi:owlapi-fixers</include>
                <include>net.sourceforge.owlapi:owlapi-impl</include>
                <include>net.sourceforge.owlapi:owlapi-oboformat</include>
                <include>net.sourceforge.owlapi:owlapi-parsers</include>
                <include>net.sourceforge.owlapi:owlapi-rio</include>
                <include>net.sourceforge.owlapi:owlapi-tools</include>
                <include>commons-cli:*</include>
                <include>commons-io:*</include>
                <include>org.slf4j:*</include>
                <include>net.sourceforge.owlapi:owlapi-osgidistribution</include>
                <include>com.google.inject:*</include>
                <include>javax.inject:*</include>
                <include>com.google.*</include>
                <include>aopalliance:*</include>
                <include>org.openrdf.sesame:*</include>
                <include>org.tukaani:*</include>
                <include>net.sf.trove4j:*</include>
                <include>org.apache.commons:commons-csv</include>
              </includes>
            </artifactSet>
            <transformers>
              <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                <mainClass>org.stanford.ncbo.oapiwrapper.OntologyParserCommand</mainClass>
              </transformer>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>
    

    But it does not change anything in the META-INF/services/org.openrdf.rio.RDFParserFactory file inside the jar.

    This is probably because I need to include net.sourceforge.owlapi:owlapi-osgidistribution, which overwrites the RDFParserFactory file. But without including it, I get a java.lang.NoClassDefFoundError: org/semanticweb/owlapi/model/OWLAnnotationValue

  • 1 answer

    公羊兴文
    2023-03-14

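    Several problems are conspiring here:

    • OWLOntologyLoaderConfiguration is an immutable class. Its setters return a new, modified object rather than changing the object they are called on.
    • There are two OBO parsers.

    To fix this, use: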

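    // The setter returns a new configuration, so reassign the result.
    // Both OBO parser factories are listed, separated by a space.
    conf = conf.setBannedParsers(
        "org.coode.owlapi.obo12.parser.OBO12ParserFactory org.semanticweb.owlapi.oboformat.OBOFormatOWLAPIParserFactory");
    
    // The same ban expressed through the manager's ontology configurator.
    manager.getOntologyConfigurator().withBannedParsers("...");
    
    // Or state the expected format on the document source.
    OWLOntologyDocumentSource source = 
        new FileDocumentSource(fileName, new NTriplesDocumentFormat());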
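
    The first point above is easy to miss, so here is a minimal, self-contained sketch of the pattern. LoaderConfig is a hypothetical stand-in written only for this illustration, not an OWLAPI class; the only thing it is assumed to share with the real configuration class is that the setter takes a space-separated list of names and returns a modified copy.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for an immutable configuration class (NOT the OWLAPI type).
final class LoaderConfig {
    private final Set<String> bannedParsers;

    LoaderConfig() {
        this(Collections.<String>emptySet());
    }

    private LoaderConfig(Set<String> bannedParsers) {
        this.bannedParsers =
            Collections.unmodifiableSet(new LinkedHashSet<String>(bannedParsers));
    }

    // Takes a space-separated list of class names and returns a NEW object;
    // the object the method is called on is left untouched.
    LoaderConfig setBannedParsers(String spaceSeparatedNames) {
        String trimmed = spaceSeparatedNames.trim();
        if (trimmed.isEmpty()) {
            return new LoaderConfig();
        }
        return new LoaderConfig(
            new LinkedHashSet<String>(Arrays.asList(trimmed.split("\\s+"))));
    }

    Set<String> getBannedParsers() {
        return bannedParsers;
    }
}

class ImmutableSetterDemo {
    public static void main(String[] args) {
        LoaderConfig conf = new LoaderConfig();

        // Result discarded: conf still bans nothing.
        conf.setBannedParsers("a.FirstParserFactory b.SecondParserFactory");
        System.out.println(conf.getBannedParsers().size()); // prints 0

        // Result reassigned: conf now carries both names.
        conf = conf.setBannedParsers("a.FirstParserFactory b.SecondParserFactory");
        System.out.println(conf.getBannedParsers().size()); // prints 2
    }
}
```

    The three calls in the question all follow the first form, so the ban would be lost even with the right class names.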
    

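    Second update: the POM pulls in multiple JARs that each contain the parser lists. Try replacing the first three dependencies below with the last one (owlapi-osgidistribution):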

    <dependency>
      <groupId>net.sourceforge.owlapi</groupId>
      <artifactId>owlapi-distribution</artifactId>
      <version>4.3.1</version>
    </dependency>
    
    <dependency>
      <groupId>net.sourceforge.owlapi</groupId>
      <artifactId>owlapi-rio</artifactId>
      <version>4.3.1</version>
    </dependency>
    
    <dependency>
      <groupId>net.sourceforge.owlapi</groupId>
      <artifactId>owlapi-compatibility</artifactId>
      <version>4.3.1</version>
    </dependency>
    

    <dependency>
      <groupId>net.sourceforge.owlapi</groupId>
      <artifactId>owlapi-osgidistribution</artifactId>
      <version>4.3.1</version>
    </dependency>
    

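    Update: I have checked out the project you use to reproduce the problem. To debug it, I tried adding the OWLAPI dependencies one at a time until the error stopped happening. While doing so, I examined the contents of the following files:

    The problem is that, when owlapi-distribution and its dependencies are repackaged, the files under META-INF/services are not merged as expected.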
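
    One way to check this on your side is to print what the running classpath actually exposes for a given service file. The sketch below uses only the JDK. The class name ServiceFileDump is made up for this example, and the default service name is the one from the jar listing in the question. Run it once with the repackaged jar on the classpath and once on the Maven test classpath, then compare the output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ServiceFileDump {

    // Drops "#" comments and blank lines, which the service file format allows.
    static List<String> parseEntries(List<String> lines) {
        List<String> entries = new ArrayList<String>();
        for (String line : lines) {
            int hash = line.indexOf('#');
            String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!name.isEmpty()) {
                entries.add(name);
            }
        }
        return entries;
    }

    // Reads one copy of a service file into a list of raw lines.
    static List<String> readLines(URL url) throws IOException {
        List<String> lines = new ArrayList<String>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String service = args.length > 0 ? args[0] : "org.openrdf.rio.RDFParserFactory";
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        // Every copy of the file visible on the classpath (one per jar that ships it).
        List<URL> copies = Collections.list(cl.getResources("META-INF/services/" + service));
        System.out.println(copies.size() + " copy/copies of " + service);
        for (URL url : copies) {
            System.out.println(url);
            for (String entry : parseEntries(readLines(url))) {
                System.out.println("  " + entry);
            }
        }
    }
}
```

    A single jar can hold only one file per path, so if the list printed for the repackaged jar is shorter than the combined lists printed on the test classpath, entries were lost during packaging.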

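    You should be able to fix this by using the shade plugin for your repackaging. As an example, I am pasting below what owlapi-distribution does. You will need to change the exclusion list, since you probably do not want to exclude any dependencies.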

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <artifactSet>
                                <includes>
                                    <include>net.sourceforge.owlapi:owlapi-api</include>
                                    <include>net.sourceforge.owlapi:owlapi-apibinding</include>
                                    <include>net.sourceforge.owlapi:owlapi-fixers</include>
                                    <include>net.sourceforge.owlapi:owlapi-impl</include>
                                    <include>net.sourceforge.owlapi:owlapi-oboformat</include>
                                    <include>net.sourceforge.owlapi:owlapi-parsers</include>
                                    <include>net.sourceforge.owlapi:owlapi-rio</include>
                                    <include>net.sourceforge.owlapi:owlapi-tools</include>
                                </includes>
                                <excludes>
                                    <exclude>org.apache.felix:org.osgi.core</exclude>
                                    <exclude>org.openrdf.sesame:*</exclude>
                                    <exclude>com.fasterxml.jackson.core:*</exclude>
                                    <exclude>com.github.jsonld-java:*</exclude>
                                    <exclude>com.fasterxml.jackson.core:*</exclude>
                                    <exclude>org.apache.httpcomponents:*</exclude>
                                    <exclude>commons-codec:commons-codec:*</exclude>
                                    <exclude>org.slf4j:*</exclude>
                                    <exclude>org.semarglproject:*</exclude>
                                    <exclude>com.google.guava:*</exclude>
                                    <exclude>com.google.inject:*</exclude>
                                    <exclude>javax.inject:*</exclude>
                                    <exclude>aopalliance:*</exclude>
                                    <exclude>com.google.inject.extensions:*</exclude>
                                    <exclude>com.google.code.findbugs:*</exclude>
                                    <exclude>org.slf4j:slf4j-api</exclude>
                                    <exclude>commons-io:*</exclude>
                                    <exclude>org.tukaani:*</exclude>
                                    <exclude>net.sf.trove4j:*</exclude>
                                </excludes>
                            </artifactSet>
                            <transformers>
                                <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
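
    To make the difference concrete, here is a small JDK-only sketch of the two packaging behaviours for a same-named service file: keeping just one jar's copy versus appending the entries of every copy, which is the idea behind ServicesResourceTransformer. The class ServiceMergeSketch, the jar names, and the example.* factory names are invented for the illustration; this is not the plugin's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ServiceMergeSketch {

    // No merging: only one jar's copy of the file ends up in the output.
    // Which copy survives depends on the packaging tool; the first one is kept here.
    static List<String> keepSingleCopy(Map<String, List<String>> fileByJar) {
        Iterator<List<String>> it = fileByJar.values().iterator();
        if (it.hasNext()) {
            return new ArrayList<String>(it.next());
        }
        return new ArrayList<String>();
    }

    // Merging: the entries of every copy are appended into one file.
    static List<String> mergeAll(Map<String, List<String>> fileByJar) {
        List<String> merged = new ArrayList<String>();
        for (List<String> entries : fileByJar.values()) {
            merged.addAll(entries);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two hypothetical jars shipping the same META-INF/services file.
        Map<String, List<String>> fileByJar = new LinkedHashMap<String, List<String>>();
        fileByJar.put("first.jar", Arrays.asList("example.FunctionalParserFactory"));
        fileByJar.put("second.jar", Arrays.asList("example.TurtleParserFactory"));

        // prints [example.FunctionalParserFactory]
        System.out.println(keepSingleCopy(fileByJar));
        // prints [example.FunctionalParserFactory, example.TurtleParserFactory]
        System.out.println(mergeAll(fileByJar));
    }
}
```

    With a single surviving copy, the Turtle factory in this toy example is never registered, which matches the "No parser factory available" symptom in the question.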
    