
Problems Encountered When Parsing XML with dom4j, and Their Fixes

岳正浩
2023-12-01

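Introduction: dom4j is one of the most popular XML parsing libraries for Java. I ran into several problems while using it and am recording them here for future reference.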

1.  dom4j version and Maven dependency

     <dependency>
         <groupId>dom4j</groupId>
         <artifactId>dom4j</artifactId>
         <version>1.6.1</version>
     </dependency>
2.  Code example
   import java.util.List;
   import org.dom4j.Document;
   import org.dom4j.Element;
   import org.dom4j.io.SAXReader;

   String msg = "<xml><ToUserName>t6504091</ToUserName><FromUserName>dance-note</FromUserName><CreateTime>111</CreateTime><MsgType>text</MsgType><Content>111</Content></xml>";
   SAXReader reader = new SAXReader();
   long startTime = System.currentTimeMillis();

   // As first written: this call fails with the "no protocol" DocumentException shown in section 3.
   Document doc = reader.read(msg);
   // Match elements by local name, so the lookup does not depend on namespaces.
   String xpath = "//*[local-name()='xml']/*[local-name()='MsgType']";

   @SuppressWarnings("unchecked")
   List<Element> msgList = doc.selectNodes(xpath);

   //System.out.println("startTime:" + startTime);

   String msgType = msgList.get(0).getStringValue();

   long endTime = System.currentTimeMillis();
   long lastTime = endTime - startTime;

   System.out.println("MsgType:" + msgType + " takes " + lastTime + " milliseconds");
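
   To show what the XPath expression in the example selects, here is a minimal sketch that evaluates it with the JDK's built-in javax.xml.xpath API instead of dom4j, so it runs with no extra dependency. The class name MsgTypeXPathSketch, the method evaluate and the shorter PLAIN expression are illustrative assumptions and do not appear in the original test.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original test class.
final class MsgTypeXPathSketch {
    // Same expression as in the dom4j example: it compares local names only.
    static final String BY_LOCAL_NAME = "//*[local-name()='xml']/*[local-name()='MsgType']";
    // Shorter form; it selects the same node as long as the message uses no namespaces.
    static final String PLAIN = "/xml/MsgType";

    // Parses the XML text and returns the string value of the first node matched by the expression.
    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("cannot evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String msg = "<xml><ToUserName>t6504091</ToUserName><MsgType>text</MsgType></xml>";
        System.out.println(evaluate(msg, BY_LOCAL_NAME));
        System.out.println(evaluate(msg, PLAIN));
    }
}
```

   For a message without namespaces, as here, both expressions pick the MsgType element; the local-name() form is the more defensive choice if a namespace is added later.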
3.  Errors encountered and their fixes

  Error 1:

org.dom4j.DocumentException: no protocol: <xml><ToUserName>t6504091</ToUserName><FromUserName>dance-note</FromUserName><CreateTime>111</CreateTime><MsgType>text</MsgType><Content>111</Content></xml> Nested exception: no protocol: <xml><ToUserName>t6504091</ToUserName><FromUserName>dance-note</FromUserName><CreateTime>111</CreateTime><MsgType>text</MsgType><Content>111</Content></xml>
	at org.dom4j.io.SAXReader.read(SAXReader.java:484)
	at org.dom4j.io.SAXReader.read(SAXReader.java:321)
	at com.rain.wx.skill.UserMessageTest.testMessage(UserMessageTest.java:38)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
Nested exception: 
java.net.MalformedURLException: no protocol: <xml><ToUserName>t6504091</ToUserName><FromUserName>dance-note</FromUserName><CreateTime>111</CreateTime><MsgType>text</MsgType><Content>111</Content></xml>
	at java.net.URL.<init>(URL.java:586)
	at java.net.URL.<init>(URL.java:483)
	at java.net.URL.<init>(URL.java:432)
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:619)
	at com.sun.org.apache.xerces.internal.impl.XMLVersionDetector.determineDocVersion(XMLVersionDetector.java:189)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:812)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
	at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
	at org.dom4j.io.SAXReader.read(SAXReader.java:465)
	at org.dom4j.io.SAXReader.read(SAXReader.java:321)
	at com.rain.wx.skill.UserMessageTest.testMessage(UserMessageTest.java:38)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
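  Fix:

     SAXReader.read(String) treats its argument as a system ID (a URL or file name), not as XML content, which is why the parser reports "no protocol". Wrap msg in a stream before handing it to the reader:

      Original Java statement: Document doc = reader.read(msg);

      Updated Java statement: Document doc = reader.read(new ByteArrayInputStream(msg.getBytes("UTF-8")));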
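
      The same distinction exists in the JDK's own parser, where DocumentBuilder.parse(String) also expects a URI. The sketch below uses that JDK API, not dom4j, so it runs without extra dependencies, and shows two ways to feed XML text to a parser: as UTF-8 bytes, or through a Reader. The class and method names are illustrative assumptions. dom4j offers matching entry points, SAXReader.read(Reader) and DocumentHelper.parseText(String), which should work as alternatives to the byte-stream fix above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper illustrating "content, not location" with the JDK parser.
final class XmlFromStringSketch {
    // Option 1: encode the text as bytes. StandardCharsets.UTF_8 avoids the checked
    // UnsupportedEncodingException that getBytes("UTF-8") declares.
    static Document parseViaBytes(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return parse(new InputSource(new ByteArrayInputStream(bytes)));
    }

    // Option 2: wrap the text in a Reader, which skips the encoding step entirely.
    static Document parseViaReader(String xml) {
        return parse(new InputSource(new StringReader(xml)));
    }

    private static Document parse(InputSource source) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(source);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("input is not parseable XML content", e);
        }
    }

    public static void main(String[] args) {
        String msg = "<xml><MsgType>text</MsgType><Content>111</Content></xml>";
        System.out.println(parseViaBytes(msg).getDocumentElement().getNodeName());
        System.out.println(parseViaReader(msg).getDocumentElement().getNodeName());
    }
}
```

      The Reader variant is usually the simpler choice when the XML is already a String, because no character encoding has to be chosen.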

 Error 2:

java.lang.NoClassDefFoundError: org/jaxen/NamespaceContext
	at org.dom4j.DocumentFactory.createXPath(DocumentFactory.java:230)
	at org.dom4j.tree.AbstractNode.createXPath(AbstractNode.java:207)
	at org.dom4j.tree.AbstractNode.selectNodes(AbstractNode.java:164)
	at com.rain.wx.skill.UserMessageTest.testMessage(UserMessageTest.java:44)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
Caused by: java.lang.ClassNotFoundException: org.jaxen.NamespaceContext
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 27 more
  Fix:

     jaxen is the library that implements XPath for dom4j, so adding it as a dependency is all that is needed.

     <dependency>
         <groupId>jaxen</groupId>
         <artifactId>jaxen</artifactId>
         <version>1.1.6</version>
     </dependency>
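
     The stack trace shows that org.jaxen.NamespaceContext is only looked up when selectNodes first builds an XPath, which is why parsing alone gave no hint of the missing jar. As a rough sketch, a test could check for the class up front and fail with a clearer message. The class ClasspathCheckSketch and its method are illustrative assumptions, not part of dom4j or jaxen.

```java
// Hypothetical helper: reports whether a class can be located by the application class loader.
final class ClasspathCheckSketch {
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Prints false until the jaxen dependency has been added to the project.
        System.out.println("jaxen present: " + isOnClasspath("org.jaxen.NamespaceContext"));
    }
}
```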
4.  Parsing performance: Jackson XML mapping vs. dom4j

  Jackson XML code:

       String msg = "<xml><ToUserName>t6504091</ToUserName><FromUserName>dance-note</FromUserName><CreateTime>111</CreateTime><MsgType>text</MsgType><Content>111</Content></xml>";
       ObjectMapper mapper1 = new XmlMapper();
       mapper1.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
       BaseIncomingMessage message = mapper1.readValue(msg, BaseIncomingMessage.class);
       long endTime1 = System.currentTimeMillis();
   Parsing the same document:

   Jackson typically took 70~80 ms, whereas dom4j with XPath took 133 ms. Overall, Jackson comes out ahead of dom4j in this test.
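
   Timings taken around a single call with System.currentTimeMillis(), as in the dom4j example, also include class loading and JIT warm-up on the first run. A minimal sketch of a fairer harness is given here, assuming each parser call is wrapped in a Runnable; the class TimingSketch and its parameters are illustrative and were not used to produce the figures above.

```java
// Hypothetical timing helper; JDK only.
final class TimingSketch {
    // Runs the task `warmup` times untimed, then returns the mean duration in
    // milliseconds of `runs` timed executions.
    static double averageMillis(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / runs;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace the lambda body with the dom4j or Jackson parsing call.
        StringBuilder sink = new StringBuilder();
        double ms = averageMillis(() -> sink.append('x'), 1_000, 10_000);
        System.out.println("mean per run: " + ms + " ms");
    }
}
```

   Running both parsers through the same helper with identical warm-up and run counts keeps the comparison like for like.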

5.  References

  •   http://www.cnblogs.com/mouse-coder/p/3451243.html