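I'm building a piece of software that needs to parse HTML, and I came across the NekoHTML library. The example I found online is as follows: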
package com.ctlok.pro;
import java.io.IOException;
import org.cyberneko.html.parsers.DOMParser;
import org.dom4j.Document;
import org.dom4j.Node;
import org.dom4j.io.DOMReader;
import org.xml.sax.SAXException;
public class Main {
    /**
     * @param args
     */
    public static void main(String[] args) {
        try {
            String url = "http://hk.finance.yahoo.com/q?s=0005.HK";
            DOMParser parser = new DOMParser();
            parser.parse(url);
            org.w3c.dom.Document document = parser.getDocument();
            DOMReader domReader = new DOMReader();
            Document doc = domReader.read(document);
            // Element names should be upper case
            Node name = doc.selectSingleNode("//DIV[@id='quote-bar-latest']/*/H2/node()");
            Node buy = doc.selectSingleNode("//DIV[@id='quote-bar-trade-info']/TABLE/TBODY/TR[1]/TD[2]");
            Node sell = doc.selectSingleNode("//DIV[@id='quote-bar-trade-info']/TABLE/TBODY/TR[2]/TD[2]");
            System.out.println(name.getText());
            System.out.println("Buy: " + buy.getText().substring(2));
            System.out.println("Sell: " + sell.getText().substring(2));
        } catch (SAXException e) {
            System.out.println(e.toString());
        } catch (IOException e) {
            System.out.println(e.toString());
        }
    }
}
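My problem is this: in the NekoHTML source I downloaded, the class org.cyberneko.html.parsers.DOMParser does not contain the parse() and getDocument() methods at all. Did I download the wrong thing? This is driving me crazy!
Has nobody used this?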
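One possible explanation, offered as an assumption rather than something confirmed in this thread: a method can be callable on a class without appearing in that class's own source file, because it is inherited from a superclass. In the NekoHTML releases I am aware of, org.cyberneko.html.parsers.DOMParser extends the Xerces class org.apache.xerces.parsers.DOMParser, so parse(...) and getDocument() would come from Xerces, and the Xerces jar would also need to be on the classpath. The sketch below uses only the JDK; BaseParser and HtmlParser are hypothetical stand-ins, not the real library classes.

```java
import java.lang.reflect.Method;

public class InheritedMethodCheck {

    // Hypothetical stand-in for a library superclass that defines the methods.
    public static class BaseParser {
        public void parse(String systemId) { /* no-op for illustration */ }
        public Object getDocument() { return null; }
    }

    // Hypothetical stand-in for a subclass whose own source declares neither method.
    public static class HtmlParser extends BaseParser { }

    // True if the class itself declares a method with this name
    // (roughly what you see when reading that one source file).
    public static boolean declaresMethod(Class<?> type, String name) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    // True if a public method with this name is available on the class,
    // including methods inherited from superclasses.
    public static boolean hasPublicMethod(Class<?> type, String name) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"parse", "getDocument"}) {
            System.out.println(name
                    + " declared in HtmlParser: " + declaresMethod(HtmlParser.class, name)
                    + ", callable on HtmlParser: " + hasPublicMethod(HtmlParser.class, name));
        }
    }
}
```

If the same pattern applies to your download, checking the superclass named in the `extends` clause of DOMParser would show where the two methods are defined.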