
An example of parsing HTML with jericho-html

漆雕修德
2023-12-01
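import java.io.IOException;
import java.util.List;

import net.htmlparser.jericho.Element;
import net.htmlparser.jericho.Source;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

// The class name is a placeholder; the original post shows only the main method.
public class DoubanLatestBooks {
	public static void main(String[] args) throws IOException {
		// try-with-resources closes the client and the response when done
		try (CloseableHttpClient client = HttpClients.createDefault();
				CloseableHttpResponse response = client.execute(new HttpGet("http://book.douban.com/latest"))) {
			// Fall back to UTF-8 when the response does not declare a charset
			String content = EntityUtils.toString(response.getEntity(), "UTF-8");
			Source source = new Source(content);
			List<Element> lis = source.getAllElements("li");
			// This counts every <li> on the page, not only the book entries
			System.out.println("Total <li> elements: " + lis.size());
			for (Element li : lis) {
				List<Element> childList = li.getChildElements();
				// A book entry is an <li> holding exactly a <div> followed by an <a>
				if (childList.size() != 2 || !"div".equals(childList.get(0).getName())
						|| !"a".equals(childList.get(1).getName())) {
					continue;
				}
				List<Element> details = childList.get(0).getChildElements();
				List<Element> linkChildren = childList.get(1).getChildElements();
				// Skip entries that lack the three text blocks or the cover image
				if (details.size() < 3 || linkChildren.isEmpty()) {
					continue;
				}
				System.out.println("Title: " + details.get(0).getTextExtractor().toString());
				System.out.println("Description: " + details.get(1).getTextExtractor().toString());
				System.out.println("Summary: " + details.get(2).getTextExtractor().toString());
				// The first child of the <a> is expected to be the cover <img>
				System.out.println("Image path: " + linkChildren.get(0).getAttributeValue("src"));
				System.out.println("=======================================================================");
			}
		}
	}
}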
