
Converting HTML to XHTML with JTidy

王棋
2023-12-01
 
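While building a middleware platform, one of the steps was converting HTML to XHTML. lj was responsible for that part, so I asked her about it and learned that she was using a JTidy jar file. I then searched online for an example and modified it slightly. The result takes an HTML file as input and writes an XHTML file as output.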

The implementation is as follows:

(1) Configure the build path and add the required jar file, jtidy-r938.jar.

(2) The code is as follows:

 

package beans;

import java.io.*;
import org.w3c.tidy.Tidy;

public class test
{
    public static void main(String args[])
    {
        test t = new test();
        t.doTidy("c:\\hopetest\\b.html"); // start the conversion
    }

    public void doTidy(String f_in)
    {
        try
        {
            // Read the whole input file into memory
            FileInputStream fis = new FileInputStream(f_in);
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            int ch;
            while ((ch = fis.read()) != -1)
            {
                bos.write(ch);
            }
            fis.close();
            byte[] bs = bos.toByteArray();
            bos.close();

            // Note: the source file is GB2312-encoded, so decode it as GB2312
            // first, then re-encode the text as UTF-8 for JTidy.
            String html = new String(bs, "GB2312");
            ByteArrayInputStream stream = new ByteArrayInputStream(html.getBytes("UTF-8"));
            ByteArrayOutputStream tidyOutStream = new ByteArrayOutputStream(); // output buffer

            Tidy tidy = new Tidy();
            tidy.setInputEncoding("UTF-8");
            tidy.setQuiet(true);
            tidy.setOutputEncoding("UTF-8");
            tidy.setShowWarnings(false); // do not show warnings
            tidy.setIndentContent(true); // indent element content
            tidy.setSmartIndent(true);
            tidy.setIndentAttributes(false);
            tidy.setWraplen(1024); // line length at which to wrap
            // Output as XHTML
            tidy.setXHTML(true);
            tidy.setErrout(new PrintWriter(System.out, true));
            tidy.parse(stream, tidyOutStream);

            // Write the generated XHTML to the output file
            FileOutputStream to = new FileOutputStream("C:\\hopetest\\bb.xhtml");
            tidyOutStream.writeTo(to);
            to.close();
            System.out.println(tidyOutStream.toString("UTF-8"));
        }
        catch (Exception ex)
        {
            System.out.println(ex.toString());
            ex.printStackTrace();
        }
    }
}
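
The listing above converts the input from GB2312 to UTF-8 before handing it to JTidy. That step is easy to get wrong, because `String.getBytes()` without arguments uses the platform default encoding. Below is a minimal standard-library sketch of that step alone. The class name `TranscodeSketch` and the method `transcode` are hypothetical names introduced for illustration; they are not part of JTidy, and the sketch assumes the Java runtime ships the GB2312 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JTidy: re-encodes bytes from one charset to another.
class TranscodeSketch {

    // Decode the bytes with the source charset, then encode the text with the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    public static void main(String[] args) {
        // The two characters "中文" encoded as GB2312
        // (assumes the runtime provides the GB2312 charset).
        byte[] gb2312 = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        byte[] utf8 = transcode(gb2312, Charset.forName("GB2312"), StandardCharsets.UTF_8);

        // Decode the result as UTF-8 to check that the text survived the round trip.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
        System.out.println(utf8.length + " bytes after re-encoding");
    }
}
```

Naming both charsets explicitly keeps the result independent of the machine the code runs on. The UTF-8 bytes returned by `transcode` could then be wrapped in a `ByteArrayInputStream` and passed to `tidy.parse`, as in the listing.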

 
