Java: Read a file and split it into multiple files

汝才良

2023-03-14

Question:

I have a file that I want to read in Java and split into n output files, where n comes from user input. This is how I read the file:

int n = 4;
BufferedReader br = new BufferedReader(new FileReader("file.csv"));
try {
    String line = br.readLine();

    while (line != null) {
        line = br.readLine();
    }
} finally {
    br.close();
}

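How can I split file.csv into n files?

Note: the file has about 100k entries, so I cannot load its contents into an array, split that, and save the pieces to multiple files.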

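Answer:

Because the source file can be very large, each split file can be large as well.

Example:

Source file size: 5GB

Number of splits: 5

Destination file size: 1GB each (5 files)

There is no way to read a split chunk that large in one go, even if that much memory is available. Instead, for each split we read into a fixed-size byte array whose size we know is feasible in terms of both performance and memory.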
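To make the numbers in the example above concrete, here is a minimal sketch of the arithmetic behind copying one split with a small buffer. The class name `SplitMath` and its two helper methods are hypothetical, and treating 5GB as 5 × 1024³ bytes is an assumption made only for illustration.

```java
public class SplitMath {

    // Number of reads that fill the whole buffer while copying one split.
    static long fullReads(long bytesPerSplit, int bufferSize) {
        return bytesPerSplit / bufferSize;
    }

    // Bytes left for one final, shorter read after the full-buffer reads.
    static long tailBytes(long bytesPerSplit, int bufferSize) {
        return bytesPerSplit % bufferSize;
    }

    public static void main(String[] args) {
        long sourceSize = 5L * 1024 * 1024 * 1024; // assumption: 5GB taken as 5 * 1024^3 bytes
        long numSplits = 5;
        int bufferSize = 8 * 1024;                 // 8KB read buffer
        long bytesPerSplit = sourceSize / numSplits;

        System.out.println("bytes per split: " + bytesPerSplit);
        System.out.println("full reads:      " + fullReads(bytesPerSplit, bufferSize));
        System.out.println("tail bytes:      " + tailBytes(bytesPerSplit, bufferSize));
    }
}
```

Under these assumptions each 1GB split takes 131072 reads of 8KB with no tail, so the memory in use at any moment stays at one 8KB buffer regardless of the file size.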

With NumSplits: 10 and MaxReadBytes: 8KB:

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

public class FileSplitter {

    public static void main(String[] args) throws Exception {
        // try-with-resources closes the source file even if a write fails
        try (RandomAccessFile raf = new RandomAccessFile("test.csv", "r")) {
            long numSplits = 10; // from user input, extract it from args
            long sourceSize = raf.length();
            long bytesPerSplit = sourceSize / numSplits;
            long remainingBytes = sourceSize % numSplits;

            int maxReadBufferSize = 8 * 1024; // 8KB
            for (int destIx = 1; destIx <= numSplits; destIx++) {
                try (BufferedOutputStream bw = new BufferedOutputStream(new FileOutputStream("split." + destIx))) {
                    if (bytesPerSplit > maxReadBufferSize) {
                        // copy the split in 8KB pieces, then one shorter final piece
                        long numReads = bytesPerSplit / maxReadBufferSize;
                        long numRemainingRead = bytesPerSplit % maxReadBufferSize;
                        for (long i = 0; i < numReads; i++) {
                            readWrite(raf, bw, maxReadBufferSize);
                        }
                        if (numRemainingRead > 0) {
                            readWrite(raf, bw, numRemainingRead);
                        }
                    } else {
                        readWrite(raf, bw, bytesPerSplit);
                    }
                }
            }
            // Bytes left over from the integer division go into one extra file,
            // so the result is numSplits + 1 files when the size is not evenly divisible.
            if (remainingBytes > 0) {
                try (BufferedOutputStream bw = new BufferedOutputStream(new FileOutputStream("split." + (numSplits + 1)))) {
                    readWrite(raf, bw, remainingBytes);
                }
            }
        }
    }

    static void readWrite(RandomAccessFile raf, BufferedOutputStream bw, long numBytes) throws IOException {
        byte[] buf = new byte[(int) numBytes];
        // read() may return fewer bytes than requested; readFully blocks until buf is full
        raf.readFully(buf);
        bw.write(buf);
    }
}
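The code above splits by byte count, so a CSV row can end up cut across two output files. If rows must stay intact, one possible alternative is to split by lines in two streaming passes: count the lines first, then copy them into n files. The following is only a sketch of that different technique, not the approach used in the answer. The names `LineSplitter`, `countLines`, `split`, and the `prefix` parameter are hypothetical, and it assumes a UTF-8 file with one record per line (no quoted fields containing line breaks).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LineSplitter {

    // Pass 1: count the lines without keeping any of them in memory.
    static long countLines(Path source) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    // Pass 2: stream the lines into exactly n files named prefix + 1 .. prefix + n.
    // The first (total % n) files receive one extra line each.
    static void split(Path source, int n, String prefix) throws IOException {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        long total = countLines(source);
        long base = total / n;
        long extra = total % n;
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            for (int fileIx = 1; fileIx <= n; fileIx++) {
                long linesForThisFile = base + (fileIx <= extra ? 1 : 0);
                Path target = Paths.get(prefix + fileIx);
                try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
                    for (long i = 0; i < linesForThisFile; i++) {
                        String line = reader.readLine();
                        if (line == null) {
                            break; // the file shrank between the two passes
                        }
                        writer.write(line);
                        writer.newLine();
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        split(Paths.get("file.csv"), n, "split.");
    }
}
```

Only one line is held in memory at a time, at the cost of reading the source twice. Output lines are written with the platform line separator, so the line endings may differ from those of the source file.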
