Accumulating a Java Stream and only then processing it

管梓

2023-03-14

Question:

I have a document that looks like this:

data.txt

100, "some text"
101, "more text"
102, "even more text"

I process it with a regular expression and turn each line into a processed object, like this:

Stream<String> lines = Files.lines(Paths.get("data.txt"));
Pattern regex = Pattern.compile("(\\d{1,3}),(.*)");

List<MyClass> result =
  lines.map(regex::matcher)
       .filter(Matcher::find)
       // MyClass(int id, String text)
       .map(m -> new MyClass(Integer.parseInt(m.group(1)), m.group(2)))
       .collect(Collectors.toList());

This returns a list of processed MyClass objects. It can run in parallel and everything works fine.

The problem is that I now have this:

data2.txt

101, "some text
the text continues in the next line
and maybe in the next"
102, "for a random
number
of lines"
103, "until the new pattern of new id comma appears"

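So I need to somehow join the lines being read from the stream until a new match appears. (Something like a buffer?)

I tried collecting the strings first and then collecting MyClass objects, but without success, because I cannot actually split the stream.

Reduce came to mind for concatenating the lines, but that only lets me concatenate lines; I cannot reduce and also produce a new stream of lines.

Any ideas on how to solve this with Java 8 Streams?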

Answer:

This is a job for java.util.Scanner. With Java 9 or later, you can write:

List<MyClass> result;
try(Scanner s=new Scanner(Paths.get("data.txt"))) {
    result = s.findAll("(\\d{1,3}),\\s*\"([^\"]*)\"")
              // MyClass(int id, String text)
              .map(m -> new MyClass(Integer.parseInt(m.group(1)), m.group(2)))
              .collect(Collectors.toList());
}
result.forEach(System.out::println);
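
To try the snippet above without a file, here is a minimal self-contained sketch for Java 9 or later. The class name FindAllDemo, the Entry type (a stand-in for MyClass) and the in-memory SAMPLE string are assumptions made for illustration; only Scanner.findAll and the regular expression come from the answer.

```java
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;

// Hypothetical demo class; names here are illustrative, not from the original answer.
class FindAllDemo {

    // Stand-in for the question's MyClass(int id, String text).
    static final class Entry {
        final int id;
        final String text;
        Entry(int id, String text) { this.id = id; this.text = text; }
        @Override public String toString() { return id + " -> " + text; }
    }

    // Same content as data2.txt, held in memory so the example needs no file.
    static final String SAMPLE =
          "101, \"some text\n"
        + "the text continues in the next line\n"
        + "and maybe in the next\"\n"
        + "102, \"for a random\n"
        + "number\n"
        + "of lines\"\n"
        + "103, \"until the new pattern of new id comma appears\"\n";

    static List<Entry> parse(String input) {
        try (Scanner s = new Scanner(input)) {
            // [^\"]* also matches line breaks, so one match covers a whole multi-line record.
            return s.findAll("(\\d{1,3}),\\s*\"([^\"]*)\"")
                    .map(m -> new Entry(Integer.parseInt(m.group(1)), m.group(2)))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        parse(SAMPLE).forEach(System.out::println);
    }
}
```

The point of the sketch is that the records are no longer split by line at all: the negated character class runs across line breaks until the closing quote, so no buffering of lines is needed.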

But since the Stream-producing findAll method does not exist in Java 8, we need a helper method:

private static Stream<MatchResult> matches(Scanner s, String pattern) {
    Pattern compiled = Pattern.compile(pattern);
    return StreamSupport.stream(
        // 1000 is only a size estimate; the real number of matches is not known up front
        new Spliterators.AbstractSpliterator<MatchResult>(1000,
                         Spliterator.ORDERED|Spliterator.NONNULL) {
            @Override
            public boolean tryAdvance(Consumer<? super MatchResult> action) {
                // horizon 0 = search the rest of the input, ignoring delimiters
                if(s.findWithinHorizon(compiled, 0)==null) return false;
                action.accept(s.match());
                return true;
            }
        }, false); // false = sequential stream
}

Replacing findAll with this helper method, we get:

List<MyClass> result;
try(Scanner s=new Scanner(Paths.get("data.txt"))) {
    result = matches(s, "(\\d{1,3}),\\s*\"([^\"]*)\"")
              // MyClass(int id, String text)
              .map(m -> new MyClass(Integer.parseInt(m.group(1)), m.group(2)))
              .collect(Collectors.toList());
}
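
Putting the Java 8 pieces together, the following is a rough end-to-end sketch rather than a definitive implementation. The class name MatchesDemo, the Entry type (standing in for MyClass), the parseSample method and the temporary file it writes are all assumptions added for illustration; the matches helper follows the one shown above, except that it passes Long.MAX_VALUE as the size estimate to signal an unknown size.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Scanner;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical demo class; names here are illustrative, not from the original answer.
class MatchesDemo {

    // Stand-in for the question's MyClass(int id, String text).
    static final class Entry {
        final int id;
        final String text;
        Entry(int id, String text) { this.id = id; this.text = text; }
        @Override public String toString() { return id + " -> " + text; }
    }

    // Java 8 replacement for Scanner.findAll: one stream element per regex match.
    static Stream<MatchResult> matches(Scanner s, String pattern) {
        Pattern compiled = Pattern.compile(pattern);
        return StreamSupport.stream(
            new Spliterators.AbstractSpliterator<MatchResult>(Long.MAX_VALUE,
                    Spliterator.ORDERED | Spliterator.NONNULL) {
                @Override
                public boolean tryAdvance(Consumer<? super MatchResult> action) {
                    if (s.findWithinHorizon(compiled, 0) == null) return false;
                    action.accept(s.match());
                    return true;
                }
            }, false);
    }

    // Reads the whole file through a Scanner and maps every match to an Entry.
    static List<Entry> parse(Path file) {
        try (Scanner s = new Scanner(file, "UTF-8")) {
            return matches(s, "(\\d{1,3}),\\s*\"([^\"]*)\"")
                    .map(m -> new Entry(Integer.parseInt(m.group(1)), m.group(2)))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the data2.txt content to a temporary file, parses it, then deletes the file.
    static List<Entry> parseSample() {
        String sample = "101, \"some text\n"
            + "the text continues in the next line\n"
            + "and maybe in the next\"\n"
            + "102, \"for a random\nnumber\nof lines\"\n"
            + "103, \"until the new pattern of new id comma appears\"\n";
        try {
            Path tmp = Files.createTempFile("data2", ".txt");
            try {
                Files.write(tmp, sample.getBytes(StandardCharsets.UTF_8));
                return parse(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        parseSample().forEach(System.out::println);
    }
}
```

The stream is created as sequential because every match advances the same Scanner, so the elements have to be produced one after another; any parallelism would only apply to work done after the matches are collected.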
