Calling sequential() on a parallel stream makes all previous operations sequential

卫浩瀚

2023-03-14

Question:

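I have a large data set and want to call a slow but side-effect-free method on it, then call a fast method with side effects on the result of the first one. I am not interested in the intermediate results, so I would rather not collect them.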

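The obvious solution is to create a parallel stream, make the slow call, switch the stream back to sequential, and then make the fast call. The problem is that ALL of the code then executes in a single thread, with no actual parallelism.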

Example code:

@Test
public void testParallelStream() throws ExecutionException, InterruptedException
{
    ForkJoinPool forkJoinPool = new ForkJoinPool(Runtime.getRuntime().availableProcessors() * 2);
    Set<String> threads = forkJoinPool.submit(()-> new Random().ints(100).boxed()
            .parallel()
            .map(this::slowOperation)
            .sequential()
            .map(Function.identity())//some fast operation, but must be in single thread
            .collect(Collectors.toSet())
    ).get();
    System.out.println(threads);
    Assert.assertEquals(Runtime.getRuntime().availableProcessors() * 2, threads.size());
}

private String slowOperation(int value)
{
    try
    {
        Thread.sleep(100);
    }
    catch (InterruptedException e)
    {
        e.printStackTrace();
    }
    return Thread.currentThread().getName();
}

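If I remove sequential(), the code executes as expected, but then, obviously, the operation that must not run in parallel is called from multiple threads.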

Could you recommend some references about this behavior, or perhaps some way to avoid the temporary collection?

Answer:

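In the original Stream API design, switching a stream from parallel() to sequential() partway through worked, but it caused many problems. The implementation was eventually changed so that these calls simply turn the parallel flag on or off for the whole pipeline. The current documentation is indeed vague, but it was improved in Java 9: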

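The stream pipeline is executed sequentially or in parallel depending on the mode of the stream on which the terminal operation is invoked. The sequential or parallel mode of a stream can be determined with the BaseStream.isParallel() method, and the stream's mode can be modified with the BaseStream.sequential() and BaseStream.parallel() operations. The most recent sequential or parallel mode setting applies to the execution of the entire stream pipeline.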
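The quoted rule can be observed directly. The following is a minimal sketch, not code from the question: the class name StreamModeDemo and both method names are made up for illustration. It checks the mode flag after parallel() is followed by sequential(), and records which threads run a map() stage that sits before the sequential() call.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamModeDemo {

    // Reports the mode flag of a pipeline where sequential() follows parallel().
    static boolean modeAfterParallelThenSequential() {
        Stream<Integer> stream = IntStream.range(0, 100).boxed()
                .parallel()
                .map(i -> i + 1)
                .sequential();
        boolean parallel = stream.isParallel(); // a query, does not consume the stream
        stream.close();
        return parallel;
    }

    // Records the names of the threads that execute the map() stage
    // placed BEFORE the sequential() call.
    static Set<String> threadsUsedByMap() {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, 100).boxed()
                .parallel()
                .map(i -> {
                    names.add(Thread.currentThread().getName());
                    return i;
                })
                .sequential() // the last mode setting applies to the whole pipeline
                .forEach(i -> { });
        return names;
    }

    public static void main(String[] args) {
        System.out.println("isParallel: " + modeAfterParallelThenSequential());
        System.out.println("threads used by map(): " + threadsUsedByMap());
    }
}
```

Because the last setting wins, the earlier map() stage should run only on the calling thread, which matches the single-threaded behavior described in the question.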

For your problem, you can collect everything into an intermediate List and start a new sequential pipeline:

new Random().ints(100).boxed()
        .parallel()
        .map(this::slowOperation)
        .collect(Collectors.toList())
        // Start new stream here
        .stream()
        .map(Function.identity())//some fast operation, but must be in single thread
        .collect(Collectors.toSet());
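For readers who want to run the two-phase approach on its own, here is a self-contained sketch. The class TwoPhasePipeline, the bodies of slowOperation and fastOperation, and the fastStageThreads set are assumptions made for illustration. They stand in for the real slow and fast steps, and the thread recording exists only to make the behavior visible.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class TwoPhasePipeline {

    // Thread names seen by the fast stage, recorded only for demonstration.
    static final Set<String> fastStageThreads = ConcurrentHashMap.newKeySet();

    // Hypothetical stand-in for the slow, side-effect-free step.
    static String slowOperation(int value) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return "item-" + value;
    }

    // Hypothetical stand-in for the fast step that must stay on one thread.
    static String fastOperation(String value) {
        fastStageThreads.add(Thread.currentThread().getName());
        return value.toUpperCase();
    }

    static Set<String> run(int count) {
        // Phase 1: a parallel pipeline ended by its own terminal operation.
        List<String> intermediate = IntStream.range(0, count).boxed()
                .parallel()
                .map(TwoPhasePipeline::slowOperation)
                .collect(Collectors.toList());

        // Phase 2: List.stream() returns a sequential stream, so this
        // pipeline runs on the calling thread.
        return intermediate.stream()
                .map(TwoPhasePipeline::fastOperation)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<String> result = run(50);
        System.out.println(result.size() + " results, fast stage threads: " + fastStageThreads);
    }
}
```

The trade-off is the one the question hoped to avoid: the intermediate List holds every result of the slow step at once. In exchange, each pipeline has exactly one mode, so the parallel flag of the first cannot leak into the second.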
