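Overview: A benchmark of several stream processing engines published by Yahoo, covering Storm, Flink, and Spark Streaming.
Motivation: To stay close to production conditions, the authors designed and implemented a realistic stream processing benchmark that uses Kafka and Redis for data ingestion and storage.
Selected test results and conclusions from the paper: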
Original text: The results demonstrate that at fairly high throughput, Storm and Flink have much lower latency than Spark Streaming (whose latency is proportional to throughput rate). On the other hand, Spark Streaming is able to handle higher maximum throughput rate while its performance is quite sensitive to the batch duration setting.
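The quotation above ties Spark Streaming's latency to throughput and to the batch duration. One way to build intuition for that is a toy micro-batch model, sketched below. It is not Spark code and not the benchmark's methodology: the function names, the fixed `per_event_cost`, and the assumption that every event in a batch completes when the whole batch finishes are all illustrative assumptions.

```python
from collections import defaultdict


def micro_batch_latencies(arrivals, batch_interval, per_event_cost):
    """Toy model of a micro-batch engine (an assumption, not Spark itself).

    Events arriving in [k*B, (k+1)*B) form batch k. A batch starts once its
    interval has closed and the previous batch has finished, takes
    len(batch) * per_event_cost seconds, and all its events complete together.
    """
    batches = defaultdict(list)
    for t in arrivals:
        batches[int(t // batch_interval)].append(t)

    latencies = []
    prev_finish = 0.0
    for k in sorted(batches):
        start = max((k + 1) * batch_interval, prev_finish)
        finish = start + len(batches[k]) * per_event_cost
        latencies.extend(finish - t for t in batches[k])
        prev_finish = finish
    return latencies


def mean_latency(rate, batch_interval=1.0, per_event_cost=0.001, duration=10):
    """Mean latency for evenly spaced arrivals at `rate` events per second."""
    n = int(rate * duration)
    arrivals = [i / rate for i in range(n)]
    lat = micro_batch_latencies(arrivals, batch_interval, per_event_cost)
    return sum(lat) / len(lat)


if __name__ == "__main__":
    for rate in (100, 500, 2000):
        print(rate, round(mean_latency(rate), 3))
```

In this model, latency grows with the event rate because each batch takes longer to process, and once a batch needs more time than `batch_interval` the backlog keeps growing. Changing `batch_interval` shifts where that tipping point sits, which is one plausible reading of the sensitivity the paper reports.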
Original text: at high-throughput both versions of Storm struggled.
Original text: Storm’s acking functionality as of version 0.11.0 incurs enough overhead to be a limitation at very high throughputs, and while processing guarantees require acking, flow control could be achieved via backpressure instead.
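The last quotation separates processing guarantees (which need acking) from flow control (which backpressure can provide). As a rough illustration of the backpressure idea only, the sketch below uses a bounded in-memory queue: the producer blocks whenever the buffer is full, so it can never run further ahead of the consumer than the buffer allows. This is a generic Python sketch, not Storm's implementation; `run_pipeline` and its parameters are hypothetical names.

```python
import queue
import threading


def run_pipeline(num_events, buffer_size):
    """Push events through a bounded buffer; return (processed, max_depth)."""
    buf = queue.Queue(maxsize=buffer_size)
    processed = []
    max_depth = 0

    def consumer():
        while True:
            item = buf.get()
            if item is None:  # sentinel: no more events
                break
            processed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()

    for i in range(num_events):
        buf.put(i)  # blocks while the buffer is full: this is the backpressure
        max_depth = max(max_depth, buf.qsize())

    buf.put(None)
    worker.join()
    return processed, max_depth


if __name__ == "__main__":
    events, depth = run_pipeline(num_events=1000, buffer_size=4)
    print(len(events), depth)
```

Note that nothing here acknowledges individual events, so the sketch gives flow control without any delivery guarantee, which matches the distinction drawn in the quotation.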