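Netflix recently open-sourced a tool called Suro, which collects event data from multiple application servers and routes it in real time to target data platforms such as Hadoop and Elasticsearch. This Netflix project has a chance of becoming a mainstream big data technology.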
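Netflix uses Suro to route data from its sources to destination hosts in real time. Suro plays a key role in Netflix's data pipeline, and it is also one of the standouts among the many open-source data analysis tools that have come out of large Internet companies.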
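Netflix's applications generate tens of billions of events every day. Suro collects them before they are sent onward, then passes some through Amazon S3 to Hadoop for batch processing and others through Apache Kafka to Druid and Elasticsearch for real-time analysis. According to the Netflix blog, the company is also looking at how Suro could support real-time processing engines such as Storm or Samza in order to run machine learning on event data.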
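The split described above, with some events going to a batch path and others to a real-time path, can be pictured as routing by key. The following is a minimal Python sketch of that idea only; it is not Suro's actual API. The `Event`, `Router`, and `build_demo_router` names, the routing keys, and the in-memory "sinks" are all hypothetical stand-ins chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    routing_key: str  # hypothetical label set by the producing application
    payload: dict


@dataclass
class Router:
    # Maps a routing key to the names of the sinks that should receive it.
    routes: Dict[str, List[str]]
    # In-memory lists standing in for real sinks such as S3 or Kafka.
    delivered: Dict[str, List[Event]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def dispatch(self, event: Event) -> List[str]:
        """Deliver the event to every sink configured for its routing key."""
        # Unknown keys fall back to a single catch-all sink (an assumption).
        sinks = self.routes.get(event.routing_key, ["default"])
        for sink in sinks:
            self.delivered[sink].append(event)
        return sinks


def build_demo_router() -> Router:
    """Build a router with made-up routes: batch via 's3', real time via 'kafka'."""
    return Router(
        routes={
            "playback_log": ["s3"],        # batch path (Hadoop reads from S3)
            "request_trace": ["kafka"],    # real-time path (Druid, Elasticsearch)
            "error": ["s3", "kafka"],      # fan out to both paths
        }
    )
```

Calling `build_demo_router().dispatch(Event("error", {"code": 500}))` would place the event in both the `s3` and `kafka` lists. A real deployment would replace those lists with actual sink clients and would need batching, retries, and back-pressure, none of which this sketch attempts.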
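Anyone familiar with the big data field knows that many of its technologies are tied to the companies that built them: Netflix created Suro, LinkedIn created Kafka and Samza, Twitter created Storm, and Metamarkets created Druid. The Suro blog post also acknowledges that Suro is based on the Apache Chukwa project and is similar to Apache Flume and Facebook's Scribe. Of all these projects, the most prominent is without doubt Hadoop.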
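Why companies build their own technology has long been a point of debate. The general answer is that these tools are created out of need, but as with many things in life, the real answer depends on the specific case. Storm, for example, is becoming a very popular stream-processing tool, yet LinkedIn felt it needed something different and so created Samza. Netflix created Suro instead of adopting an existing technology mainly because, although the company is a heavy cloud user (primarily on AWS), it also runs some non-AWS components, including the Apache Cassandra database.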
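The ultimate winners of this wave of innovation are bound to be the mainstream users who adopt these technologies, because their companies can benefit from the open-source tools without hiring specialists in-house. We have already seen Hadoop vendors, for example, try to bring frameworks such as Storm and Spark to their enterprise customers. We also believe Hadoop will by no means be the last technology of this kind. AWS has a very large number of users, and many of them want the capabilities that a technology like Suro provides without being tied to the services AWS itself releases.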