One of the key values at Facebook is to move fast. For the past six years, we have been able to accomplish a lot thanks to the rapid pace of development that PHP offers. As a programming language, PHP is simple: simple to learn, simple to write, simple to read, and simple to debug. We are able to get new engineers ramped up at Facebook a lot faster with PHP than with other languages, which allows us to innovate faster.
Today I'm excited to share the project a small team of amazing people and I have been working on for the past two years: HipHop for PHP. With HipHop we've reduced the CPU usage on our Web servers on average by about fifty percent, depending on the page. Less CPU means fewer servers, which means less overhead. This project has had a tremendous impact on Facebook. We feel the Web at large can benefit from HipHop, so we are releasing it as open source this evening in hope that it brings a new focus toward scaling large complex websites with PHP. While HipHop has shown us incredible results, it's certainly not complete and you should be comfortable with beta software before trying it out.
HipHop for PHP isn't technically a compiler itself. Rather, it is a source code transformer. HipHop programmatically transforms your PHP source code into highly optimized C++ and then uses g++ to compile it. HipHop executes the source code in a semantically equivalent manner and sacrifices some rarely used features, such as eval(), in exchange for improved performance. HipHop includes a code transformer, a reimplementation of PHP's runtime system, and a rewrite of many common PHP Extensions to take advantage of these performance optimizations.
PHP's roots are those of a scripting language, like Perl, Python, and Ruby, all of which have major benefits in terms of programmer productivity and the ability to iterate quickly on products. This is compared to more traditional compiled languages like C++ and interpreted languages like Java. On the other hand, scripting languages are known to generally be less efficient when it comes to CPU and memory usage. Because of this, it's been challenging to scale Facebook to over 400 billion PHP-based page views every month.
One common way to address these inefficiencies is to rewrite the more complex parts of your PHP application directly in C++ as PHP Extensions. This largely transforms PHP into a glue language between your front end HTML and application logic in C++. From a technical perspective this works well, but it drastically reduces the number of engineers who are able to work on your entire application. Learning C++ is only the first step to writing PHP Extensions; the second is understanding the Zend APIs. Given that our engineering team is relatively small (there are over one million users to every engineer), we can't afford to make parts of our codebase less accessible than others.
Scaling Facebook is particularly challenging because almost every page view is a logged-in user with a customized experience. When you view your home page we need to look up all of your friends, query their most relevant updates (from a custom service we've built called Multifeed), filter the results based on your privacy settings, then fill out the stories with comments, photos, likes, and all the rich data that people love about Facebook. All of this happens in just under a second. HipHop allows us to write the logic that does the final page assembly in PHP and iterate on it quickly while relying on custom back-end services in C++, Erlang, Java, or Python to service the News Feed, search, Chat, and other core parts of the site.
Since 2007 we've thought about a few different ways to solve these problems and have even tried implementing a few of them. The common suggestion is to just rewrite Facebook in another language, but given the complexity and speed of development of the site this would take some time to accomplish. We've rewritten aspects of the Zend Engine (PHP's internals) and contributed those patches back into the PHP project, but ultimately haven't seen the sort of performance increases that are needed. HipHop's benefits are nearly transparent to our development speed.
Hacking Up HipHop
One night at a Hackathon a few years ago (see Prime Time Hack), I started my first piece of code transforming PHP into C++. The languages are fairly similar syntactically, and C++ drastically outperforms PHP when it comes to both CPU and memory usage. Even PHP itself is written in C. We knew that it was impossible to successfully rewrite an entire codebase of this size by hand, but wondered what would happen if we built a system to do it programmatically.
Finding new ways to improve PHP performance isn't a new concept. At run time the Zend Engine turns your PHP source into opcodes, which are then run through the Zend Virtual Machine. Open source projects such as APC and eAccelerator cache this output and are used by the majority of PHP powered websites. There's also Zend Server, a commercial product which makes PHP faster via opcode optimization and caching. Instead, we were thinking about transforming PHP source directly into C++, which can then be turned into native machine code. Even compiling PHP isn't a new idea: open source projects like Roadsend and phc compile PHP to C, Quercus compiles PHP to Java, and Phalanger compiles PHP to .NET.
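To make the opcode caching idea concrete, here is a minimal sketch in Python. It is only an analogy and not how APC or eAccelerator work internally: Python's own bytecode stands in for Zend opcodes, and the `OpcodeCache` class and the `page.php` filename are hypothetical names introduced for illustration.

```python
import hashlib


class OpcodeCache:
    """Toy analogue of an opcode cache: compile a script once, reuse the result."""

    def __init__(self):
        self._cache = {}
        self.compilations = 0  # how many times we actually had to compile

    def load(self, filename, source):
        # Key on the file name plus a hash of the source, so an edited
        # script is recompiled instead of served stale.
        key = (filename, hashlib.sha256(source.encode("utf-8")).hexdigest())
        code = self._cache.get(key)
        if code is None:
            code = compile(source, filename, "exec")  # the expensive step
            self.compilations += 1
            self._cache[key] = code
        return code

    def run(self, filename, source):
        namespace = {}
        exec(self.load(filename, source), namespace)
        return namespace


cache = OpcodeCache()
script = "total = sum(range(10))"
first = cache.run("page.php", script)
second = cache.run("page.php", script)  # served from the cache
print(first["total"], cache.compilations)  # prints "45 1"
```

The cached bytecode is still interpreted on every request, which is the limitation the paragraph above contrasts with compiling to native code.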
Needless to say, it took longer than that single Hackathon. Eight months later, I had enough code to demonstrate it is indeed possible to run faster with compiled code. We quickly added Iain Proctor and Minghui Yang to the team to speed up the pace of the project. We spent the next ten months finishing up all the coding and the following six months testing on production servers. We are proud to say that at this point, we are serving over 90% of our Web traffic using HipHop, all only six months after deployment.
How HipHop Works
The main challenge of the project was bridging the gap between PHP and C++. PHP is a scripting language with dynamic, weak typing. C++ is a compiled language with static typing. While PHP allows you to write magical dynamic features, most PHP is relatively straightforward. It's more likely that you see if (...) {...} else {...} than it is to see function foo($x) { include $x; }. This is where we gain in performance. Whenever possible our generated code uses static binding for functions and variables. We also use type inference to pick the most specific type possible for our variables and thus save memory.
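The difference between static binding and a run-time lookup can be sketched roughly as follows. This is a toy model in Python, not HipHop's generated C++; the `FUNCTIONS` table, `call_dynamic`, and `bind_call` are hypothetical names made up for this illustration.

```python
# Hypothetical function table standing in for the functions of a PHP program.
FUNCTIONS = {
    "double": lambda x: 2 * x,
    "negate": lambda x: -x,
}


def call_dynamic(name, arg):
    """Look the function up by name on every call (fully dynamic dispatch)."""
    return FUNCTIONS[name](arg)


def bind_call(kind, target):
    """Build a callable for one call site.

    kind == "literal":  the function name is known ahead of time, so the
                        lookup is done once here (static binding).
    kind == "variable": the name lives in a run-time variable, so the
                        lookup has to stay dynamic.
    """
    if kind == "literal":
        fn = FUNCTIONS[target]  # resolved once, before any call happens
        return lambda arg, env: fn(arg)
    return lambda arg, env: call_dynamic(env[target], arg)


static_site = bind_call("literal", "double")
dynamic_site = bind_call("variable", "callback")
print(static_site(21, {}))                      # 42
print(dynamic_site(5, {"callback": "negate"}))  # -5
```

The trade-off is visible even in the toy: the statically bound call site skips the per-call lookup, but it only works when the target is known before the program runs.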
The transformation process includes three main steps:
1. Static analysis, where we collect information on who declares what and on dependencies,
2. Type inference, where we choose the most specific type between C++ scalars, String, Array, classes, Object, and Variant, and
3. Code generation, which for the most part is a direct correspondence from PHP statements and expressions to C++ statements and expressions.
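The three steps above can be sketched on a toy program. This is a minimal illustration in Python under loud assumptions, not HipHop's actual design: the statement tuples, the `f_` and `v_` name prefixes, the `invoke` fallback, and the rule "fall back to Variant whenever assignments disagree" are all invented for the example.

```python
import json

# Assumed mapping from literal kinds to C++-side type names.
CPP_TYPES = {bool: "bool", int: "int64_t", float: "double", str: "String"}

# Toy program: function name -> list of statements.
# Statements are ("assign", variable, literal) or ("call", function_name).
PROGRAM = {
    "render": [
        ("assign", "count", 3),
        ("assign", "title", "hits"),
        ("assign", "mixed", 1),
        ("assign", "mixed", "one"),
        ("call", "log_line"),
        ("call", "plugin_hook"),  # not declared anywhere in PROGRAM
    ],
    "log_line": [],
}


def analyze(program):
    """Step 1: which functions are declared, and which ones each function calls."""
    declared = set(program)
    depends = {
        name: sorted({stmt[1] for stmt in body if stmt[0] == "call"})
        for name, body in program.items()
    }
    return declared, depends


def infer_types(body):
    """Step 2: keep a specific type if every assignment agrees, else Variant."""
    types = {}
    for stmt in body:
        if stmt[0] != "assign":
            continue
        _, var, value = stmt
        seen = CPP_TYPES.get(type(value), "Variant")
        types[var] = seen if types.get(var, seen) == seen else "Variant"
    return types


def generate(name, body, declared):
    """Step 3: emit one C++-like line per statement."""
    lines = [f"void f_{name}() {{"]
    for var, cpp_type in infer_types(body).items():
        lines.append(f"  {cpp_type} v_{var};")
    for stmt in body:
        if stmt[0] == "assign":
            _, var, value = stmt
            lines.append(f"  v_{var} = {json.dumps(value)};")
        elif stmt[1] in declared:
            lines.append(f"  f_{stmt[1]}();")          # statically bound call
        else:
            lines.append(f'  invoke("{stmt[1]}");')    # hypothetical dynamic fallback
    lines.append("}")
    return "\n".join(lines)


declared, depends = analyze(PROGRAM)
print(generate("render", PROGRAM["render"], declared))
```

Here `count` and `title` get specific types, `mixed` falls back to Variant because it is assigned both an integer and a string, and only the call to a declared function is emitted as a direct call.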
We have also developed HPHPi, which is an experimental interpreter designed for development. When using HPHPi you don't need to compile your PHP source code before running it. It's helped us catch bugs in HipHop itself and provides engineers a way to use HipHop without changing how they write PHP.
Overall, HipHop allows us to keep the best aspects of PHP while taking advantage of the performance benefits of C++. In total, we have written over 300,000 lines of code and more than 5,000 unit tests. All of this will be released this evening on GitHub under the open source PHP license.
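Glossary:
Hackathon: an event that brings together all of a company's developers (and possibly some related staff) for one full day, and only one day, to settle on some development plans and implement them. For details see http://chinaonrails.com/topic/view/39.html
Translator's note: this was prepared in a hurry and my ability is limited, so please point out anything that is wrong. The English original has been slightly abridged.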