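Chanyouji (蝉游记, [url]http://chanyouji.com[/url]) used to be deployed with Nginx + Passenger and a home-made script. As the user base grew, API calls from the mobile apps increased, more servers were added, and we needed zero-downtime deploys and restarts, we moved to Nginx + Unicorn + Capistrano. This post records the details and the things to watch out for.
1. Nginx configuration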
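# Enable gzip, and also compress the JSON responses of API requests
gzip on;
gzip_types application/json;
# Every machine runs nginx + unicorn; the local instance is reached over a Unix domain socket, which makes switching easy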
upstream ruby_backend {
server unix:/tmp/unicorn.sock fail_timeout=0;
server 10.4.8.34:4096 fail_timeout=0;
server 10.4.3.8:4096 fail_timeout=0;
}
# Serve static files via try_files and proxy dynamic Rails requests to the backend
server {
listen 80;
server_name chanyouji.com;
root /www/youji_deploy/current/public;
try_files $uri/index.html $uri.html $uri @httpapp;
location @httpapp {
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Server $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_buffering on;
proxy_pass http://ruby_backend;
}
}
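# Serve static assets from a separate domain: this avoids sending the main domain's cookies with every request and makes the host easy to use as a CDN origin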
server {
listen 80;
server_name cdn.chanyouji.cn cdnsource.chanyouji.cn;
root /www/youji_deploy/current/public;
location ~ ^/(assets)/ {
root /www/youji_deploy/current/public;
gzip_static on; # to serve pre-gzipped version
expires max;
add_header Cache-Control public;
}
}
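The lookup order of the try_files line above can be illustrated with a small Ruby sketch. It is only a model of the idea, not how nginx is implemented: resolve_try_files and the public_files set are hypothetical names, and the set stands in for the files under the nginx root.

```ruby
require "set"

# Hypothetical helper: mimics the lookup order of
#   try_files $uri/index.html $uri.html $uri @httpapp;
# `existing` stands in for the files that exist under the nginx root.
def resolve_try_files(uri, existing)
  candidates = ["#{uri}/index.html", "#{uri}.html", uri]
  hit = candidates.find { |path| existing.include?(path) }
  # Nothing on disk matched, so the request goes to the named location,
  # which proxies it to the unicorn upstream.
  hit || "@httpapp"
end

public_files = Set["/404.html", "/assets/app.css", "/about/index.html"]

puts resolve_try_files("/about", public_files)          # static index page
puts resolve_try_files("/404", public_files)            # static .html file
puts resolve_try_files("/assets/app.css", public_files) # exact file match
puts resolve_try_files("/trips/1", public_files)        # handled by Rails
```

Only requests that match no static file reach Rails, so cached pages and assets never occupy a unicorn worker.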
2. unicorn.rb configuration
worker_processes 6
app_root = File.expand_path("../..", __FILE__)
working_directory app_root
# Listen on fs socket for better performance
listen "/tmp/unicorn.sock", :backlog => 64
listen 4096, :tcp_nopush => false
# Nuke workers after 30 seconds instead of 60 seconds (the default)
timeout 30
# App PID
pid "#{app_root}/tmp/pids/unicorn.pid"
# By default, the Unicorn logger will write to stderr.
# Additionally, some applications/frameworks log to stderr or stdout,
# so prevent them from going to /dev/null when daemonized here:
stderr_path "#{app_root}/log/unicorn.stderr.log"
stdout_path "#{app_root}/log/unicorn.stdout.log"
# To save some memory and improve performance
preload_app true
GC.respond_to?(:copy_on_write_friendly=) and
GC.copy_on_write_friendly = true
# Force the bundler gemfile environment variable to
# reference the Capistrano "current" symlink
before_exec do |_|
ENV["BUNDLE_GEMFILE"] = File.join(app_root, 'Gemfile')
end
before_fork do |server, worker|
# See http://unicorn.bogomips.org/SIGNALS.html
# Use the USR2 signal, then send QUIT to the old master once the new one is up, to get a zero-downtime restart
old_pid = app_root + '/tmp/pids/unicorn.pid.oldbin'
if File.exist?(old_pid) && server.pid != old_pid
begin
Process.kill("QUIT", File.read(old_pid).to_i)
rescue Errno::ENOENT, Errno::ESRCH
# someone else did our job for us
end
end
# the following is highly recommended for Rails + "preload_app true"
# as there's no need for the master process to hold a connection
defined?(ActiveRecord::Base) and
ActiveRecord::Base.connection.disconnect!
end
after_fork do |server, worker|
# Disable GC in the worker; combined with the OOB GC set up later, this shortens request time
GC.disable
# the following is *required* for Rails + "preload_app true",
defined?(ActiveRecord::Base) and
ActiveRecord::Base.establish_connection
end
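The old-master check in before_fork can be tried outside unicorn with a sketch like the following. It assumes the behaviour described in the SIGNALS document referenced above: after USR2, the old master's pid file is renamed with an .oldbin suffix. The helper names old_master_pid and retire_old_master are hypothetical and are not part of unicorn.

```ruby
require "tmpdir"

# Hypothetical helper mirroring the before_fork check: return the PID of
# the old master recorded in "<pid_file>.oldbin", or nil if there is none.
def old_master_pid(pid_file)
  old_pid_file = "#{pid_file}.oldbin"
  return nil unless File.exist?(old_pid_file)
  Integer(File.read(old_pid_file).strip)
rescue ArgumentError, Errno::ENOENT
  nil # pid file was malformed, or vanished between the check and the read
end

# Send a signal (QUIT by default) to the old master if one is recorded.
# Returns true when the signal was delivered, false otherwise.
def retire_old_master(pid_file, signal = "QUIT")
  pid = old_master_pid(pid_file)
  return false if pid.nil?
  Process.kill(signal, pid)
  true
rescue Errno::ESRCH
  false # the old master has already exited
end

Dir.mktmpdir do |dir|
  pid_file = File.join(dir, "unicorn.pid")
  File.write("#{pid_file}.oldbin", Process.pid.to_s)
  # Signal 0 only checks that the process exists, so this demo is harmless.
  puts retire_old_master(pid_file, 0)
end
```

Rescuing ESRCH matters because several new workers run before_fork in turn, and the first one to send QUIT may already have stopped the old master.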
3. GC OOB
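This New Relic article explains it clearly: http://blog.newrelic.com/2013/05/28/unicorn-rawk-kick-gc-out-of-the-band/
The idea is to defer GC until after the user's request has completed, which shortens response time. Combined with the ready-made gem unicorn-worker-killer, there is no need to worry about memory blowing up.
Configure it in config.ru: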
require 'unicorn/oob_gc'
require 'unicorn/worker_killer'
# Run GC only once every 10 requests
use Unicorn::OobGC, 10
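# Make a worker kill itself after a maximum number of requests, to guard against the memory growth caused by disabling GC (the limit is randomized between 3072 and 4096 so that workers do not all restart at once; you can use either this or the setting below)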
use Unicorn::WorkerKiller::MaxRequests, 3072, 4096
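# Make a worker kill itself once it reaches a memory limit, to guard against the memory growth caused by disabling GC (the limit is randomized between 192 and 256 MB so that workers do not all restart at once)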
use Unicorn::WorkerKiller::Oom, (192*(1024**2)), (256*(1024**2))
require ::File.expand_path('../config/environment', __FILE__)
run Youji::Application
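The two ideas behind these settings, running GC once every N requests and giving each worker a randomized limit, can be sketched in plain Ruby. This is an illustration under simplifying assumptions, not the code of unicorn or unicorn-worker-killer; EveryNthRequest and random_limit are hypothetical names.

```ruby
# Hypothetical counter illustrating "run GC once every N requests".
class EveryNthRequest
  attr_reader :runs

  def initialize(interval)
    @interval = interval
    @remaining = interval
    @runs = 0
  end

  # Call once per finished request; yields when the interval is reached.
  def tick
    @remaining -= 1
    return false if @remaining > 0
    @remaining = @interval
    @runs += 1
    yield if block_given?
    true
  end
end

# Hypothetical helper: pick a per-worker limit in [min, max] so that
# workers do not all hit their limit at the same moment.
def random_limit(min, max, rng = Random.new)
  rng.rand(min..max)
end

gc_gate = EveryNthRequest.new(10)
# Pretend 25 requests have finished; GC runs after the 10th and the 20th.
25.times { gc_gate.tick { GC.start } }
puts gc_gate.runs

# Each worker would draw its own limit, so restarts are spread out.
puts random_limit(3072, 4096)
puts random_limit(192 * (1024**2), 256 * (1024**2))
```

Because GC runs between requests rather than during them, the cost moves out of the response time, and the randomized limits keep the six workers from recycling together.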
4. Capistrano deploy script
set :unicorn_config, "#{current_path}/config/unicorn.rb"
set :unicorn_pid, "#{current_path}/tmp/pids/unicorn.pid"
namespace :deploy do
task :start, :roles => :app, :except => { :no_release => true } do
run "cd #{current_path} && RAILS_ENV=production bundle exec unicorn_rails -c #{unicorn_config} -D"
end
task :stop, :roles => :app, :except => { :no_release => true } do
run "if [ -f #{unicorn_pid} ]; then kill -QUIT `cat #{unicorn_pid}`; fi"
end
task :restart, :roles => :app, :except => { :no_release => true } do
# Use the USR2 signal for a zero-downtime restart on deploy
run "if [ -f #{unicorn_pid} ]; then kill -s USR2 `cat #{unicorn_pid}`; fi"
end
end
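The :stop and :restart tasks differ only in the signal they send, so the guarded shell command could be built by one small helper. This is an optional sketch, not part of the original script; unicorn_signal_command is a hypothetical name, and the pid path in the usage lines simply follows the current_path layout used above.

```ruby
# Hypothetical helper: build the guarded shell command used by the
# :stop and :restart tasks; only the signal name differs between them.
def unicorn_signal_command(pid_file, signal)
  # The [ -f ... ] test makes the command a no-op when unicorn is not running.
  "if [ -f #{pid_file} ]; then kill -s #{signal} `cat #{pid_file}`; fi"
end

pid_file = "/www/youji_deploy/current/tmp/pids/unicorn.pid"
puts unicorn_signal_command(pid_file, "QUIT") # graceful stop
puts unicorn_signal_command(pid_file, "USR2") # zero-downtime restart
```

Inside a task, the returned string would be passed to run, in the same way as the inline commands above.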
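With these changes in place, deploying a new version of Chanyouji only takes typing cap production deploy, and then you can go have a cup of tea, without worrying that users will hit a brief hang while the servers restart :)
Two more charts:
New Relic monitoring chart: compared with before OOB GC was enabled, the average response time dropped from about 100 ms to about 90 ms: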
[img]http://dl2.iteye.com/upload/attachment/0086/2423/a40a8d88-098b-3f4a-b44f-f1c41b7cd81b.png[/img]
Server memory and CPU usage:
[img]http://dl2.iteye.com/upload/attachment/0086/2425/470136d8-df23-3caa-b678-6487701bfa13.png[/img]