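Riak, a Distributed Highly Available Key-Value Database - Installation and Operations (1)
Quick installation, deployment, and startup
My operating system is Red Hat Enterprise Linux Server release 6.6 (Santiago), which is also what we currently run in production.
I install everything directly as root, so switch to the root user first.
First, install the required packages; the list on the Riak website is incomplete: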
#su - root
#yum install pam-devel gcc gcc-c++ glibc-devel make ncurses-devel openssl-devel autoconf git
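Before moving on to the Erlang build, it can help to confirm that the toolchain actually ended up on the PATH. The following is only a minimal sketch: the helper name missing_tools is made up for illustration, and the list of commands it checks is an assumption derived from the packages above (library packages such as pam-devel install no command, so they are not covered).

```shell
#!/bin/sh
# Hypothetical helper: print every command name from the arguments
# that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Build tools expected after the yum install above (assumed list).
missing=$(missing_tools gcc g++ make autoconf git)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names print on one line.
    echo "still missing:" $missing
else
    echo "build tools found"
fi
```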
Next, download and compile Erlang. Riak is written in Erlang, and we will build Riak itself from source.
#wget http://s3.amazonaws.com/downloads.basho.com/erlang/otp_src_R16B02-basho8.tar.gz
#tar zxvf otp_src_R16B02-basho8.tar.gz
#cd OTP_R16B02_basho8/
#./otp_build autoconf
#CFLAGS="-DOPENSSL_NO_EC=1" ./configure && make && sudo make install
#cd ~
Once it is installed, run erl and you should see:
#erl
Erlang R16B02_basho8 (erts-5.10.3) [source] [64-bit] [smp:32:32] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.3 (abort with ^G)
1>
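The same check can be done without opening the interactive shell, which is handy in a provisioning script. This is a sketch under assumptions: the helper is_expected_otp is hypothetical, and it assumes the release string reported by the runtime starts with R16B02, as the banner above suggests.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the release string starts with R16B02.
is_expected_otp() {
    case "$1" in
        R16B02*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Ask the runtime for its release string without starting the Eshell.
if command -v erl >/dev/null 2>&1; then
    release=$(erl -noshell -eval 'io:format("~s~n", [erlang:system_info(otp_release)]), halt().')
    if is_expected_otp "$release"; then
        echo "Erlang $release looks right"
    else
        echo "unexpected Erlang release: $release"
    fi
else
    echo "erl not found on PATH"
fi
```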
Install Riak by building five Riak instances from source:
#wget http://s3.amazonaws.com/downloads.basho.com/riak/2.1/2.1.4/rhel/6/riak-2.1.4-1.el6.src.rpm
#rpm -ivh riak-2.1.4-1.el6.src.rpm
This installs the source tarball into the rpm build location; use find to locate it. On my machine it is /root/rpmbuild/SOURCES/riak-2.1.4.tar.gz
#mv /root/rpmbuild/SOURCES/riak-2.1.4.tar.gz ./
#tar zxvf riak-2.1.4.tar.gz
#cd riak-2.1.4
#make devrel DEVNODES=5
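make downloads a number of dependencies (Solr, for example), so be patient. When the build finishes, all five instances are in the dev folder under the current directory. Run tree to take a look: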
#tree -L 2 dev/
dev/
├── dev1
│ ├── bin
│ ├── data
│ ├── erts-5.10.3
│ ├── etc
│ ├── lib
│ ├── log
│ └── releases
├── dev2
│ ├── bin
│ ├── data
│ ├── erts-5.10.3
│ ├── etc
│ ├── lib
│ ├── log
│ └── releases
├── dev3
│ ├── bin
│ ├── data
│ ├── erts-5.10.3
│ ├── etc
│ ├── lib
│ ├── log
│ └── releases
├── dev4
│ ├── bin
│ ├── data
│ ├── erts-5.10.3
│ ├── etc
│ ├── lib
│ ├── log
│ └── releases
└── dev5
  ├── bin
  ├── data
  ├── erts-5.10.3
  ├── etc
  ├── lib
  ├── log
  └── releases
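Each generated instance carries its own etc/riak.conf with its own node name and ports. To see what every instance will bind to before editing anything, something like the following can be used. It is a sketch only; show_bindings is a hypothetical helper, and it assumes the devN/etc/riak.conf layout shown in the tree above.

```shell
#!/bin/sh
# Hypothetical helper: print nodename and listener settings for every
# devN/etc/riak.conf found under the given directory.
show_bindings() {
    for conf in "$1"/dev*/etc/riak.conf; do
        [ -f "$conf" ] || continue
        echo "== $conf"
        # Only active settings match; commented lines start with '##'.
        grep -E '^(nodename|listener\.[a-z]+\.[a-z]+) = ' "$conf"
    done
}

# Example (run from the riak source directory after make devrel):
#   show_bindings dev
```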
We start one instance, dev1, first. Begin by editing its configuration file:
#cd dev/dev1
#vim ./etc/riak.conf
## Where to emit the default log messages (typically at 'info'
## severity):
## off: disabled
## file: the file specified by log.console.file
## console: to standard output (seen when using `riak attach-direct`)
## both: log.console.file and standard out.
##
## Default: both
##
## Acceptable values:
## - one of: off, file, console, both
log.console = both
## The severity level of the console log, default is 'info'.
##
## Default: info
##
## Acceptable values:
## - one of: debug, info, notice, warning, error, critical, alert, emergency, none
log.console.level = info
## When 'log.console' is set to 'file' or 'both', the file where
## console messages will be logged.
##
## Default: $(platform_log_dir)/console.log
##
## Acceptable values:
## - the path to a file
log.console.file = $(platform_log_dir)/console.log
## The file where error messages will be logged.
##
## Default: $(platform_log_dir)/error.log
##
## Acceptable values:
## - the path to a file
log.error.file = $(platform_log_dir)/error.log
## When set to 'on', enables log output to syslog.
##
## Default: off
##
## Acceptable values:
## - on or off
log.syslog = off
## Whether to enable the crash log.
##
## Default: on
##
## Acceptable values:
## - on or off
log.crash = on
## If the crash log is enabled, the file where its messages will
## be written.
##
## Default: $(platform_log_dir)/crash.log
##
## Acceptable values:
## - the path to a file
log.crash.file = $(platform_log_dir)/crash.log
## Maximum size in bytes of individual messages in the crash log
##
## Default: 64KB
##
## Acceptable values:
## - a byte size with units, e.g. 10GB
log.crash.maximum_message_size = 64KB
## Maximum size of the crash log in bytes, before it is rotated
##
## Default: 10MB
##
## Acceptable values:
## - a byte size with units, e.g. 10GB
log.crash.size = 10MB
## The schedule on which to rotate the crash log. For more
## information see:
## https://github.com/basho/lager/blob/master/README.md#internal-log-rotation
##
## Default: $D0
##
## Acceptable values:
## - text
log.crash.rotation = $D0
## The number of rotated crash logs to keep. When set to
## 'current', only the current open log file is kept.
##
## Default: 5
##
## Acceptable values:
## - an integer
## - the text "current"
log.crash.rotation.keep = 5
## erlang vm shutdown_time is useful when running a riak_test devrel
##
## Default: 10s
##
## Acceptable values:
## - a time duration with units, e.g. '10s' for 10 seconds
erlang.shutdown_time = 10s
## Name of the Erlang node
##
## Default: dev1@127.0.0.1
##
## Acceptable values:
## - text
nodename = dev1@10.202.44.206
## Cookie for distributed node communication. All nodes in the
## same cluster should use the same cookie or they will not be able to
## communicate.
##
## Default: riak
##
## Acceptable values:
## - text
distributed_cookie = riak
## Sets the number of threads in async thread pool, valid range
## is 0-1024. If thread support is available, the default is 64.
## More information at: http://erlang.org/doc/man/erl.html
##
## Default: 64
##
## Acceptable values:
## - an integer
erlang.async_threads = 64
## The number of concurrent ports/sockets
## Valid range is 1024-134217727
##
## Default: 65536
##
## Acceptable values:
## - an integer
erlang.max_ports = 65536
## Set scheduler forced wakeup interval. All run queues will be
## scanned each Interval milliseconds. While there are sleeping
## schedulers in the system, one scheduler will be woken for each
## non-empty run queue found. An Interval of zero disables this
## feature, which also is the default.
## This feature is a workaround for lengthy executing native code, and
## native code that do not bump reductions properly.
## More information: http://www.erlang.org/doc/man/erl.html#+sfwi
##
## Default: 500
##
## Acceptable values:
## - an integer
## erlang.schedulers.force_wakeup_interval = 500
## Enable or disable scheduler compaction of load. By default
## scheduler compaction of load is enabled. When enabled, load
## balancing will strive for a load distribution which causes as many
## scheduler threads as possible to be fully loaded (i.e., not run out
## of work). This is accomplished by migrating load (e.g. runnable
## processes) into a smaller set of schedulers when schedulers
## frequently run out of work. When disabled, the frequency with which
## schedulers run out of work will not be taken into account by the
## load balancing logic.
## More information: http://www.erlang.org/doc/man/erl.html#+scl
##
## Default: false
##
## Acceptable values:
## - one of: true, false
## erlang.schedulers.compaction_of_load = false
## Enable or disable scheduler utilization balancing of load. By
## default scheduler utilization balancing is disabled and instead
## scheduler compaction of load is enabled which will strive for a
## load distribution which causes as many scheduler threads as
## possible to be fully loaded (i.e., not run out of work). When
## scheduler utilization balancing is enabled the system will instead
## try to balance scheduler utilization between schedulers. That is,
## strive for equal scheduler utilization on all schedulers.
## More information: http://www.erlang.org/doc/man/erl.html#+sub
##
## Acceptable values:
## - one of: true, false
## erlang.schedulers.utilization_balancing = true
## Number of partitions in the cluster (only valid when first
## creating the cluster). Must be a power of 2, minimum 8 and maximum
## 1024.
##
## Default: 64
##
## Acceptable values:
## - an integer
## ring_size = 64
## Number of concurrent node-to-node transfers allowed.
##
## Default: 2
##
## Acceptable values:
## - an integer
## transfer_limit = 2
## Default cert location for https can be overridden
## with the ssl config variable, for example:
##
## Acceptable values:
## - the path to a file
## ssl.certfile = $(platform_etc_dir)/cert.pem
## Default key location for https can be overridden with the ssl
## config variable, for example:
##
## Acceptable values:
## - the path to a file
## ssl.keyfile = $(platform_etc_dir)/key.pem
## Default signing authority location for https can be overridden
## with the ssl config variable, for example:
##
## Acceptable values:
## - the path to a file
## ssl.cacertfile = $(platform_etc_dir)/cacertfile.pem
## DTrace support Do not enable 'dtrace' unless your Erlang/OTP
## runtime is compiled to support DTrace. DTrace is available in
## R15B01 (supported by the Erlang/OTP official source package) and in
## R14B04 via a custom source repository & branch.
##
## Default: off
##
## Acceptable values:
## - on or off
dtrace = off
## Platform-specific installation paths (substituted by rebar)
##
## Default: ./bin
##
## Acceptable values:
## - the path to a directory
platform_bin_dir = ./bin
##
## Default: ./data
##
## Acceptable values:
## - the path to a directory
platform_data_dir = ./data
##
## Default: ./etc
##
## Acceptable values:
## - the path to a directory
platform_etc_dir = ./etc
##
## Default: ./lib
##
## Acceptable values:
## - the path to a directory
platform_lib_dir = ./lib
##
## Default: ./log
##
## Acceptable values:
## - the path to a directory
platform_log_dir = ./log
## Enable consensus subsystem. Set to 'on' to enable the
## consensus subsystem used for strongly consistent Riak operations.
##
## Default: off
##
## Acceptable values:
## - on or off
## strong_consistency = on
## listener.http.<name> is an IP address and TCP port that the Riak
## HTTP interface will bind.
##
## Default: 127.0.0.1:10018
##
## Acceptable values:
## - an IP/port pair, e.g. 127.0.0.1:10011
listener.http.internal = 10.202.44.206:10018
## listener.protobuf.<name> is an IP address and TCP port that the Riak
## Protocol Buffers interface will bind.
##
## Default: 127.0.0.1:10017
##
## Acceptable values:
## - an IP/port pair, e.g. 127.0.0.1:10011
listener.protobuf.internal = 10.202.44.206:10017
## The maximum length to which the queue of pending connections
## may grow. If set, it must be an integer > 0. If you anticipate a
## huge number of connections being initialized *simultaneously*, set
## this number higher.
##
## Default: 128
##
## Acceptable values:
## - an integer
## protobuf.backlog = 128
## listener.https.<name> is an IP address and TCP port that the Riak
## HTTPS interface will bind.
##
## Acceptable values:
## - an IP/port pair, e.g. 127.0.0.1:10011
## listener.https.internal = 127.0.0.1:10018
## How Riak will repair out-of-sync keys. Some features require
## this to be set to 'active', including search.
## * active: out-of-sync keys will be repaired in the background
## * passive: out-of-sync keys are only repaired on read
## * active-debug: like active, but outputs verbose debugging
## information
##
## Default: active
##
## Acceptable values:
## - one of: active, passive, active-debug
anti_entropy = active
## Specifies the storage engine used for Riak's key-value data
## and secondary indexes (if supported).
##
## Default: bitcask
##
## Acceptable values:
## - one of: bitcask, leveldb, memory, multi, prefix_multi
storage_backend = bitcask
## Simplify prefix_multi configuration for Riak CS. Keep this
## commented out unless Riak is configured for Riak CS.
##
## Acceptable values:
## - an integer
## cs_version = 20000
## Controls which binary representation of a riak value is stored
## on disk.
## * 0: Original erlang:term_to_binary format. Higher space overhead.
## * 1: New format for more compact storage of small values.
##
## Default: 1
##
## Acceptable values:
## - the integer 1
## - the integer 0
object.format = 1
## Reading or writing objects bigger than this size will write a
## warning in the logs.
##
## Default: 5MB
##
## Acceptable values:
## - a byte size with units, e.g. 10GB
object.size.warning_threshold = 5MB
## Writing an object bigger than this will send a failure to the
## client.
##
## Default: 50MB
##
## Acceptable values:
## - a byte size with units, e.g. 10GB
object.size.maximum = 50MB
## Writing an object with more than this number of siblings will
## generate a warning in the logs.
##
## Default: 25
##
## Acceptable values:
## - an integer
object.siblings.warning_threshold = 25
## Writing an object with more than this number of siblings will
## send a failure to the client.
##
## Default: 100
##
## Acceptable values:
## - an integer
object.siblings.maximum = 100
## A path under which bitcask data files will be stored.
##
## Default: $(platform_data_dir)/bitcask
##
## Acceptable values:
## - the path to a directory
bitcask.data_root = $(platform_data_dir)/bitcask
## Configure how Bitcask writes data to disk.
## erlang: Erlang's built-in file API
## nif: Direct calls to the POSIX C API
## The NIF mode provides higher throughput for certain
## workloads, but has the potential to negatively impact
## the Erlang VM, leading to higher worst-case latencies
## and possible throughput collapse.
##
## Default: erlang
##
## Acceptable values:
## - one of: erlang, nif
bitcask.io_mode = erlang
## Set to 'off' to disable the admin panel.
##
## Default: off
##
## Acceptable values:
## - on or off
riak_control = off
## Authentication mode used for access to the admin panel.
##
## Default: off
##
## Acceptable values:
## - one of: off, userlist
riak_control.auth.mode = off
## If riak control's authentication mode (riak_control.auth.mode)
## is set to 'userlist' then this is the list of usernames and
## passwords for access to the admin panel.
## To create users with given names, add entries of the format:
## riak_control.auth.user.USERNAME.password = PASSWORD
## replacing USERNAME with the desired username and PASSWORD with the
## desired password for that user.
##
## Acceptable values:
## - text
## riak_control.auth.user.admin.password = pass
## This parameter defines the percentage of total server memory
## to assign to LevelDB. LevelDB will dynamically adjust its internal
## cache sizes to stay within this size. The memory size can
## alternately be assigned as a byte count via leveldb.maximum_memory
## instead.
##
## Default: 70
##
## Acceptable values:
## - an integer
leveldb.maximum_memory.percent = 70
## To enable Search set this 'on'.
##
## Default: off
##
## Acceptable values:
## - on or off
search = off
## How long Riak will wait for Solr to start. The start sequence
## will be tried twice. If both attempts timeout, then the Riak node
## will be shutdown. This may need to be increased as more data is
## indexed and Solr takes longer to start. Values lower than 1s will
## be rounded up to the minimum 1s.
##
## Default: 30s
##
## Acceptable values:
## - a time duration with units, e.g. '10s' for 10 seconds
search.solr.start_timeout = 30s
## The port number which Solr binds to.
## NOTE: Binds on every interface.
##
## Default: 10014
##
## Acceptable values:
## - an integer
search.solr.port = 10014
## The port number which Solr JMX binds to.
## NOTE: Binds on every interface.
##
## Default: 10013
##
## Acceptable values:
## - an integer
search.solr.jmx_port = 10013
## The options to pass to the Solr JVM. Non-standard options,
## i.e. -XX, may not be portable across JVM implementations.
## E.g. -XX:+UseCompressedStrings
##
## Default: -d64 -Xms1g -Xmx1g -XX:+UseStringCache -XX:+UseCompressedOops
##
## Acceptable values:
## - text
search.solr.jvm_options = -d64 -Xms1g -Xmx1g -XX:+UseStringCache -XX:+UseCompressedOops
The main changes are:
nodename
listener.http.internal
listener.protobuf.internal
Set these three to the IP address you actually want to bind to.
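Since the same three edits are repeated for every node, they can be scripted instead of made by hand in vim. The following is a minimal sketch, not part of Riak: set_bind_ip is a hypothetical helper, and it assumes the settings appear as "key = value" lines exactly as in the file above. Node names and port numbers are left untouched.

```shell
#!/bin/sh
# Hypothetical helper: point nodename and both listeners in a riak.conf
# at the given IP, keeping the node name and the port numbers as they are.
set_bind_ip() {
    conf=$1
    ip=$2
    tmp="$conf.tmp.$$"
    # Commented settings start with '##' and are not matched by '^'.
    sed -e "s/^\(nodename = [^@]*@\).*/\1$ip/" \
        -e "s/^\(listener\.http\.internal = \)[^:]*:/\1$ip:/" \
        -e "s/^\(listener\.protobuf\.internal = \)[^:]*:/\1$ip:/" \
        "$conf" > "$tmp" && mv "$tmp" "$conf"
}

# Example (run from the directory that contains dev1):
#   set_bind_ip dev1/etc/riak.conf 10.202.44.206
```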
Then start the node:
#./bin/riak start
Once it has started successfully, check the node and cluster status:
#./bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev1@10.202.44.206 |valid |  up   |100.0|  --   |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
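Next, configure two more nodes, the ones under the dev2 and dev3 folders. Again, only the IP in those three settings needs to change; the ports are already assigned per node and can be left alone. Then start them (the commands below are run from the dev directory):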
#./dev2/bin/riak start
#./dev2/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev2@10.202.44.206 |valid |  up   |100.0|  --   |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
#./dev3/bin/riak start
#./dev3/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev3@10.202.44.206 |valid |  up   |100.0|  --   |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
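A node may need a moment after riak start before it answers. When the startup is scripted, a small retry loop around riak ping avoids racing ahead. This is a sketch, with retry being a hypothetical helper and the 30-attempt limit an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, one second apart,
# and succeed as soon as the command succeeds.
retry() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        # Pause only if another attempt is still to come.
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Example (run from the dev directory):
#   for n in dev1 dev2 dev3; do
#       retry 30 "./$n/bin/riak" ping || echo "$n did not answer"
#   done
```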
Now build a cluster of three Riak nodes. First, join dev1@10.202.44.206 and dev2@10.202.44.206 into one cluster.
#./dev2/bin/riak-admin cluster join dev1@10.202.44.206
Success: staged join request for 'dev2@10.202.44.206' to 'dev1@10.202.44.206'
#./dev2/bin/riak-admin cluster plan
=============================== Staged Changes ================================
Action         Details(s)
-------------------------------------------------------------------------------
join           'dev2@10.202.44.206'
-------------------------------------------------------------------------------
NOTE: Applying these changes will result in 1 cluster transition
###############################################################################
After cluster transition 1/1
###############################################################################
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid     100.0%     50.0%    'dev1@10.202.44.206'
valid       0.0%     50.0%    'dev2@10.202.44.206'
-------------------------------------------------------------------------------
Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
WARNING: Not all replicas will be on distinct nodes
Transfers resulting from cluster changes: 32
32 transfers from 'dev1@10.202.44.206' to 'dev2@10.202.44.206'
#./dev2/bin/riak-admin cluster commit
Cluster changes committed
#./dev2/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev1@10.202.44.206 |valid |  up   | 87.5| 50.0  |
|     dev2@10.202.44.206 |valid |  up   | 12.5| 50.0  |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
#./dev2/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev1@10.202.44.206 |valid |  up   | 75.0| 50.0  |
|     dev2@10.202.44.206 |valid |  up   | 25.0| 50.0  |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
#./dev2/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev1@10.202.44.206 |valid |  up   | 62.5| 50.0  |
|     dev2@10.202.44.206 |valid |  up   | 37.5| 50.0  |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
#./dev2/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev1@10.202.44.206 |valid |  up   | 50.0|  --   |
|     dev2@10.202.44.206 |valid |  up   | 50.0|  --   |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
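There are three steps: first join a cluster (naming any one node in it is enough), then run plan to review the resulting distribution and the transfers it requires (Riak performs these on its own after the commit), and finally, once everything looks right, commit. The repeated status calls above show ring ownership moving step by step toward the planned 50/50 split as the transfers complete.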
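Rather than rerunning cluster status by hand until the pending column clears, the check can be scripted. The sketch below assumes the table layout printed by this Riak 2.1.4 build (pending is the fifth column, and "--" means nothing is outstanding); handoff_pending is a hypothetical helper, and other Riak versions may format the table differently.

```shell
#!/bin/sh
# Hypothetical helper: read `riak-admin cluster status` output on stdin and
# succeed (exit 0) while any node row still shows a pending percentage.
handoff_pending() {
    awk -F'[|]' '
        # Node rows are the only rows whose first column contains "@".
        $2 ~ /@/ {
            gsub(/ /, "", $6)
            if ($6 != "--") found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example (run from the dev directory after cluster commit):
#   while ./dev2/bin/riak-admin cluster status | handoff_pending; do
#       sleep 5
#   done
#   echo "transfers finished"
```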
The same applies to dev3@10.202.44.206:
#./dev3/bin/riak-admin cluster join dev1@10.202.44.206
Success: staged join request for 'dev3@10.202.44.206' to 'dev1@10.202.44.206'
#./dev3/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+-------+-------+-----+-------+
|          node          |status | avail |ring |pending|
+------------------------+-------+-------+-----+-------+
|     dev3@10.202.44.206 |joining|  up   |  0.0|  --   |
| (C) dev1@10.202.44.206 | valid |  up   | 50.0|  --   |
|     dev2@10.202.44.206 | valid |  up   | 50.0|  --   |
+------------------------+-------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
#./dev3/bin/riak-admin cluster plan
=============================== Staged Changes ================================
Action         Details(s)
-------------------------------------------------------------------------------
join           'dev3@10.202.44.206'
-------------------------------------------------------------------------------
NOTE: Applying these changes will result in 1 cluster transition
###############################################################################
After cluster transition 1/1
###############################################################################
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      50.0%     34.4%    'dev1@10.202.44.206'
valid      50.0%     32.8%    'dev2@10.202.44.206'
valid       0.0%     32.8%    'dev3@10.202.44.206'
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
WARNING: Not all replicas will be on distinct nodes
Transfers resulting from cluster changes: 21
10 transfers from 'dev1@10.202.44.206' to 'dev3@10.202.44.206'
11 transfers from 'dev2@10.202.44.206' to 'dev3@10.202.44.206'
#./dev3/bin/riak-admin cluster commit
Cluster changes committed
#./dev3/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev1@10.202.44.206 |valid |  up   | 43.8| 34.4  |
|     dev2@10.202.44.206 |valid |  up   | 50.0| 32.8  |
|     dev3@10.202.44.206 |valid |  up   |  6.3| 32.8  |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
#./dev3/bin/riak-admin cluster status
---- Cluster Status ----
Ring ready: true
+------------------------+------+-------+-----+-------+
|          node          |status| avail |ring |pending|
+------------------------+------+-------+-----+-------+
| (C) dev1@10.202.44.206 |valid |  up   | 34.4|  --   |
|     dev2@10.202.44.206 |valid |  up   | 32.8|  --   |
|     dev3@10.202.44.206 |valid |  up   | 32.8|  --   |
+------------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
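riak-admin cluster status shows the state of the cluster. The status column is the current membership state of each node, avail indicates whether the node is reachable, and ring is the percentage of the ring each node currently holds, with pending showing the share it will hold once the transfers finish. Following the Dynamo design, Riak stores data on a consistent hashing ring. The ring is divided into partitions, each served by a virtual node, and the percentage reflects how many of those virtual nodes each node owns.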
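The percentages can be tied back to partition counts with a little arithmetic. The sketch below assumes the default ring_size of 64 (the setting is left commented out in the riak.conf above) and uses a hypothetical helper, ring_percent. With 64 partitions, the 32 transfers in the first plan are exactly half the ring, and the 10 + 11 = 21 transfers in the second plan leave dev1 with 22 partitions and dev2 and dev3 with 21 each.

```shell
#!/bin/sh
# Hypothetical helper: percentage of the ring owned by a node that
# holds the given number of partitions out of the given total.
ring_percent() {
    awk -v owned="$1" -v total="$2" 'BEGIN { printf "%.1f\n", owned / total * 100 }'
}

# Final three-node layout, assuming ring_size = 64.
ring_percent 22 64   # dev1: 32 - 10 partitions
ring_percent 21 64   # dev2: 32 - 11 partitions
ring_percent 21 64   # dev3: 10 + 11 partitions
```

Under that assumption the three calls print 34.4, 32.8, and 32.8, matching the ring column in the final status output.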