
ceph-deploy Source Code Analysis (3): the mon Module <repost>

曾晨
2023-12-01

Original: http://www.hl10502.com/2017/06/19/ceph-deploy-mon/#more

The mon.py module of ceph-deploy manages the mon (monitor) daemons.

The mon subcommand has the following format:

ceph-deploy mon [-h] {add,create,create-initial,destroy} ...

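  • create: create a mon
  • add: add a mon to an existing cluster; public_network must be configured in ceph.conf beforehand
  • create-initial: create the mons and initialize them
  • destroy: remove a mon; a mon cannot be removed if it is the only one in the cluster

When creating mons, ceph-deploy mon create-initial is the recommended command.

mon management

The make function has priority 30 and sets the mon function as the default function for the subcommand.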

@priority(30)
def make(parser):
    """
    Ceph MON Daemon management
    """
    parser.formatter_class = ToggleRawTextHelpFormatter
    mon_parser = parser.add_subparsers(dest='subcommand')
    mon_parser.required = True
    mon_add = mon_parser.add_parser(
        'add',
        help=('R|Add a monitor to an existing cluster:\n'
              '\tceph-deploy mon add node1\n'
              'Or:\n'
              '\tceph-deploy mon add --address 192.168.1.10 node1\n'
              'If the section for the monitor exists and defines a `mon addr` that\n'
              'will be used, otherwise it will fallback by resolving the hostname to an\n'
              'IP. If `--address` is used it will override all other options.')
    )
    mon_add.add_argument(
        '--address',
        nargs='?',
    )
    mon_add.add_argument(
        'mon',
        nargs=1,
    )
    mon_create = mon_parser.add_parser(
        'create',
        help=('R|Deploy monitors by specifying them like:\n'
              '\tceph-deploy mon create node1 node2 node3\n'
              'If no hosts are passed it will default to use the\n'
              '`mon initial members` defined in the configuration.')
    )
    mon_create.add_argument(
        '--keyrings',
        nargs='?',
        help='concatenate multiple keyrings to be seeded on new monitors',
    )
    mon_create.add_argument(
        'mon',
        nargs='*',
    )
    mon_create_initial = mon_parser.add_parser(
        'create-initial',
        help=('Will deploy for monitors defined in `mon initial members`, '
              'wait until they form quorum and then gatherkeys, reporting '
              'the monitor status along the process. If monitors don\'t form '
              'quorum the command will eventually time out.')
    )
    mon_create_initial.add_argument(
        '--keyrings',
        nargs='?',
        help='concatenate multiple keyrings to be seeded on new monitors',
    )
    mon_destroy = mon_parser.add_parser(
        'destroy',
        help='Completely remove Ceph MON from remote host(s)'
    )
    mon_destroy.add_argument(
        'mon',
        nargs='+',
    )
    parser.set_defaults(
        func=mon,
    )

The mon subcommand

The mon function handles four subcommands, create, add, destroy, and create-initial, which map to the functions mon_create, mon_add, mon_destroy, and mon_create_initial respectively.

def mon(args):
    if args.subcommand == 'create':
        mon_create(args)
    elif args.subcommand == 'add':
        mon_add(args)
    elif args.subcommand == 'destroy':
        mon_destroy(args)
    elif args.subcommand == 'create-initial':
        mon_create_initial(args)
    else:
        LOG.error('subcommand %s not implemented', args.subcommand)
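
The parser wiring and the dispatch step can be tried in isolation. The following is a minimal sketch rather than ceph-deploy's own code: the handle_* functions, HANDLERS, dispatch, and build_parser are hypothetical names, and only the argparse pattern (a subcommand destination plus a default func) mirrors the excerpts above.

```python
import argparse


def handle_create(args):
    # Hypothetical stand-in for mon_create
    return 'create %s' % ' '.join(args.mon)


def handle_add(args):
    # Hypothetical stand-in for mon_add; args.mon is a one-element list
    return 'add %s address=%s' % (args.mon[0], args.address)


def handle_destroy(args):
    # Hypothetical stand-in for mon_destroy
    return 'destroy %s' % ' '.join(args.mon)


def handle_create_initial(args):
    # Hypothetical stand-in for mon_create_initial
    return 'create-initial'


HANDLERS = {
    'create': handle_create,
    'add': handle_add,
    'destroy': handle_destroy,
    'create-initial': handle_create_initial,
}


def dispatch(args):
    # Same idea as mon(args): choose the handler by subcommand name
    return HANDLERS[args.subcommand](args)


def build_parser():
    parser = argparse.ArgumentParser(prog='mon-sketch')
    sub = parser.add_subparsers(dest='subcommand')
    sub.required = True
    add = sub.add_parser('add')
    add.add_argument('--address', nargs='?')
    add.add_argument('mon', nargs=1)
    create = sub.add_parser('create')
    create.add_argument('mon', nargs='*')
    destroy = sub.add_parser('destroy')
    destroy.add_argument('mon', nargs='+')
    sub.add_parser('create-initial')
    # The default func is what the top-level program would call
    parser.set_defaults(func=dispatch)
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args(['add', '--address', '192.168.1.10', 'node1'])
    print(args.func(args))
```

A dictionary lookup is used here instead of the if/elif chain purely for brevity; the behaviour for the four known subcommands is the same.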

Creating a mon

Command-line format: ceph-deploy mon create [node1] [node2] [node3] …

The mon_create function creates the mons:

  • Validate the args parameters
  • Call hosts.get to detect the operating system and version and to check that the ceph packages are installed. The supported operating systems are currently centos/debian/fedora/rhel/suse; to support another one (XenServer, for example), the hosts package is the place to start
  • Call the create function of the matching operating system module; for centos that is the mon module under hosts/centos
  • Call mon_status to check the state of the mon
  • Call catch_mon_errors to collect the mon's errors and write them to the logger

def mon_create(args):
    # Load the configuration file
    cfg = conf.ceph.load(args)
    if not args.mon:
        # No mon was given on the command line: use mon_initial_members
        # from the configuration file instead
        args.mon = get_mon_initial_members(args, error_on_empty=True, _cfg=cfg)
    if args.keyrings:
        monitor_keyring = concatenate_keyrings(args)
    else:
        keyring_path = '{cluster}.mon.keyring'.format(cluster=args.cluster)
        try:
            # Read the contents of ceph.mon.keyring
            monitor_keyring = files.read_file(keyring_path)
        except IOError:
            LOG.warning('keyring (%s) not found, creating a new one' % keyring_path)
            new_mon_keyring(args)
            monitor_keyring = files.read_file(keyring_path)
    LOG.debug(
        'Deploying mon, cluster %s hosts %s',
        args.cluster,
        ' '.join(args.mon),
        )
    errors = 0
    # Loop over the mons
    for (name, host) in mon_hosts(args.mon):
        try:
            # TODO add_bootstrap_peer_hint
            LOG.debug('detecting platform for host %s ...', name)
            # Detect the OS and version and check that the ceph packages are
            # installed; the hosts package is where support for other
            # operating systems would be added
            distro = hosts.get(
                host,
                username=args.username,
                callbacks=[packages.ceph_is_installed]
            )
            LOG.info('distro info: %s %s %s', distro.name, distro.release, distro.codename)
            rlogger = logging.getLogger(name)
            # ensure remote hostname is good to go
            hostname_is_compatible(distro.conn, rlogger, name)
            rlogger.debug('deploying mon to %s', name)
            # Create the mon through the OS-specific module under hosts,
            # e.g. the mon module under hosts/centos when the OS is centos
            distro.mon.create(distro, args, monitor_keyring)
            # tell me the status of the deployed mon
            time.sleep(2)  # give some room to start
            # Check the state of the mon
            mon_status(distro.conn, rlogger, name, args)
            # Collect the mon's errors and write them to the logger
            catch_mon_errors(distro.conn, rlogger, name, cfg, args)
            distro.conn.exit()
        except RuntimeError as e:
            LOG.error(e)
            errors += 1
    if errors:
        raise exc.GenericError('Failed to create %d monitors' % errors)
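
The keyring fallback at the top of mon_create (read the file, and if it is missing create one and read it again) can be illustrated on its own. This is a minimal sketch under stated assumptions: load_or_create_keyring and fake_new_keyring are hypothetical names, the key written is a placeholder, and the real new_mon_keyring generates an actual key.

```python
import os
import tempfile


def load_or_create_keyring(path, create):
    """Return the keyring text at path, calling create(path) first if it is missing."""
    try:
        with open(path) as f:
            return f.read()
    except IOError:
        # Same fallback as in mon_create: generate a keyring, then read it back
        create(path)
        with open(path) as f:
            return f.read()


def fake_new_keyring(path):
    # Hypothetical stand-in for new_mon_keyring; the key is a placeholder
    with open(path, 'w') as f:
        f.write('[mon.]\n\tkey = PLACEHOLDER\n\tcaps mon = "allow *"\n')


if __name__ == '__main__':
    workdir = tempfile.mkdtemp()
    keyring_path = os.path.join(workdir, 'ceph.mon.keyring')
    print(load_or_create_keyring(keyring_path, fake_new_keyring))
```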

Taking centos as an example, this is the __init__.py of the mon module under hosts/centos:

from ceph_deploy.hosts.common import mon_add as add  # noqa
from ceph_deploy.hosts.common import mon_create as create  # noqa


The mon_create function lives in the hosts/common.py module:

  • Create /etc/ceph/ceph.conf on the remote host from the ceph.conf generated by ceph-deploy new together with args
  • Write the contents of the ceph.mon.keyring generated by ceph-deploy new to a temporary keyring file under /var/lib/ceph/tmp (for example /var/lib/ceph/tmp/ceph-ceph-231.mon.keyring)
  • Create the mon and initialize its keyring data
  • Create the empty done file and set its owner to uid and gid
  • Create the init marker file (systemd) and set its owner to uid and gid
  • Enable the mon service at boot and start it

def mon_create(distro, args, monitor_keyring):
    hostname = distro.conn.remote_module.shortname()
    logger = distro.conn.logger
    logger.debug('remote hostname: %s' % hostname)
    # mon data directory, e.g. /var/lib/ceph/mon/ceph-ceph-231
    path = paths.mon.path(args.cluster, hostname)
    # uid of the owner of /var/lib/ceph
    uid = distro.conn.remote_module.path_getuid(constants.base_path)
    # gid of the group that owns /var/lib/ceph
    gid = distro.conn.remote_module.path_getgid(constants.base_path)
    # path of the done file, e.g. /var/lib/ceph/mon/ceph-ceph-231/done
    done_path = paths.mon.done(args.cluster, hostname)
    # path of the init marker file, e.g. /var/lib/ceph/mon/ceph-ceph-231/systemd
    init_path = paths.mon.init(args.cluster, hostname, distro.init)
    # contents of the ceph.conf created by ceph-deploy
    conf_data = conf.ceph.load_raw(args)
    # write the configuration file (/etc/ceph/ceph.conf)
    distro.conn.remote_module.write_conf(
        args.cluster,
        conf_data,
        args.overwrite_conf,
    )
    # if the mon path does not exist, create it and chown it to uid/gid
    distro.conn.remote_module.create_mon_path(path, uid, gid)
    logger.debug('checking for done path: %s' % done_path)
    if not distro.conn.remote_module.path_exists(done_path):
        # the done file does not exist
        logger.debug('done path does not exist: %s' % done_path)
        if not distro.conn.remote_module.path_exists(paths.mon.constants.tmp_path):
            # /var/lib/ceph/tmp does not exist
            logger.info('creating tmp path: %s' % paths.mon.constants.tmp_path)
            # create /var/lib/ceph/tmp
            distro.conn.remote_module.makedir(paths.mon.constants.tmp_path)
        # path of the temporary keyring under /var/lib/ceph/tmp
        keyring = paths.mon.keyring(args.cluster, hostname)
        logger.info('creating keyring file: %s' % keyring)
        # write the ceph.mon.keyring created by ceph-deploy new to the
        # temporary keyring file
        distro.conn.remote_module.write_monitor_keyring(
            keyring,
            monitor_keyring,
            uid, gid,
        )
        user_args = []
        if uid != 0:
            user_args = user_args + [ '--setuser', str(uid) ]
        if gid != 0:
            user_args = user_args + [ '--setgroup', str(gid) ]
        # create the mon
        remoto.process.run(
            distro.conn,
            [
                'ceph-mon',
                '--cluster', args.cluster,
                '--mkfs',
                '-i', hostname,
                '--keyring', keyring,
            ] + user_args
        )
        logger.info('unlinking keyring file %s' % keyring)
        distro.conn.remote_module.unlink(keyring)
    # create the done file (empty, owned by uid/gid) to mark the mon as created
    distro.conn.remote_module.create_done_path(done_path, uid, gid)
    # create the init marker file, owned by uid/gid
    distro.conn.remote_module.create_init_path(init_path, uid, gid)
    # start mon service
    start_mon_service(distro, args.cluster, hostname)
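
The done file is what makes mon_create safe to run twice: the mkfs step is skipped once the marker exists. A minimal local sketch of that guard follows. It assumes a plain local directory instead of a remote host, and ensure_mon_initialized and its mkfs callback are hypothetical names.

```python
import os
import tempfile


def ensure_mon_initialized(mon_dir, mkfs):
    """Run mkfs(mon_dir) only when the done marker is absent.

    Returns True if mkfs was run, False if the marker already existed.
    """
    os.makedirs(mon_dir, exist_ok=True)
    done_path = os.path.join(mon_dir, 'done')
    ran = False
    if not os.path.exists(done_path):
        mkfs(mon_dir)
        ran = True
    # Touch the marker unconditionally, as the done file is created after the check
    with open(done_path, 'a'):
        pass
    return ran


if __name__ == '__main__':
    mon_dir = os.path.join(tempfile.mkdtemp(), 'ceph-node1')
    print(ensure_mon_initialized(mon_dir, lambda d: None))
    print(ensure_mon_initialized(mon_dir, lambda d: None))
```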

Adding a mon to the cluster

Command-line format: ceph-deploy mon add [--address [ADDRESS]] mon …

The mon_add function:

  • Call the admin function of the admin module (that is, the ceph-deploy admin subcommand) to write /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring on the remote host
  • Call the add function of the mon module for the detected operating system to create the mon
  • Call catch_mon_errors to check for errors
  • Call mon_status to check the state of the mon

def mon_add(args):
    cfg = conf.ceph.load(args)
    # args.mon is a list with only one entry
    mon_host = args.mon[0]
    # Read the contents of ceph.mon.keyring
    try:
        with open('{cluster}.mon.keyring'.format(cluster=args.cluster),
                  'rb') as f:
            monitor_keyring = f.read()
    except IOError:
        raise RuntimeError(
            'mon keyring not found; run \'new\' to create a new cluster'
        )
    LOG.info('ensuring configuration of new mon host: %s', mon_host)
    args.client = args.mon
    # Call admin.admin to write /etc/ceph/ceph.conf and
    # /etc/ceph/ceph.client.admin.keyring on the remote host
    admin.admin(args)
    LOG.debug(
        'Adding mon to cluster %s, host %s',
        args.cluster,
        mon_host,
    )
    mon_section = 'mon.%s' % mon_host
    cfg_mon_addr = cfg.safe_get(mon_section, 'mon addr')
    # Determine the mon IP
    if args.address:
        LOG.debug('using mon address via --address %s' % args.address)
        mon_ip = args.address
    elif cfg_mon_addr:
        LOG.debug('using mon address via configuration: %s' % cfg_mon_addr)
        mon_ip = cfg_mon_addr
    else:
        mon_ip = net.get_nonlocal_ip(mon_host)
        LOG.debug('using mon address by resolving host: %s' % mon_ip)
    try:
        LOG.debug('detecting platform for host %s ...', mon_host)
        distro = hosts.get(
            mon_host,
            username=args.username,
            callbacks=[packages.ceph_is_installed]
        )
        LOG.info('distro info: %s %s %s', distro.name, distro.release, distro.codename)
        rlogger = logging.getLogger(mon_host)
        # ensure remote hostname is good to go
        hostname_is_compatible(distro.conn, rlogger, mon_host)
        rlogger.debug('adding mon to %s', mon_host)
        args.address = mon_ip
        # Add the mon to the cluster
        distro.mon.add(distro, args, monitor_keyring)
        # tell me the status of the deployed mon
        time.sleep(2)  # give some room to start
        # Collect the mon's errors and log them as warnings
        catch_mon_errors(distro.conn, rlogger, mon_host, cfg, args)
        # Check the state of the mon
        mon_status(distro.conn, rlogger, mon_host, args)
        distro.conn.exit()
    except RuntimeError as e:
        LOG.error(e)
        raise exc.GenericError('Failed to add monitor to host:  %s' % mon_host)
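
The address selection in mon_add follows a fixed precedence: --address first, then the mon addr value from the configuration, then hostname resolution. A minimal sketch of that precedence is shown below; choose_mon_ip is a hypothetical name, and the resolve callable stands in for net.get_nonlocal_ip so that the example needs no network access.

```python
def choose_mon_ip(cli_address, cfg_mon_addr, resolve, host):
    """Return (ip, source) using the same precedence as mon_add."""
    if cli_address:
        return cli_address, '--address'
    if cfg_mon_addr:
        return cfg_mon_addr, 'configuration'
    # Last resort: resolve the hostname to an IP
    return resolve(host), 'resolved hostname'


if __name__ == '__main__':
    # Hypothetical resolver; a real one would perform a DNS or hosts lookup
    fake_resolve = lambda host: '192.168.217.232'
    print(choose_mon_ip(None, None, fake_resolve, 'ceph-232'))
    print(choose_mon_ip('192.168.1.10', '10.0.0.5', fake_resolve, 'ceph-232'))
```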

The add function, again taking centos as an example, is imported in hosts/centos/mon/__init__.py:

from ceph_deploy.hosts.common import mon_add as add  # noqa
from ceph_deploy.hosts.common import mon_create as create  # noqa

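The mon_add function in the hosts/common.py module:

  • Create /etc/ceph/ceph.conf on the remote host from the ceph.conf generated by ceph-deploy new together with args
  • Write the contents of the ceph.mon.keyring generated by ceph-deploy new to a temporary keyring file under /var/lib/ceph/tmp (for example /var/lib/ceph/tmp/ceph-ceph-232.mon.keyring)
  • Fetch the cluster's monmap
  • Create the mon and initialize it with the monmap and keyring data
  • Create the empty done file and set its owner to uid and gid
  • Create the init marker file (systemd) and set its owner to uid and gid
  • Enable the mon service at boot and start it

def mon_add(distro, args, monitor_keyring):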
    hostname = distro.conn.remote_module.shortname()
    logger = distro.conn.logger
    path = paths.mon.path(args.cluster, hostname)
    uid = distro.conn.remote_module.path_getuid(constants.base_path)
    gid = distro.conn.remote_module.path_getgid(constants.base_path)
    monmap_path = paths.mon.monmap(args.cluster, hostname)
    done_path = paths.mon.done(args.cluster, hostname)
    init_path = paths.mon.init(args.cluster, hostname, distro.init)
    conf_data = conf.ceph.load_raw(args)
    # write the configuration file
    distro.conn.remote_module.write_conf(
        args.cluster,
        conf_data,
        args.overwrite_conf,
    )
    # if the mon path does not exist, create it
    distro.conn.remote_module.create_mon_path(path, uid, gid)
    logger.debug('checking for done path: %s' % done_path)
    if not distro.conn.remote_module.path_exists(done_path):
        logger.debug('done path does not exist: %s' % done_path)
        if not distro.conn.remote_module.path_exists(paths.mon.constants.tmp_path):
            logger.info('creating tmp path: %s' % paths.mon.constants.tmp_path)
            distro.conn.remote_module.makedir(paths.mon.constants.tmp_path)
        keyring = paths.mon.keyring(args.cluster, hostname)
        logger.info('creating keyring file: %s' % keyring)
        distro.conn.remote_module.write_monitor_keyring(
            keyring,
            monitor_keyring,
            uid, gid,
        )
        # get the monmap
        remoto.process.run(
            distro.conn,
            [
                'ceph',
                '--cluster', args.cluster,
                'mon',
                'getmap',
                '-o',
                monmap_path,
            ],
        )
        # now use it to prepare the monitor's data dir
        user_args = []
        if uid != 0:
            user_args = user_args + [ '--setuser', str(uid) ]
        if gid != 0:
            user_args = user_args + [ '--setgroup', str(gid) ]
        remoto.process.run(
            distro.conn,
            [
                'ceph-mon',
                '--cluster', args.cluster,
                '--mkfs',
                '-i', hostname,
                '--monmap',
                monmap_path,
                '--keyring', keyring,
            ] + user_args
        )
        logger.info('unlinking keyring file %s' % keyring)
        distro.conn.remote_module.unlink(keyring)
    # create the done file
    distro.conn.remote_module.create_done_path(done_path, uid, gid)
    # create init path
    distro.conn.remote_module.create_init_path(init_path, uid, gid)
    # start mon service
    start_mon_service(distro, args.cluster, hostname)
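
The only difference between the two ceph-mon --mkfs invocations (create versus add) is the --monmap argument, and both append --setuser and --setgroup only for non-root owners. A minimal sketch of the argument construction follows; build_mkfs_command is a hypothetical helper that only builds the list and does not run anything.

```python
def build_mkfs_command(cluster, hostname, keyring, uid, gid, monmap=None):
    """Build the ceph-mon --mkfs argument list used when creating or adding a mon."""
    cmd = ['ceph-mon', '--cluster', cluster, '--mkfs', '-i', hostname]
    if monmap is not None:
        # Only mon_add passes a monmap fetched from the running cluster
        cmd += ['--monmap', monmap]
    cmd += ['--keyring', keyring]
    if uid != 0:
        cmd += ['--setuser', str(uid)]
    if gid != 0:
        cmd += ['--setgroup', str(gid)]
    return cmd


if __name__ == '__main__':
    print(' '.join(build_mkfs_command(
        'ceph', 'ceph-232',
        '/var/lib/ceph/tmp/ceph-ceph-232.mon.keyring',
        167, 167,
        monmap='/var/lib/ceph/tmp/ceph.ceph-232.monmap',
    )))
```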

Removing a mon

Command-line format: ceph-deploy mon destroy [-h] mon [mon …]

The mon_destroy function calls destroy_mon for each host:

def mon_destroy(args):
    errors = 0
    for (name, host) in mon_hosts(args.mon):
        try:
            LOG.debug('Removing mon from %s', name)
            distro = hosts.get(
                host,
                username=args.username,
                callbacks=[packages.ceph_is_installed]
            )
            hostname = distro.conn.remote_module.shortname()
            # Remove the mon
            destroy_mon(
                distro.conn,
                args.cluster,
                hostname,
            )
            distro.conn.exit()
        except RuntimeError as e:
            LOG.error(e)
            errors += 1
    if errors:
        raise exc.GenericError('Failed to destroy %d monitors' % errors)

The destroy_mon function:

  • Remove the mon from the cluster
  • Stop the mon service
  • Archive the mon data by moving it into the /var/lib/ceph/mon-removed directory

def destroy_mon(conn, cluster, hostname):
    import datetime
    import time
    retries = 5
    # mon data directory, e.g. /var/lib/ceph/mon/ceph-node1
    path = paths.mon.path(cluster, hostname)
    if conn.remote_module.path_exists(path):
        # remove from cluster
        remoto.process.run(
            conn,
            [
                'ceph',
                '--cluster={cluster}'.format(cluster=cluster),
                '-n', 'mon.',
                '-k', '{path}/keyring'.format(path=path),
                'mon',
                'remove',
                hostname,
            ],
            timeout=7,
        )
        # stop
        if conn.remote_module.path_exists(os.path.join(path, 'upstart')) or system.is_upstart(conn):
            status_args = [
                'initctl',
                'status',
                'ceph-mon',
                'cluster={cluster}'.format(cluster=cluster),
                'id={hostname}'.format(hostname=hostname),
            ]
        elif conn.remote_module.path_exists(os.path.join(path, 'sysvinit')):
            status_args = [
                'service',
                'ceph',
                'status',
                'mon.{hostname}'.format(hostname=hostname),
            ]
        elif system.is_systemd(conn):
            # Stop the mon service
            status_args = [
                'systemctl',
                'stop',
                'ceph-mon@{hostname}.service'.format(hostname=hostname),
            ]
        else:
            raise RuntimeError('could not detect a supported init system, cannot continue')
        while retries:
            conn.logger.info('polling the daemon to verify it stopped')
            if is_running(conn, status_args):
                time.sleep(5)
                retries -= 1
                if retries <= 0:
                    raise RuntimeError('ceph-mon deamon did not stop')
            else:
                break
        # archive old monitor directory
        fn = '{cluster}-{hostname}-{stamp}'.format(
            hostname=hostname,
            cluster=cluster,
            stamp=datetime.datetime.utcnow().strftime("%Y-%m-%dZ%H:%M:%S"),
            )
        # Create the /var/lib/ceph/mon-removed directory
        remoto.process.run(
            conn,
            [
                'mkdir',
                '-p',
                '/var/lib/ceph/mon-removed',
            ],
        )
        # Move the mon data into /var/lib/ceph/mon-removed as an archive
        conn.remote_module.make_mon_removed_dir(path, fn)
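
Two small pieces of destroy_mon are easy to check in isolation: the bounded polling loop that waits for the daemon to stop, and the name of the archive directory. The sketch below is an illustration, not the original code. wait_until_stopped and archive_name are hypothetical names, is_running is any zero-argument callable, and the sleep function is injectable so that the example runs instantly.

```python
import datetime
import time


def wait_until_stopped(is_running, retries=5, delay=5, sleep=time.sleep):
    """Poll is_running() up to `retries` times, sleeping `delay` seconds between polls."""
    while retries:
        if not is_running():
            return
        sleep(delay)
        retries -= 1
    raise RuntimeError('daemon did not stop')


def archive_name(cluster, hostname, now):
    """Build the archive directory name in the format used by destroy_mon."""
    return '{cluster}-{hostname}-{stamp}'.format(
        cluster=cluster,
        hostname=hostname,
        stamp=now.strftime('%Y-%m-%dZ%H:%M:%S'),
    )


if __name__ == '__main__':
    states = iter([True, True, False])  # running twice, then stopped
    waits = []
    wait_until_stopped(lambda: next(states), sleep=waits.append)
    print(waits)
    print(archive_name('ceph', 'ceph-232', datetime.datetime(2017, 6, 21, 16, 46, 1)))
```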

Creating and initializing the mons

Command-line format: ceph-deploy mon create-initial [-h] [--keyrings [KEYRINGS]]

The mon_create_initial function:

  • Call mon_create to create the mons (see the create section above)
  • Check whether the set of mons in quorum exactly matches mon_initial_members in ceph.conf
  • Call the gatherkeys function of the gatherkeys module to collect the keys used to provision new nodes

def mon_create_initial(args):
    # Read mon_initial_members from ceph.conf
    mon_initial_members = get_mon_initial_members(args, error_on_empty=True)
    # create them normally through mon_create
    args.mon = mon_initial_members
    # Create the mons
    mon_create(args)
    # make the sets to be able to compare late
    mon_in_quorum = set([])
    mon_members = set([host for host in mon_initial_members])
    for host in mon_initial_members:
        mon_name = 'mon.%s' % host
        LOG.info('processing monitor %s', mon_name)
        sleeps = [20, 20, 15, 10, 10, 5]
        tries = 5
        rlogger = logging.getLogger(host)
        distro = hosts.get(
            host,
            username=args.username,
            callbacks=[packages.ceph_is_installed]
        )
        while tries:
            # Get the state of the mon
            status = mon_status_check(distro.conn, rlogger, host, args)
            has_reached_quorum = status.get('state', '') in ['peon', 'leader']
            if not has_reached_quorum:
                LOG.warning('%s monitor is not yet in quorum, tries left: %s' % (mon_name, tries))
                tries -= 1
                sleep_seconds = sleeps.pop()
                LOG.warning('waiting %s seconds before retrying', sleep_seconds)
                time.sleep(sleep_seconds)  # Magic number
            else:
                mon_in_quorum.add(host)
                LOG.info('%s monitor has reached quorum!', mon_name)
                break
        distro.conn.exit()
    # The mons in quorum exactly match mon_initial_members in ceph.conf
    if mon_in_quorum == mon_members:
        LOG.info('all initial monitors are running and have formed quorum')
        LOG.info('Running gatherkeys...')
        # Call gatherkeys.gatherkeys to collect the keys used to provision new nodes
        gatherkeys.gatherkeys(args)
    else:
        LOG.error('Some monitors have still not reached quorum:')
        for host in mon_members - mon_in_quorum:
            LOG.error('%s', host)
        raise SystemExit('cluster may not be in a healthy state')
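
The quorum test and the back-off schedule in mon_create_initial can be reasoned about without a cluster. The sketch below assumes status dictionaries shaped like the mon_status output (a state key) and uses hypothetical helper names. It also shows that because list.pop() takes from the end of the list, the waits run from shortest to longest.

```python
def in_quorum(status):
    """A mon counts as in quorum when its state is peon or leader."""
    return status.get('state', '') in ('peon', 'leader')


def missing_from_quorum(initial_members, statuses):
    """Return the initial members whose reported state is not a quorum state."""
    members = set(initial_members)
    reached = set(host for host in members if in_quorum(statuses.get(host, {})))
    return members - reached


def retry_delays(sleeps, tries):
    """Return the delays used by `tries` failed attempts, popping from the end."""
    remaining = list(sleeps)
    return [remaining.pop() for _ in range(tries)]


if __name__ == '__main__':
    statuses = {
        'ceph-231': {'state': 'leader'},
        'ceph-232': {'state': 'probing'},
    }
    print(missing_from_quorum(['ceph-231', 'ceph-232'], statuses))
    delays = retry_delays([20, 20, 15, 10, 10, 5], 5)
    print(delays, sum(delays))
```

With the values from the excerpt, a monitor that never reaches quorum is waited on for 60 seconds of sleep in total, not counting the time taken by the status checks themselves.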

Managing mons manually

Creating a mon

The manual steps below correspond to what ceph-deploy does for ceph-deploy mon create ceph-231.

Generate the fsid

[root@ceph-231 ~]# uuidgen
a3b9b0aa-01ab-4e1b-bba3-6f5317b0795b

Create ceph.conf

[root@ceph-231 ~]# vi /etc/ceph/ceph.conf
[global]
fsid = a3b9b0aa-01ab-4e1b-bba3-6f5317b0795b
mon_initial_members = ceph-231
mon_host = 192.168.217.231
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

Create ceph-ceph-231.mon.keyring

Create the keyring

[root@ceph-231 ~]# ceph-authtool --create-keyring /var/lib/ceph/tmp/ceph-ceph-231.mon.keyring --gen-key -n mon. --cap mon 'allow *'
[root@ceph-231 ~]# cat /var/lib/ceph/tmp/ceph-ceph-231.mon.keyring
[mon.]
        key = AQBzv0hZAAAAABAAJLiETzmegHWmVO7JwvsMdQ==
        caps mon = "allow *"


Change the file owner to ceph

[root@ceph-231 ~]# chown ceph:ceph /var/lib/ceph/tmp/ceph-ceph-231.mon.keyring

Create the mon

Get the uid and gid of the ceph user

[root@ceph-231 ~]# id ceph
uid=167(ceph) gid=167(ceph) groups=167(ceph)

Create the mon

[root@ceph-231 ~]# ceph-mon --cluster ceph --mkfs -i ceph-231 --keyring /var/lib/ceph/tmp/ceph-ceph-231.mon.keyring --setuser 167 --setgroup 167

Create the done file

[root@ceph-231 ~]# touch /var/lib/ceph/mon/ceph-ceph-231/done
[root@ceph-231 ~]# chown ceph:ceph /var/lib/ceph/mon/ceph-ceph-231/done


Create the init file

Check the init system

[root@ceph-231 ~]# cat /proc/1/comm
systemd

Create the systemd file

[root@ceph-231 ~]# touch /var/lib/ceph/mon/ceph-ceph-231/systemd
[root@ceph-231 ~]# chown ceph:ceph /var/lib/ceph/mon/ceph-ceph-231/systemd

Start the mon

[root@ceph-231 ~]# systemctl enable ceph.target
[root@ceph-231 ~]# systemctl enable ceph-mon@ceph-231
[root@ceph-231 ~]# systemctl start ceph-mon@ceph-231

Check the mon status

  • A rank of 0 or greater means the mon is running
  • If rank is -1, state shows the mon's current state

[root@ceph-231 ceph-ceph-231]# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-231.asok mon_status
{
    "name": "ceph-231",
    "rank": 0,
    "state": "leader",
    "election_epoch": 3,
    "quorum": [
        0
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 1,
        "fsid": "a3b9b0aa-01ab-4e1b-bba3-6f5317b0795b",
        "modified": "2017-06-20 15:53:44.533604",
        "created": "2017-06-20 15:53:44.533604",
        "mons": [
            {
                "rank": 0,
                "name": "ceph-231",
                "addr": "192.168.217.231:6789\/0"
            }
        ]
    }
}
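
The rank and state rule can be applied to the JSON output programmatically. The following is a minimal sketch that assumes the output has the shape shown in this article; describe_mon is a hypothetical helper, and the JSON literal in the example is a trimmed copy of the output.

```python
import json


def describe_mon(status_json):
    """Summarize mon_status output using the rank and state fields."""
    info = json.loads(status_json)
    name = info.get('name')
    rank = info.get('rank', -1)
    state = info.get('state', '')
    if rank >= 0:
        return 'monitor %s is running (rank %d, state %s)' % (name, rank, state)
    # rank -1: report whatever state the monitor says it is in
    return 'monitor %s has rank -1, current state: %s' % (name, state)


if __name__ == '__main__':
    sample = '{"name": "ceph-231", "rank": 0, "state": "leader", "quorum": [0]}'
    print(describe_mon(sample))
```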

Adding a mon to the cluster

Add ceph-232 as a mon to the cluster that runs on ceph-231.

Add public_network to ceph.conf

Edit ceph.conf and add public_network=192.168.217.0/24

[root@ceph-231 ~]# vi /etc/ceph/ceph.conf

Copy the configuration and keyrings

Copy the /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring files

[root@ceph-231 ~]# scp /etc/ceph/ceph.conf root@ceph-232:/etc/ceph
[root@ceph-231 ~]# scp /etc/ceph/ceph.client.admin.keyring root@ceph-232:/etc/ceph

Copy the mon keyring

[root@ceph-231 ~]# scp /var/lib/ceph/mon/ceph-ceph-231/keyring root@ceph-232:/var/lib/ceph/tmp/ceph-ceph-232.mon.keyring

Set the owner of the keyring

[root@ceph-232 ~]# chown ceph:ceph /var/lib/ceph/tmp/ceph-ceph-232.mon.keyring

Create the mon

Get the monmap for ceph-232

[root@ceph-232 ~]# ceph --cluster ceph mon getmap -o /var/lib/ceph/tmp/ceph.ceph-232.monmap

Get the uid and gid of the ceph user

[root@ceph-232 ~]# id ceph
uid=167(ceph) gid=167(ceph) groups=167(ceph)

Create the mon on ceph-232

[root@ceph-232 ~]# ceph-mon --cluster ceph --mkfs -i ceph-232 --monmap /var/lib/ceph/tmp/ceph.ceph-232.monmap --keyring /var/lib/ceph/tmp/ceph-ceph-232.mon.keyring --setuser 167 --setgroup 167

Create the done file

[root@ceph-232 ~]# touch /var/lib/ceph/mon/ceph-ceph-232/done
[root@ceph-232 ~]# chown ceph:ceph /var/lib/ceph/mon/ceph-ceph-232/done

Create the init file

Check the init system

[root@ceph-232 ~]# cat /proc/1/comm
systemd

Create the systemd file

[root@ceph-232 ~]# touch /var/lib/ceph/mon/ceph-ceph-232/systemd
[root@ceph-232 ~]# chown ceph:ceph /var/lib/ceph/mon/ceph-ceph-232/systemd

Start the mon

[root@ceph-232 ~]# systemctl enable ceph.target
[root@ceph-232 ~]# systemctl enable ceph-mon@ceph-232
[root@ceph-232 ~]# systemctl start ceph-mon@ceph-232

Check the mon status

[root@ceph-232 ~]# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-232.asok mon_status
{
    "name": "ceph-232",
    "rank": 1,
    "state": "peon",
    "election_epoch": 6,
    "quorum": [
        0,
        1
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 4,
        "fsid": "e1d69cc3-f40c-44c7-a758-fdfd4e5ef43e",
        "modified": "2017-06-21 16:46:01.518137",
        "created": "2017-06-21 16:16:34.913008",
        "mons": [
            {
                "rank": 0,
                "name": "ceph-231",
                "addr": "192.168.217.231:6789\/0"
            },
            {
                "rank": 1,
                "name": "ceph-232",
                "addr": "192.168.217.232:6789\/0"
            }
        ]
    }
}

Removing a mon

A mon cannot be removed if it is the only mon in the cluster.

[root@ceph-232 ~]# ceph --cluster=ceph -n mon. -k /var/lib/ceph/mon/ceph-ceph-232/keyring mon remove ceph-232
removing mon.ceph-232 at 192.168.217.232:6789/0, there will be 1 monitors


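Creating and initializing the mons

Create the mon

Creating the mon works the same way as in "Creating a mon" above.

Initialize

Initialization creates the keys.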

client.admin

[root@ceph-231 ~]# /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-231/keyring auth get client.admin
exported keyring for client.admin
[client.admin]
        key = AQAEI0JZupXTFRAAmFF56vYMzKkzc5nxLit6mA==
        caps mds = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"

client.bootstrap-mds

[root@ceph-231 ~]# /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-231/keyring auth get client.bootstrap-mds
exported keyring for client.bootstrap-mds
[client.bootstrap-mds]
        key = AQBjzklZ5D6oLRAAMSwQ169JjzNnPBMzIv6vCw==
        caps mon = "allow profile bootstrap-mds"

If client.bootstrap-mds does not exist in the keyring, create it

[root@ceph-231 ~]# /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-231/keyring auth get-or-create client.bootstrap-mds mon "allow profile bootstrap-mds"

client.bootstrap-osd

[root@ceph-231 ~]# /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-231/keyring auth get client.bootstrap-osd
exported keyring for client.bootstrap-osd
[client.bootstrap-osd]
        key = AQBkzklZf/NpKhAAeJwteZDEGfZj66BUnbxC1Q==
        caps mon = "allow profile bootstrap-osd"

If client.bootstrap-osd does not exist in the keyring, create it

[root@ceph-231 ~]# /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-231/keyring auth get-or-create client.bootstrap-osd mon "allow profile bootstrap-osd"

client.bootstrap-rgw

[root@ceph-231 ~]# /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-231/keyring auth get client.bootstrap-rgw
exported keyring for client.bootstrap-rgw
[client.bootstrap-rgw]
        key = AQBlzklZT3BEIRAA9w+/G+Sp6zJ+aTh+VQwTUQ==
        caps mon = "allow profile bootstrap-rgw"

If client.bootstrap-rgw does not exist in the keyring, create it

[root@ceph-231 ~]# /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-231/keyring auth get-or-create client.bootstrap-rgw mon "allow profile bootstrap-rgw"
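
The key-gathering commands all share one prefix and differ only in the auth verb, the entity, and the capabilities. The sketch below builds those argument lists; auth_get and auth_get_or_create are hypothetical helpers written for illustration, not functions of the gatherkeys module, and they only construct the command without running it.

```python
def _auth_prefix(cluster, mon_keyring, timeout=25):
    # Common prefix of the ceph auth commands shown above
    return [
        '/usr/bin/ceph',
        '--connect-timeout=%d' % timeout,
        '--cluster=%s' % cluster,
        '--name', 'mon.',
        '--keyring=%s' % mon_keyring,
        'auth',
    ]


def auth_get(cluster, mon_keyring, entity):
    """Arguments for fetching an existing key."""
    return _auth_prefix(cluster, mon_keyring) + ['get', entity]


def auth_get_or_create(cluster, mon_keyring, entity, caps):
    """Arguments for creating the key if absent; caps is a list of (service, capability)."""
    cmd = _auth_prefix(cluster, mon_keyring) + ['get-or-create', entity]
    for service, capability in caps:
        cmd += [service, capability]
    return cmd


if __name__ == '__main__':
    keyring = '/var/lib/ceph/mon/ceph-ceph-231/keyring'
    for role in ('mds', 'osd', 'rgw'):
        entity = 'client.bootstrap-%s' % role
        caps = [('mon', 'allow profile bootstrap-%s' % role)]
        print(auth_get_or_create('ceph', keyring, entity, caps))
```

Keeping the arguments as a list rather than a single string avoids shell quoting issues with capabilities that contain spaces, such as allow profile bootstrap-osd.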

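mon file layout

A mon directory such as /var/lib/ceph/mon/ceph-ceph-231 contains four entries:

  • done: an empty file that marks creation as complete
  • keyring: the mon's keys
  • store.db: the mon's database
  • systemd: an empty file that marks the init system (if init on CentOS is sysvinit, the file is named sysvinit instead)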