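Recently an anomaly showed up in production: the device's system card was taking more than 270 GB of writes per day, and under that load a system card will not last long before it fails. Colleagues on the OS team used the blktrace command and found that a process named rocksdb was flushing data to the system card very frequently, so they flatly concluded that the 270 GB was being written by our Ceph. It eventually turned out that the ceph mon database was indeed stored on the system card. To push back on the blame with evidence, I looked into ceph-kvstore-tool (it does not seem to be installed by default and has to be installed separately).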
NAME
ceph-kvstore-tool - ceph kvstore manipulation tool
SYNOPSIS
ceph-kvstore-tool <leveldb|rocksdb|bluestore-kv> <store path> command [args...]
DESCRIPTION
ceph-kvstore-tool is a kvstore manipulation tool. It allows users to manipulate leveldb/rocksdb's data (like OSD's omap) offline.
COMMANDS
ceph-kvstore-tool utility uses many commands for debugging purpose which are as follows:
list [prefix]
Print key of all KV pairs stored with the URL encoded prefix.
list-crc [prefix]
Print CRC of all KV pairs stored with the URL encoded prefix.
exists <prefix> [key]
Check if there is any KV pair stored with the URL encoded prefix. If key is also specified, check for the key with the prefix instead.
get <prefix> <key> [out <file>]
Get the value of the KV pair stored with the URL encoded prefix and key. If file is also specified, write the value to the file.
crc <prefix> <key>
Get the CRC of the KV pair stored with the URL encoded prefix and key.
get-size [<prefix> <key>]
Get estimated store size or size of value specified by prefix and key.
set <prefix> <key> [ver <N>|in <file>]
Set the value of the KV pair stored with the URL encoded prefix and key. The value could be version_t or text.
rm <prefix> <key>
Remove the KV pair stored with the URL encoded prefix and key.
rm-prefix <prefix>
Remove all KV pairs stored with the URL encoded prefix.
store-copy <path> [num-keys-per-tx]
Copy all KV pairs to another directory specified by path. [num-keys-per-tx] is the number of KV pairs copied for a transaction.
store-crc <path>
Store CRC of all KV pairs to a file specified by path.
compact
Subcommand compact is used to compact all data of kvstore. It will open the database, and trigger a database's compaction. After compaction, some disk space may be released.
compact-prefix <prefix>
Compact all entries specified by the URL encoded prefix.
compact-range <prefix> <start> <end>
Compact some entries specified by the URL encoded prefix and range.
ceph-kvstore-tool rocksdb /var/lib/ceph/mon/ceph-node1/store.db list [prefix]
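Note: if a prefix is given, only that one table is listed; if it is omitted, every table is printed. (What puzzled me is that I expected this store to be leveldb, yet passing leveldb as the store type produces an error, which suggests the mon store in this environment is actually RocksDB.)
The command above prints the following: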
auth 251
auth 252
auth 253
auth 254
auth 255
auth 256
auth 257
auth 258
auth 259
auth 260
auth 261
auth 262
auth 263
auth 264
auth 265
auth 266
auth 267
auth 268
auth 269
......
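The auth in the output above is the prefix used in the commands, and the number after it is the key.
Knowing this, I wanted to find out how many tables the environment has in total and how many keys each table holds: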
auth 256
health 9
logm 1461
mdsmap 4
mgr 127
mgr_command_descs 1
mgr_metadata 1
mgrstat 68
monitor 7
monitor_store 1
monmap 3
osdmap 1108
osd_metadata 5
osd_pg_creating 1
paxos 534
pgmap 5
As shown above, the bulk of the keys are in logm, osdmap, and paxos.
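The per-table counts above can be derived from the list output with a small pipeline. The following is a minimal sketch, assuming each line of list output has the form "<prefix> <key>" separated by whitespace, as in the listing shown earlier; count_by_prefix is a helper name made up for this example.

```shell
#!/usr/bin/env bash
# Count keys per prefix from `ceph-kvstore-tool ... list` output on stdin.
# Assumption: each line is "<prefix> <key>" separated by whitespace.
# count_by_prefix is a hypothetical helper name, not part of Ceph.
count_by_prefix() {
    # Lines with fewer than two fields (for example a "......" marker)
    # are skipped; the result is sorted for stable output.
    awk 'NF >= 2 { count[$1]++ }
         END { for (p in count) print p, count[p] }' | LC_ALL=C sort
}

# Example against the store path used in this post. The man page describes
# the tool as an offline tool, so a stopped mon or a copy of store.db is
# the safer target:
#   ceph-kvstore-tool rocksdb /var/lib/ceph/mon/ceph-node1/store.db list | count_by_prefix
```

Counting with awk rather than sort | uniq -c keeps the output in the same "prefix count" shape as the table above.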
With the key counts of the main tables known, I next wanted to know what each table actually stores, so let's analyze that:
ceph-kvstore-tool rocksdb /var/lib/ceph/mon/ceph-node1/store.db get paxos 2829123 out paxos_2829123.txt
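This writes the value of key 2829123 in the paxos table to paxos_2829123.txt. The saved data is encoded, so the ceph-dencoder tool is needed to decode it. I threw together a quick script that tries every type the tool currently knows and prints whatever decodes:
for i in $(ceph-dencoder list_types); do echo "--------------$i-----------------"; ceph-dencoder type "$i" import paxos_2829123.txt decode dump_json 2>/dev/null; done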
--------------ACLGranteeType-----------------
{
"type": 4
}
--------------ACLOwner-----------------
--------------ACLPermission-----------------
{
"flags": 4
}
--------------AuthMonitor::Incremental-----------------
--------------BitVector<2>-----------------
--------------BloomHitSet-----------------
--------------Capability-----------------
--------------CompatSet-----------------
--------------CrushWrapper-----------------
--------------DBObjectMap::State-----------------
{
"seq": 11130360536498176
}
--------------DBObjectMap::_Header-----------------
--------------DecayCounter-----------------
--------------ECSubRead-----------------
--------------ECSubReadReply-----------------
--------------ECSubWrite-----------------
--------------ECSubWriteReply-----------------
--------------ECUtil::HashInfo-----------------
--------------ECommitted-----------------
--------------EExport-----------------
--------------EFragment-----------------
--------------EImportFinish-----------------
--------------EImportStart-----------------
--------------EMetaBlob-----------------
--------------EMetaBlob::dirlump-----------------
--------------EMetaBlob::fullbit-----------------
--------------EMetaBlob::nullbit-----------------
{
"dentry": "\u0002\u0001<8B>'",
"snapid.first": 7782220156163391488,
"snapid.last": 7349874592060368751,
"dentry version": 4051045260068678773,
"dirty": "true"
}
......
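As you can see, quite a few types decode against the paxos value, and the results are a mixed bag. Keep in mind that the loop tries every type blindly, so a type that happens to decode is not necessarily what the value really contains; implausible fields such as the huge snapid values above look like misinterpreted bytes. The same method can be used to inspect the contents of every table.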
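Since most types print nothing, the output is easier to read if only the types that produce output are shown. Here is a minimal sketch of that idea. It assumes, as the output above suggests, that a type which fails to decode prints nothing on stdout; DENCODER and decode_candidates are names invented for this example, while the ceph-dencoder subcommands are the ones already used in the loop above.

```shell
#!/usr/bin/env bash
# Try every type known to the decoder against one dumped value and print
# only the types that produce output.
# DENCODER and decode_candidates are hypothetical names for this sketch.
DENCODER=${DENCODER:-ceph-dencoder}

decode_candidates() {
    local file=$1 type out
    for type in $("$DENCODER" list_types); do
        # Assumption: a type that cannot decode the file prints nothing
        # on stdout; error messages are discarded.
        out=$("$DENCODER" type "$type" import "$file" decode dump_json 2>/dev/null)
        [ -n "$out" ] || continue
        printf -- '--- %s ---\n%s\n' "$type" "$out"
    done
}

# Usage: decode_candidates paxos_2829123.txt
```

A hit from this loop is only a candidate: it shows that the bytes can be parsed as that type, not that the value was written as that type.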
ceph-kvstore-tool rocksdb /var/lib/ceph/mon/ceph-node1/store.db get-size osdmap full_96
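1. List the CRC checksums of the keys in all tables (list-crc)
2. Save all checksums to a file at a specified path (store-crc)
3. Check whether a given table, or a given key within a table, exists (exists)
4. Set the value of a key under a given prefix, either from a file or as a version number (set)
5. Remove a whole table (rm-prefix) or just one key in a table (rm)
6. Copy all KV pairs to a specified directory (store-copy)
7. Compact everything, a single table, or a key range within a table (compact, compact-prefix, compact-range)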
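Several of these operations are sketched below using the syntax from the man page quoted earlier. The wrapper only prints each command unless RUN=1 is set, so the lines can be reviewed before anything touches a store. kvtool, STORE_TYPE, STORE_PATH, RUN, and the /tmp paths are all names invented for this example; set, rm, and rm-prefix are left out on purpose because they modify the store.

```shell
#!/usr/bin/env bash
# Dry-run wrapper: prints the ceph-kvstore-tool command line instead of
# running it unless RUN=1 is set.
# kvtool, STORE_TYPE, STORE_PATH and RUN are hypothetical names; the
# subcommands are taken from the man page quoted in this post.
STORE_TYPE=${STORE_TYPE:-rocksdb}
STORE_PATH=${STORE_PATH:-/var/lib/ceph/mon/ceph-node1/store.db}

kvtool() {
    if [ "${RUN:-0}" = "1" ]; then
        ceph-kvstore-tool "$STORE_TYPE" "$STORE_PATH" "$@"
    else
        echo "ceph-kvstore-tool $STORE_TYPE $STORE_PATH $*"
    fi
}

# The /tmp paths below are placeholders for illustration only.
kvtool list-crc auth                    # 1. CRC of every key under prefix auth
kvtool store-crc /tmp/mon-store.crc     # 2. write all CRCs to a file
kvtool exists osdmap full_96            # 3. does this prefix/key exist?
kvtool get-size                         # estimated size of the whole store
kvtool store-copy /tmp/mon-store-copy   # 6. copy all KV pairs elsewhere
kvtool compact                          # 7. compact the whole store
```

Taking a store-copy first and pointing STORE_PATH at the copy is a cautious way to experiment, since the original store is then left untouched.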
Finally, this post is mainly based on:
https://blog.csdn.net/scaleqiao/article/details/51946042
https://blog.csdn.net/qq_16327997/article/details/83303755
I have expanded on them here.