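Thank you for supporting this blog. Discussion is welcome; my time and ability are limited, so mistakes are inevitable, and corrections are appreciated.
If you repost this article, please keep the author information.
Blog: http://blog.csdn.net/gaoxingnengjisuan
Email: dong.liu@siat.ac.cn
PS: I have not logged in to the blog recently and missed many comments, for which I apologize. I also rarely use QQ, so email is the better way to reach me.
For various reasons I have not written a post in over two months, and I noticed that what I had learned from reading source code was gradually fading, so I plan to return to the habit of recording and reviewing. Starting with this post, I will briefly organize what I learned while reading the swift source code (until now it existed only as comments in the source). I do not expect it to help everyone; it is mainly a record for my own later review. Mistakes in my understanding are inevitable, so please bear with me.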
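Overview:
This script audits an account, container, or object specified on the command line.
What it does depends on the argument given:
an object path audits that object;
a container path audits the container and recursively audits every object in it;
an account path audits the account, recursively audits every container in it, and in turn every object in each container.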
Examples:
/usr/bin/swift-account-audit SOSO_88ad0b83-b2c5-4fa1-b2d6-60c597202076
/usr/bin/swift-account-audit SOSO_88ad0b83-b2c5-4fa1-b2d6-60c597202076/container/object
/usr/bin/swift-account-audit -e errors.txt SOSO_88ad0b83-b2c5-4fa1-b2d6-60c597202076/container
/usr/bin/swift-account-audit < errors.txt
/usr/bin/swift-account-audit -c 25 -d < errors.txt
This is not a daemon; the script runs only when /usr/bin/swift-account-audit is invoked from the command line.
Source code analysis:
if __name__ == '__main__':
    try:
        optlist, args = getopt.getopt(sys.argv[1:], 'c:r:e:d')
    except getopt.GetoptError as err:
        print str(err)
        print usage
        sys.exit(2)
    # No targets on the command line and nothing piped in: print usage and exit.
    if not args and os.isatty(sys.stdin.fileno()):
        print usage
        sys.exit()
    opts = dict(optlist)
    options = {
        'concurrency': int(opts.get('-c', 50)),
        'error_file': opts.get('-e', None),
        'swift_dir': opts.get('-r', '/etc/swift'),
        'deep': '-d' in opts,
    }
    auditor = Auditor(**options)
    # When stdin is not a terminal, paths are also read from it, one per line.
    if not os.isatty(sys.stdin.fileno()):
        args = chain(args, sys.stdin)
    # This loop means several targets can be audited in a single invocation.
    for path in args:
        path = '/' + path.rstrip('\r\n').lstrip('/')
        # Depending on how many path segments are present, audit an object,
        # a container (and every object in it), or an account (and every
        # container and object in it).
        auditor.audit(*split_path(path, 1, 3, True))
    auditor.wait()
    auditor.print_stats()
1. Command-line option handling;
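The option handling in step 1 can be tried in isolation. The following is a minimal sketch in Python 3 (the script itself is Python 2); the helper name parse_options and the sample account AUTH_test are assumptions made for illustration, while the option string 'c:r:e:d' and the defaults are taken from the code above.

```python
import getopt


def parse_options(argv):
    """Parse swift-account-audit style flags into an options dict.

    parse_options is a hypothetical helper; the script does this inline.
    """
    # 'c:', 'r:' and 'e:' take a value; 'd' is a bare flag.
    optlist, args = getopt.getopt(argv, 'c:r:e:d')
    opts = dict(optlist)
    options = {
        'concurrency': int(opts.get('-c', 50)),
        'error_file': opts.get('-e', None),
        'swift_dir': opts.get('-r', '/etc/swift'),
        'deep': '-d' in opts,
    }
    return options, args


# AUTH_test/container is a made-up target path.
options, paths = parse_options(['-c', '25', '-d', 'AUTH_test/container'])
print(options)
print(paths)
```

Because getopt stops at the first non-option argument, everything after the flags is returned as the list of targets to audit.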
Go to 3 and look at the audit method:
def audit(self, account, container=None, obj=None):
    """
    Dispatch according to which arguments are present:
    an object is audited on its own;
    a container is audited together with every object in it;
    an account is audited together with every container in it,
    and in turn every object in each container.
    """
    # Audit a single object.
    if obj and container:
        self.pool.spawn_n(self.audit_object, account, container, obj)
    # Audit a container and, recursively, every object in it.
    elif container:
        self.pool.spawn_n(self.audit_container, account, container, True)
    # Audit an account and, recursively, every container and object in it.
    else:
        self.pool.spawn_n(self.audit_account, account, True)
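To see how a command-line path ends up on one of the three branches of audit, here is a rough Python 3 sketch. The functions split_audit_path and describe_target are hypothetical stand-ins written for this post: the first is a simplified imitation of the split_path(path, 1, 3, True) call, not swift's implementation, and the second only names the branch instead of spawning a green thread. The account and container names are made up.

```python
def split_audit_path(raw):
    """Normalize a path as the main loop does, then split it into
    (account, container, obj), padding missing parts with None.

    Simplified stand-in for split_path(path, 1, 3, True).
    """
    path = '/' + raw.rstrip('\r\n').lstrip('/')
    # Split into at most three pieces, so the object name keeps
    # any further slashes it contains.
    parts = path[1:].split('/', 2)
    if not parts[0]:
        raise ValueError('missing account in %r' % raw)
    parts += [None] * (3 - len(parts))
    account, container, obj = parts
    return account, container or None, obj or None


def describe_target(account, container=None, obj=None):
    """Mirror the branch order of audit() and name the chosen branch."""
    if obj and container:
        return 'object'
    elif container:
        return 'container'
    return 'account'


for raw in ['AUTH_test', 'AUTH_test/photos\n', '/AUTH_test/photos/2014/cat.jpg']:
    target = split_audit_path(raw)
    print(target, describe_target(*target))
```

The trailing newline in the second sample imitates a line read from stdin, which is why the main loop strips '\r\n' before splitting.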
3.1 The audit_object method audits a single specified object;
Go to 3.1 and look at the implementation of audit_object:
def audit_object(self, account, container, name):
    """
    Audit a single specified object.
    """
    # Build the full path of the object under the given account and container.
    path = '/%s/%s/%s' % (account, container, name)
    # Look up the partition and the nodes that hold replicas of
    # account/container/object (there may be several nodes, because each
    # partition has multiple replicas that can live on different nodes).
    # Returns a tuple (partition, list of node dicts); each node dict holds
    # at least id, weight, zone, ip, port, device and meta.
    part, nodes = self.object_ring.get_nodes(account, container.encode('utf-8'), name.encode('utf-8'))
    # Get the object listing of the given account and container.
    container_listing = self.audit_container(account, container)
    consistent = True
    if name not in container_listing:
        print "  Object %s missing in container listing!" % path
        consistent = False
        hash = None
    else:
        hash = container_listing[name]['hash']
    etags = []
    # Query every node that holds the partition.
    for node in nodes:
        try:
            if self.deep:
                # Deep mode: GET the object and recompute its MD5.
                conn = http_connect(node['ip'], node['port'], node['device'], part, 'GET', path, {})
                resp = conn.getresponse()
                calc_hash = md5()
                chunk = True
                while chunk:
                    chunk = resp.read(8192)
                    calc_hash.update(chunk)
                calc_hash = calc_hash.hexdigest()
                if resp.status // 100 != 2:
                    self.object_not_found += 1
                    consistent = False
                    print '  Bad status GETting object "%s" on %s/%s' % (path, node['ip'], node['device'])
                    continue
                if resp.getheader('ETag').strip('"') != calc_hash:
                    self.object_checksum_mismatch += 1
                    consistent = False
                    print '  MD5 does not match etag for "%s" on %s/%s' % (path, node['ip'], node['device'])
                etags.append(resp.getheader('ETag'))
            else:
                # Normal mode: HEAD the object and only collect its ETag.
                conn = http_connect(node['ip'], node['port'],
                                    node['device'], part, 'HEAD',
                                    path.encode('utf-8'), {})
                resp = conn.getresponse()
                if resp.status // 100 != 2:
                    self.object_not_found += 1
                    consistent = False
                    print '  Bad status HEADing object "%s" on %s/%s' % (path, node['ip'], node['device'])
                    continue
                etags.append(resp.getheader('ETag'))
        except Exception:
            self.object_exceptions += 1
            consistent = False
            print '  Exception fetching object "%s" on %s/%s' % (path, node['ip'], node['device'])
            continue
    if not etags:
        consistent = False
        # "fo" is a typo for "to" in the quoted source.
        print "  Failed fo fetch object %s at all!" % path
    elif hash:
        # Note: as quoted, the loop body tests resp and node left over from
        # the last iteration above rather than the loop variable etag, so
        # every pass repeats the same comparison.
        for etag in etags:
            if resp.getheader('ETag').strip('"') != hash:
                consistent = False
                self.object_checksum_mismatch += 1
                print '  ETag mismatch for "%s" on %s/%s' % (path, node['ip'], node['device'])
    # Append the path of an inconsistent object to the error file.
    if not consistent and self.error_file:
        print >>open(self.error_file, 'a'), path
    self.objects_checked += 1
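The two checksum checks in audit_object can be illustrated without a cluster. The Python 3 sketch below makes several assumptions: md5_of_stream and mismatched_replicas are hypothetical helpers, an in-memory io.BytesIO stands in for the HTTP response body, the node labels are invented, and the second helper assumes the intent of the final loop is to compare each replica's own ETag with the hash from the container listing.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=8192):
    """Hash a file-like object in fixed-size chunks, as deep mode does
    with the GET response body."""
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def mismatched_replicas(listing_hash, replica_etags):
    """Return the labels of replicas whose ETag differs from the hash
    recorded in the container listing.

    replica_etags is a list of (label, etag) pairs; ETag header values
    are wrapped in double quotes, so the quotes are stripped first.
    """
    return [label for label, etag in replica_etags
            if etag.strip('"') != listing_hash]


body = b'hello swift'
expected = hashlib.md5(body).hexdigest()
print(md5_of_stream(io.BytesIO(body)) == expected)

# Invented node labels in ip/device form.
replicas = [
    ('10.0.0.1/sdb1', '"%s"' % expected),
    ('10.0.0.2/sdb1', '"deadbeef"'),
]
print(mismatched_replicas(expected, replicas))
```

Reading in 8192-byte chunks keeps memory use constant regardless of object size, which matters when the audit walks a whole account with many concurrent green threads.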
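3.1.1 Build the full path of the object under the given account and container;
I had intended to put everything in a single post, but after several attempts the length made the formatting too hard to maintain, so the material is split across several posts.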