Savanna is the component that provisions Hadoop on top of OpenStack. Its workflow is as follows:
Savanna cluster startup steps:
1、 service/api.py create_cluster(values) (a sketch of the validation idea follows this step)
a) plugin.validate(cluster): validates the count of each functional node type in the cluster
b) _provision_cluster
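The check in 1.a boils down to counting how many instances carry each Hadoop process. Below is a minimal, self-contained sketch of that idea; the node-group dict layout and the specific rules are illustrative assumptions, not Savanna's exact validation code.

    # Illustrative only: node-count validation in the spirit of plugin.validate(cluster).
    class ValidationError(Exception):
        pass

    def validate_node_counts(node_groups):
        """node_groups: list of dicts like {"processes": [...], "count": N} (assumed layout)."""
        def count(process):
            return sum(ng["count"] for ng in node_groups if process in ng["processes"])

        # Example rules for a vanilla Hadoop layout (assumptions for this sketch).
        if count("namenode") != 1:
            raise ValidationError("cluster must contain exactly 1 namenode")
        if count("jobtracker") > 1:
            raise ValidationError("cluster may contain at most 1 jobtracker")
        if count("tasktracker") and not count("jobtracker"):
            raise ValidationError("tasktrackers require a jobtracker")

    # Usage example with a master group and three workers:
    validate_node_counts([
        {"processes": ["namenode", "jobtracker"], "count": 1},
        {"processes": ["datanode", "tasktracker"], "count": 3},
    ])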
2、 Corresponding to 1.b: service/api.py _provision_cluster (call order sketched after this step)
a) plugin.update_infra(cluster): does nothing, simply passes
b) instance.create_cluster(cluster)
c) plugin.configure_cluster(cluster)
d) plugin.start_cluster(cluster)
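The four calls in 2.a-2.d form a straight-line orchestration. The toy sketch below shows only that call order; the plugin class, the cluster value, and the injected create_instances callable are stand-ins, not Savanna's real modules.

    # Illustrative orchestration of 2.a-2.d with placeholder objects.
    class NoopVanillaPlugin:
        """Stand-in for the vanilla plugin; only the call order matters here."""
        def update_infra(self, cluster):        # 2.a: no-op in the vanilla plugin
            pass
        def configure_cluster(self, cluster):   # 2.c
            print("configuring", cluster)
        def start_cluster(self, cluster):       # 2.d
            print("starting", cluster)

    def provision_cluster(cluster, plugin, create_instances):
        plugin.update_infra(cluster)            # 2.a
        create_instances(cluster)               # 2.b: service/instances.py create_cluster
        plugin.configure_cluster(cluster)       # 2.c
        plugin.start_cluster(cluster)           # 2.d

    provision_cluster("demo-cluster", NoopVanillaPlugin(),
                      lambda cluster: print("spawning", cluster))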
3、 Corresponding to 2.b: service/instances.py create_cluster (see the /etc/hosts sketch after this step)
a) "Spawning": _create_instances(cluster)
i. _generate_user_data_script: prepares the user-data script that carries the key pair
ii. _run_instance: boots the VM and injects the key pair into it
b) "Waiting": _await_instances(cluster)
i. _check_if_up(instance)
1. remote.execute_command("hostname")
c) volumes.attach(cluster)
d) "Preparing": __configure_instance(cluster)
i. _generate_etc_hosts(cluster): builds the hostname-to-IP mapping
ii. remote.write_file_to('etc-hosts', hosts)
iii. remote.execute_command('sudo mv etc-hosts /etc/hosts')
iv. remote.execute_command('chmod 400 .ssh/id_rsa')
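Step 3.d distributes a shared /etc/hosts so every node can resolve the others by name. Below is a small illustrative version of the mapping built in 3.d.i; the tuple layout and the sample addresses are invented for the example.

    # Illustrative: building the /etc/hosts content that 3.d.ii/iii push into place.
    def generate_etc_hosts(instances):
        """instances: iterable of (ip, fqdn, hostname) tuples (assumed layout)."""
        lines = ["127.0.0.1 localhost"]
        for ip, fqdn, hostname in instances:
            lines.append("%s %s %s" % (ip, fqdn, hostname))
        return "\n".join(lines) + "\n"

    hosts = generate_etc_hosts([
        ("10.0.0.11", "master-001.novalocal", "master-001"),
        ("10.0.0.12", "worker-001.novalocal", "worker-001"),
    ])
    print(hosts)
    # The generated content is then written and moved into place as in 3.d.ii/iii:
    #   remote.write_file_to('etc-hosts', hosts)
    #   remote.execute_command('sudo mv etc-hosts /etc/hosts')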
4、 Corresponding to 2.c: plugins/vanilla/plugin.py configure_cluster (a *-site.xml rendering sketch follows this step)
a) __extract_configs(cluster)
b) __push_configs_to_nodes(cluster)
c) __write_hadoop_user_keys(cluster.private_key,utils.get_instances(cluster))
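Steps 4.a/4.b turn the cluster's configuration values into Hadoop *-site.xml files that are then pushed to the nodes. The toy renderer below shows the shape of that output; it is not Savanna's actual template code, and the property name and URL are example values.

    # Illustrative: rendering a dict of Hadoop properties as a *-site.xml document.
    def render_site_xml(properties):
        body = "\n".join(
            "  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>"
            % (name, value)
            for name, value in sorted(properties.items())
        )
        return '<?xml version="1.0"?>\n<configuration>\n%s\n</configuration>\n' % body

    # Example: a minimal core-site.xml pointing at an assumed namenode address.
    print(render_site_xml({"fs.default.name": "hdfs://master-001:8020"}))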
5、 Corresponding to 4.b: plugins/vanilla/plugin.py __push_configs_to_nodes (command replay sketched after this step)
a) files = {core-site.xml, mapred-site.xml, hdfs-site.xml, savanna-hadoop-init.sh}
b) sudo chown -R $USER:$USER /etc/hadoop
c) write_files_to(files)
d) sudo chmod 0500 /tmp/savanna-hadoop-init.sh
e) sudo /tmp/savanna-hadoop-init.sh >> /tmp/savanna-hadoop-init.log 2>&1
f) For the namenode: write_file_to('/etc/hadoop/dn.incl', utils.generate_fqdn_host_names(utils.get_datanodes(cluster)))
g) For the jobtracker: write_file_to('/etc/hadoop/tt.incl', utils.generate_fqdn_host_names(utils.get_tasktrackers(cluster)))
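The per-node sequence in 5.b-5.e can be replayed against a stand-in remote object to make the order explicit. FakeRemote and the sample file contents below are placeholders, not Savanna's real remote helper.

    # Illustrative replay of the commands in 5.b-5.e.
    class FakeRemote:
        """Stand-in for Savanna's remote helper; it only prints what it is asked to do."""
        def execute_command(self, cmd):
            print("RUN :", cmd)
        def write_files_to(self, files):
            print("PUSH:", ", ".join(sorted(files)))

    def push_configs(remote, files):
        remote.execute_command("sudo chown -R $USER:$USER /etc/hadoop")              # 5.b
        remote.write_files_to(files)                                                 # 5.c
        remote.execute_command("sudo chmod 0500 /tmp/savanna-hadoop-init.sh")        # 5.d
        remote.execute_command(
            "sudo /tmp/savanna-hadoop-init.sh >> /tmp/savanna-hadoop-init.log 2>&1") # 5.e

    push_configs(FakeRemote(), {
        "/etc/hadoop/core-site.xml": "<configuration/>",
        "/tmp/savanna-hadoop-init.sh": "#!/bin/bash\n",
    })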
6、 Corresponding to 4.c: plugins/vanilla/plugin.py __write_hadoop_user_keys (command replay sketched after this step)
a) id_rsa: private_key, authorized_keys: public_key
b) write_files_to(key)
c) sudo mkdir -p /home/hadoop/.ssh/
d) sudo mv id_rsa authorized_keys /home/hadoop/.ssh
e) sudo chown -R hadoop:hadoop /home/hadoop/.ssh
f) sudo chmod 600 /home/hadoop/.ssh/{id_rsa,authorized_keys}
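Likewise, 6.a-6.f amount to pushing the cluster key pair and moving it into the hadoop user's ~/.ssh with the right ownership and permissions. A hedged replay of that sequence follows; FakeRemote and the key strings are placeholders.

    # Illustrative replay of the commands in 6.a-6.f.
    class FakeRemote:
        """Stand-in for Savanna's remote helper; it only prints the actions."""
        def execute_command(self, cmd):
            print("RUN :", cmd)
        def write_files_to(self, files):
            print("PUSH:", ", ".join(sorted(files)))

    def write_hadoop_user_keys(remote, private_key, public_key):
        remote.write_files_to({"id_rsa": private_key,
                               "authorized_keys": public_key})                      # 6.a/6.b
        remote.execute_command("sudo mkdir -p /home/hadoop/.ssh/")                  # 6.c
        remote.execute_command("sudo mv id_rsa authorized_keys /home/hadoop/.ssh")  # 6.d
        remote.execute_command("sudo chown -R hadoop:hadoop /home/hadoop/.ssh")     # 6.e
        remote.execute_command(
            "sudo chmod 600 /home/hadoop/.ssh/{id_rsa,authorized_keys}")            # 6.f

    write_hadoop_user_keys(FakeRemote(),
                           "-----BEGIN RSA PRIVATE KEY----- ...",
                           "ssh-rsa AAAA... savanna")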
7、 Corresponding to 2.d: plugins/vanilla/plugin.py start_cluster, which executes commands remotely via the functions in plugins/vanilla/run_scripts.py (sketched after this step):
a) run.format_namenode(remote)
i. sudo su -c 'hadoop namenode -format' hadoop
b) run.start_process(remote, "namenode")
i. sudo su -c "/usr/sbin/hadoop-daemon.sh start namenode" hadoop
c) run.start_process(remote, "secondarynamenode")
d) run.start_process(dn.remote, "datanode")
e) run.start_process(jt_instance.remote, "jobtracker")
f) run.start_process(tt.remote, "tasktracker")
g) __set_cluster_info(cluster): once the cluster has started successfully, records the relevant cluster information.
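The run_scripts helpers in 7.a-7.f all reduce to running a command as the hadoop user over the remote connection. The illustrative versions below mirror the command strings in 7.a.i and 7.b.i; FakeRemote stands in for Savanna's remote object.

    # Illustrative versions of the run_scripts helpers used in step 7.
    class FakeRemote:
        """Stand-in for Savanna's remote object; it only prints the commands."""
        def execute_command(self, cmd):
            print("RUN:", cmd)

    def format_namenode(remote):
        # 7.a.i: format HDFS as the hadoop user.
        remote.execute_command("sudo su -c 'hadoop namenode -format' hadoop")

    def start_process(remote, process):
        # 7.b.i: start a Hadoop daemon as the hadoop user.
        remote.execute_command(
            "sudo su -c '/usr/sbin/hadoop-daemon.sh start %s' hadoop" % process)

    r = FakeRemote()
    format_namenode(r)                      # 7.a
    for proc in ("namenode", "secondarynamenode", "datanode",
                 "jobtracker", "tasktracker"):
        start_process(r, proc)              # 7.b-7.f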