
Deploying k8s with juju+maas on ubuntu20.04, part 14: removing and re-adding an etcd node

丁鹏鹍
2023-12-01

References:

etcd charm

etcdctl x509 error

Using etcd

Deploying a production cluster with Ubuntu Charmed Kubernetes

Because etcd/2 was installed on the wrong node, I wanted to remove etcd from that node and recreate it.

Remove the node

juju remove-unit etcd/2 --force --no-wait

Recreate the node

juju add-machine --constraints tags=etcd
machine 13
juju add-unit etcd --to 13

Show the status

juju status
etcd/0*                   active    idle   1        10.0.4.139      2379/tcp        UnHealthy with 3 known peers
etcd/3                    waiting   idle   17       10.0.4.153                      Waiting to retry etcd registration

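After some investigation, it appears that the old etcd/2 was never removed from etcd's own member list, so the new unit cannot join the cluster.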

Get the credentials:

juju run-action --wait etcd/0 package-client-credentials
juju scp etcd/0:etcd_credentials.tar.gz etcd_credentials.tar.gz

Extract the archive:

tar  -zxvf  etcd_credentials.tar.gz
etcd_credentials/
etcd_credentials/ca.crt
etcd_credentials/README.txt
etcd_credentials/client.crt
etcd_credentials/client.key

Change to the /root/etcd_credentials/ directory

cd /root/etcd_credentials/

Set the environment variables by hand. Because the etcd version is 3.4.5, they take the following form:

juju expose etcd
export ETCDCTL_KEY=$(pwd)/client.key
export ETCDCTL_CERT=$(pwd)/client.crt
export ETCDCTL_CACERT=$(pwd)/ca.crt
export ETCDCTL_API=3  # otherwise an x509 error occurs
export ETCDCTL_ENDPOINT=https://10.0.4.139:2379       # IP of etcd/0

List the members:

etcdctl member list

This produced the following error:

Error:  dial tcp 127.0.0.1:2379: connect: connection refused

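Looking into the command, the cause is the endpoint setting: with the v3 API, etcdctl reads ETCDCTL_ENDPOINTS (plural), so the ETCDCTL_ENDPOINT variable set above is ignored and the client falls back to 127.0.0.1:2379. Passing the endpoint explicitly with --endpoints=https://10.0.4.139:2379 solves it.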

# 10.0.4.139 is the IP of the etcd unit that holds leadership, currently etcd/0.
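As an alternative to passing the flag on every call, the endpoint can be set through the environment. The lines below are a minimal sketch, assuming etcdctl v3 derives each variable from its flag name (so --endpoints maps to ETCDCTL_ENDPOINTS) and assuming the shell is in the directory where the credentials were unpacked:

```shell
# Assumption: the current directory holds ca.crt, client.crt and client.key.
export ETCDCTL_API=3
export ETCDCTL_CACERT="$(pwd)/ca.crt"
export ETCDCTL_CERT="$(pwd)/client.crt"
export ETCDCTL_KEY="$(pwd)/client.key"
# Plural ENDPOINTS mirrors the --endpoints flag; 10.0.4.139 is etcd/0 here.
export ETCDCTL_ENDPOINTS="https://10.0.4.139:2379"
```

With these exported, a plain etcdctl member list should reach etcd/0 instead of 127.0.0.1.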

For example, to check the endpoint's health:

 etcdctl --endpoints=https://10.0.4.139:2379 endpoint health

https://10.0.4.139:2379 is healthy: successfully committed proposal: took = 8.609559ms

List the members:

 etcdctl --endpoints=https://10.0.4.139:2379 member list
bb605e8c9ebece4, started, etcd0, https://10.0.4.139:2380, https://10.0.4.139:2379
54bba7baf27ccef7, started, etcd1, https://10.0.4.140:2380, https://10.0.4.140:2379
defc4e8a9c8f25bc, started, etcd2, https://10.0.4.145:2380, https://10.0.4.145:2379

The entry defc4e8a9c8f25bc, started, etcd2, https://10.0.4.145:2380, https://10.0.4.145:2379 is the etcd node that was already removed from juju; it still has to be removed with etcdctl.
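Spotting the leftover entry by eye works for three members. For a larger list, the comparison can be sketched as a small filter. The helper name stale_members is hypothetical, and it assumes the comma-separated member list format shown above, with the peer URL in the fourth field:

```shell
# stale_members LIVE_IP...
# Reads `etcdctl member list` output on stdin and prints "ID NAME" for every
# member whose peer URL host is not one of the LIVE_IP arguments.
stale_members() {
  awk -F', ' -v live="$*" '
    BEGIN {
      n = split(live, ips, " ")
      for (i = 1; i <= n; i++) ok[ips[i]] = 1
    }
    NF >= 4 {
      host = $4
      sub(/^https?:\/\//, "", host)   # drop the URL scheme
      sub(/:[0-9]+$/, "", host)       # drop the port
      if (!(host in ok)) print $1, $3
    }
  '
}
```

For the cluster in this post, the call would be etcdctl --endpoints=https://10.0.4.139:2379 member list | stale_members 10.0.4.139 10.0.4.140 10.0.4.153, where the IP arguments are the addresses of the etcd units that juju status still reports.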

Remove the stale etcd member:

 etcdctl --endpoints=https://10.0.4.139:2379 member remove defc4e8a9c8f25bc
Member defc4e8a9c8f25bc removed from cluster 22b26385f89f7fa8

A little later, the new etcd unit had joined the etcd cluster:

juju status
etcd/0*                   active    idle   1        10.0.4.139      2379/tcp        Healthy with 2 known peers
  filebeat/2              active    idle            10.0.4.139                      Filebeat ready.
etcd/1                    active    idle   2        10.0.4.140      2379/tcp        Healthy with 3 known peers
  filebeat/1              active    idle            10.0.4.140                      Filebeat ready.
etcd/4                    active    idle   17       10.0.4.153      2379/tcp        Healthy with 3 known peers
  filebeat/10             active    idle            10.0.4.153                      Filebeat ready.
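Instead of re-running the status check by hand while waiting for the new unit to register, the wait can be scripted. This is only a sketch: retry_until and new_member_started are hypothetical helper names, and the check assumes the new unit's address is 10.0.4.153 as in the status output above.

```shell
# retry_until TRIES DELAY CMD...
# Runs CMD up to TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
retry_until() {
  tries=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Succeeds once 10.0.4.153 is listed as a started member.
# The pattern ", started, " deliberately does not match "unstarted".
new_member_started() {
  etcdctl --endpoints=https://10.0.4.139:2379 member list \
    | grep -q ', started, .*10\.0\.4\.153'
}
```

A call such as retry_until 30 10 new_member_started would then poll every 10 seconds for up to about five minutes.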