1. Adjust the StorageClass parameters
rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool              # RBD pool name; change as needed
  namespace: rook-ceph
spec:
  failureDomain: host            # failure isolation at the host level by default
  replicated:                    # three replicas to protect data availability
    size: 3
    requireSafeReplicaSize: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  pool: replicapool              # must match the CephBlockPool name above
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/fstype: ext4   # xfs is also an option
allowVolumeExpansion: true
reclaimPolicy: Delete
2. Deploy the StorageClass
cd ~/rook/cluster/examples/kubernetes/ceph/csi/rbd
kubectl apply -f storageclass.yaml
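As a quick sanity check before the WordPress example below, a minimal PVC sketch that consumes the new StorageClass (the PVC name test-rbd-pvc is hypothetical):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-rbd-pvc             # hypothetical name, used only for this check
spec:
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Once applied, kubectl get pvc test-rbd-pvc should report Bound after the CSI provisioner creates the backing RBD image.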
3. Deploy WordPress to verify the StorageClass
cd ~/rook/cluster/examples/kubernetes
kubectl apply -f mysql.yaml -f wordpress.yaml
kubectl get pvc
kubectl get pv
kubectl -n rook-ceph exec -it rook-ceph-tools-6dd46f6946-p66xl bash
ceph osd lspools
rbd -p replicapool ls
1. Adjust the CephFS deployment parameters; resource limits, scheduling, and other parameters can also be tuned as needed
rook/cluster/examples/kubernetes/ceph/filesystem.yaml
metadataServer:
  activeCount: 2        # two active MDS daemons, each with a standby, for high availability
  activeStandby: true
2. Configure node affinity for the MDS service. Since MDS is CPU-intensive, dedicate nodes to it so the service runs on dedicated machines (the combined layout is sketched after the placement block below)
kubectl label node rook01 ceph-mds=enable
kubectl label node rook02 ceph-mds=enable
kubectl label node rook03 ceph-mds=enable
kubectl label node rook04 ceph-mds=enable
placement:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: ceph-mds
          operator: In
          values:
          - enable
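In the upstream filesystem.yaml the placement block sits under spec.metadataServer, so the two fragments above combine roughly as follows (a sketch, not the full file):
spec:
  metadataServer:
    activeCount: 2
    activeStandby: true
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mds
              operator: In
              values:
              - enable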
3. Deploy CephFS
kubectl apply -f rook/cluster/examples/kubernetes/ceph/filesystem.yaml
4. Deploy the CephFS-based StorageClass and consume it from Kubernetes
cd rook/cluster/examples/kubernetes/ceph/csi/cephfs
kubectl apply -f storageclass.yaml
kubectl apply -f kube-registry.yaml
kubectl -n kube-system get pvc|grep cephfs
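For workloads other than the kube-registry example, a minimal PVC sketch against the CephFS StorageClass (the StorageClass name rook-cephfs follows the upstream example; the PVC name is hypothetical):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-cephfs-pvc
spec:
  storageClassName: rook-cephfs
  accessModes:
  - ReadWriteMany            # CephFS supports shared read-write access across pods
  resources:
    requests:
      storage: 1Gi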
5. Mounting CephFS from outside the cluster
mon_endpoints=$(grep mon_host /etc/ceph/ceph.conf | awk '{print $3}')
my_secret=$(grep key /etc/ceph/keyring | awk '{print $3}')
# Mount the filesystem
mount -t ceph -o mds_namespace=myfs,name=admin,secret=$my_secret $mon_endpoints:/ /tmp/registry
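Note that the mount point must exist before running the mount command (mkdir -p /tmp/registry). A minimal check sketch after mounting:
df -h /tmp/registry                                    # confirm the CephFS mount is present
touch /tmp/registry/mount-test && ls -l /tmp/registry  # confirm it is writable
umount /tmp/registry                                   # unmount when finished testing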
6. CephFS operational notes
csi-attacher: volume attach/detach
csi-snapshotter: snapshots
csi-resizer: volume expansion
csi-provisioner: volume provisioning
csi-cephfsplugin: per-node driver agent
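These sidecars run inside the CSI provisioner and per-node plugin pods; a quick inspection sketch, assuming the default Rook CSI labels (pod names vary per cluster):
kubectl -n rook-ceph get pods -l app=csi-cephfsplugin-provisioner   # provisioner pods carrying the attacher/snapshotter/resizer/provisioner sidecars
kubectl -n rook-ceph get pods -l app=csi-cephfsplugin               # per-node driver agent pods (DaemonSet)
kubectl -n rook-ceph logs <csi-cephfsplugin-provisioner-pod> -c csi-provisioner --tail=50   # inspect one sidecar's logs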
1. Adjust the RGW deployment parameters to deploy a highly available RGW cluster: rook/cluster/examples/kubernetes/ceph/object.yaml
instances: 2
placement:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: ceph-rgw
          operator: In
          values:
          - enabled
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - rook-ceph-rgw
        topologyKey: kubernetes.io/hostname
2. Configure node affinity for the RGW service; dedicate nodes to it so the service runs on dedicated machines
kubectl label node rook01 ceph-rgw=enabled
kubectl label node rook02 ceph-rgw=enabled
kubectl label node rook03 ceph-rgw=enabled
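A deploy-and-verify sketch once the nodes are labeled (the file and service names follow the upstream example; verify them in your tree):
kubectl apply -f rook/cluster/examples/kubernetes/ceph/object.yaml
kubectl -n rook-ceph get pods -l app=rook-ceph-rgw -o wide   # two rgw pods, expected on different labeled nodes
kubectl -n rook-ceph get svc rook-ceph-rgw-my-store          # in-cluster service for the object store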
3. Have Rook manage an external RGW cluster (optional)
rook/cluster/examples/kubernetes/ceph/object-external.yaml
4. Create a bucket (via a Kubernetes StorageClass)
# Create the StorageClass
kubectl apply -f rook/cluster/examples/kubernetes/ceph/storageclass-bucket-delete.yaml
# Create a bucket claim to verify
kubectl apply -f rook/cluster/examples/kubernetes/ceph/object-bucket-claim-delete.yaml
# Check the ObjectBucketClaim
kubectl get objectbucketclaims.objectbucket.io
radosgw-admin bucket list
5. Accessing the object store from inside a container
1> Get the access information
# Get the access endpoint
kubectl get cm ceph-delete-bucket -o jsonpath='{.data.BUCKET_HOST}'|awk '{print $1":80"}'
kubectl get cm ceph-delete-bucket -o jsonpath='{.data.BUCKET_HOST}'|awk '{print $1":80/%(bucket)"}'
# Get the access key / secret key
kubectl get secrets ceph-delete-bucket -o jsonpath='{.data.AWS_ACCESS_KEY_ID}'|base64 -d
kubectl get secrets ceph-delete-bucket -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}'|base64 -d
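A small convenience sketch, consistent with the commands above, for capturing these values into shell variables (the variable names are only for illustration):
export AWS_HOST=$(kubectl get cm ceph-delete-bucket -o jsonpath='{.data.BUCKET_HOST}')
export AWS_ACCESS_KEY_ID=$(kubectl get secret ceph-delete-bucket -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
export AWS_SECRET_ACCESS_KEY=$(kubectl get secret ceph-delete-bucket -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
echo "$AWS_HOST $AWS_ACCESS_KEY_ID"    # values to feed into s3cmd --configure below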
2> Configure the client inside the container
kubectl -n rook-ceph exec -it rook-ceph-tools-6dd46f6946-j6qnp bash
yum -y install s3cmd
s3cmd --configure
Access Key: 8014201KV9K0E6RFR7OW
Secret Key: gYBzNuIRxXpSy9N24V4slKC7Lm6YBTNwsQdNucBv
Default Region: US
S3 Endpoint: rook-ceph-rgw-my-store.rook-ceph.svc:80
DNS-style bucket+hostname:port template for accessing a bucket: rook-ceph-rgw-my-store.rook-ceph.svc:80/%(bucket)
Encryption password:
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: False
HTTP Proxy server name:
HTTP Proxy server port: 0
3> Upload/download test
s3cmd ls
s3cmd put /etc/passwd* s3://ceph-bkt-84449e86-0fcb-4436-bd98-dbcc4e4bde0e
s3cmd ls s3://ceph-bkt-84449e86-0fcb-4436-bd98-dbcc4e4bde0e
s3cmd get s3://ceph-bkt-84449e86-0fcb-4436-bd98-dbcc4e4bde0e/passwd
6. Accessing the Rook object store from outside the cluster
1> Expose a NodePort or Ingress and check the NodePort
kubectl apply -f rook/cluster/examples/kubernetes/ceph/rgw-external.yaml
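To find the assigned NodePort after applying the manifest (assuming the external service name used by the upstream example):
kubectl -n rook-ceph get svc rook-ceph-rgw-my-store-external
# The NodePort appears in the PORT(S) column, e.g. 80:32172/TCP; combine it with any node IP to form the S3 endpoint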
2> Configure the client
yum -y install s3cmd
s3cmd --configure
New settings:
Access Key: 8014201KV9K0E6RFR7OW
Secret Key: gYBzNuIRxXpSy9N24V4slKC7Lm6YBTNwsQdNucBv
Default Region: US
S3 Endpoint: 192.168.86.36:32172
DNS-style bucket+hostname:port template for accessing a bucket: 192.168.86.36:32172/%(bucket)
Encryption password:
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: False
HTTP Proxy server name:
HTTP Proxy server port: 0
7. Creating a dedicated user for object store access
1> Create the user
kubectl apply -f rook/cluster/examples/kubernetes/ceph/object-user.yaml
2> Get the user's credentials
kubectl -n rook-ceph get secrets rook-ceph-object-user-my-store-my-user -o jsonpath='{.data.AccessKey}'|base64 -d
kubectl -n rook-ceph get secrets rook-ceph-object-user-my-store-my-user -o jsonpath='{.data.SecretKey}'|base64 -d
kubectl -n rook-ceph get secrets rook-ceph-object-user-my-store-my-user -o jsonpath='{.data.Endpoint}'|base64 -d
3> Configure the client as in the previous steps
8. RGW maintenance
1> Watch the state of the rook-ceph-rgw-my-store pods and keep them spread across nodes rather than co-located on the same node
2> Keep spare nodes available so the pods have somewhere to go when a node fails
3> Watch the rgw status in the output of ceph -s
4> For debugging, check the logs of the rook-ceph-rgw-my-store pods and the rook-ceph-mgr pod
1. Adding OSDs
1> Add disks or nodes directly
#!/bin/sh
## After adding disks, use this script to rescan the SCSI bus and detect them without rebooting the server
scsihostnum=$(ls -d /sys/class/scsi_host/host* | wc -l)
for ((i=0; i<${scsihostnum}; i++))
do
    echo "- - -" > /sys/class/scsi_host/host${i}/scan
done
Add the new disks or node entries to the corresponding node section of rook/cluster/examples/kubernetes/ceph/cluster.yaml, then apply the file:
kubectl apply -f rook/cluster/examples/kubernetes/ceph/cluster.yaml
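A quick verification sketch after applying the updated cluster.yaml (assumes the toolbox deployment is named rook-ceph-tools, as earlier in this document):
kubectl -n rook-ceph get pods -l app=rook-ceph-osd -o wide               # new rook-ceph-osd-N pods should appear
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd tree    # the new OSDs should show up and be up/in
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s          # watch backfilling until the cluster returns to HEALTH_OK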
2. Using BlueStore with a dedicated metadata device for acceleration (not verified)
- name: "sdd"
  config:
    storeType: bluestore
    metadataDevice: "/dev/sde"
    databaseSizeMB: "4096"
    walSizeMB: "4096"
3. Removing an OSD
Notes:
Make sure the cluster still has enough capacity after the OSD is removed
Make sure PG status returns to normal after the OSD is removed
Avoid removing too many OSDs in a single operation
When removing several OSDs, wait for data to finish resynchronizing (rebalancing) before removing the next one; see the check sketch below
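One way to perform these checks from the toolbox pod before each removal (a sketch; OSD id 5 matches the example below):
ceph -s                       # wait for HEALTH_OK and for backfilling/recovery to finish
ceph osd safe-to-destroy 5    # reports whether OSD 5 can be removed without risking data
ceph df                       # confirm enough free capacity remains after the removal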
1> Remove via Kubernetes
kubectl -n rook-ceph scale deploy rook-ceph-osd-5 --replicas=0
Edit the OSD number in the --osd-ids field of osd-purge.yaml:
args: ["ceph", "osd", "remove", "--preserve-pvc", "false", "--osd-ids", "5"]
kubectl apply -f rook/cluster/examples/kubernetes/ceph/osd-purge.yaml
kubectl -n rook-ceph delete deployments.apps rook-ceph-osd-5
2> Remove manually
# Mark the OSD out; data starts migrating off it (backfilling/rebalancing)
ceph osd out osd.5
# Once the migration has finished, purge the OSD: this removes it from the CRUSH map, deletes its auth key, and removes the OSD entry
ceph osd purge 5
# Delete the corresponding deployment and remove the device/node entry from cluster.yaml
kubectl -n rook-ceph delete deployments.apps rook-ceph-osd-5
4. Replacing an OSD
The idea is to remove the OSD from the Ceph cluster (either the cloud-native way or manually), wait for the data to finish resynchronizing, and then add the disk back through the capacity-expansion procedure; when adding it back, remember to remove the leftover LVM volumes on the device first, as in the sketch below.
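A minimal cleanup sketch for the replaced disk, run on the node that owns it (/dev/sdX and the device-mapper name are placeholders; double-check the device before wiping):
lsblk -f                          # identify the leftover ceph LVM volume on the old device
ls /dev/mapper/ | grep ceph       # ceph-volume LVs usually appear as ceph--<vg>--<lv>
dmsetup remove <ceph-dm-name>     # remove the stale device-mapper entry (placeholder name)
wipefs -a /dev/sdX                # clear filesystem/LVM signatures
sgdisk --zap-all /dev/sdX         # zap partition tables so Rook can reuse the disk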