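Starting with Kubernetes 1.8, horizontal autoscaling matured: HPA gained support for custom metrics, Cluster Autoscaler improved its performance and error reporting, a new version of the HPA API is supported, and the related APIs and components, such as the resource Metrics API, the custom metrics API, and metrics-server, graduated to beta. This means Metrics Server is ready for use. This article describes how to deploy Metrics Server on Kubernetes.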
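Resource usage metrics such as CPU and memory usage are available through the Metrics API. Users can read them directly, for example with the kubectl top command (node and pod metrics). The Metrics API reports the current resource usage of nodes and pods, but it does not store the values, so historical data cannot be retrieved from it. To keep history, the metrics have to be saved in a time-series database.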
API Endpoint: /apis/metrics.k8s.io/
API definition:
https://github.com/kubernetes/metrics/blob/master/pkg/apis/metrics/v1beta1/types.go
Note: the API is only available when Metrics Server is deployed in the cluster.
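Besides kubectl top, the endpoint can be queried directly with kubectl get --raw. The following is a minimal sketch; the helper name metrics_api_path is hypothetical, and the v1beta1 paths are taken from the API version referenced above.

```shell
# Hypothetical helper: build the Metrics API path for nodes, for all pods,
# or for the pods of one namespace.
metrics_api_path() {
  kind="$1"
  ns="${2:-}"
  base="/apis/metrics.k8s.io/v1beta1"
  case "$kind" in
    nodes)
      printf '%s/nodes\n' "$base"
      ;;
    pods)
      if [ -n "$ns" ]; then
        printf '%s/namespaces/%s/pods\n' "$base" "$ns"
      else
        printf '%s/pods\n' "$base"
      fi
      ;;
    *)
      echo "unknown kind: $kind" >&2
      return 1
      ;;
  esac
}

# Example usage against a cluster (left as comments so the block runs anywhere):
#   kubectl get --raw "$(metrics_api_path nodes)"
#   kubectl get --raw "$(metrics_api_path pods kube-system)"
```

The response is JSON, so it can be piped into any JSON tool for further processing.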
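Metrics Server collects resource usage information across the cluster from the metrics exposed by the kubelet on each node. It requires the aggregation layer to be enabled, so add the following options to the API server startup flags: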
--requestheader-client-ca-file={{ var_ssl_ca_dir }}/{{ var_ssl_file_ca_pem }} \
--requestheader-allowed-names= \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--proxy-client-cert-file={{ var_ssl_k8s_dir }}/{{ var_ssl_aggregator_cert_prefix }}.pem \
--proxy-client-key-file={{ var_ssl_k8s_dir }}/{{ var_ssl_aggregator_cert_prefix }}-key.pem \
--enable-aggregator-routing=true
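To confirm that a running API server was actually started with these options, its command line can be checked for each flag. This is a rough sketch, not a definitive check: the function name missing_aggregator_flags is hypothetical, it only recognizes the --flag=value form used above, and the usage line assumes kube-apiserver runs as a host process visible to ps.

```shell
# Hypothetical helper: print every aggregation-layer flag from the list above
# that does not appear (as --flag=...) in the given command line.
missing_aggregator_flags() {
  cmdline="$1"
  for flag in \
    --requestheader-client-ca-file \
    --requestheader-allowed-names \
    --requestheader-extra-headers-prefix \
    --requestheader-group-headers \
    --requestheader-username-headers \
    --proxy-client-cert-file \
    --proxy-client-key-file \
    --enable-aggregator-routing
  do
    case "$cmdline" in
      *"$flag="*) ;;                 # flag present, nothing to report
      *) printf '%s\n' "$flag" ;;    # flag missing
    esac
  done
}

# Example usage on the master host (left as a comment so the block runs anywhere):
#   missing_aggregator_flags "$(ps -o args= -C kube-apiserver)"
```

An empty result means all eight options were found; any printed flag still has to be added.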
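Note:
The environment used here is Kubernetes 1.17.2 (quick environment setup is covered in a separate article). Before Metrics Server is deployed, the node is Ready but kubectl top fails: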
[root@host131 ansible]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
192.168.163.131 Ready <none> 54s v1.17.2 192.168.163.131 <none> CentOS Linux 7 (Core) 3.10.0-957.el7.x86_64 docker://19.3.5
[root@host131 ansible]#
[root@host131 ansible]# kubectl top node
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
[root@host131 ansible]#
[root@host131 ansible]# kubectl top pod
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
[root@host131 ansible]#
Run: git clone https://github.com/kubernetes-incubator/metrics-server
[root@host131 ~]# git clone https://github.com/kubernetes-incubator/metrics-server
Cloning into 'metrics-server'...
remote: Enumerating objects: 29, done.
remote: Counting objects: 100% (29/29), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 11484 (delta 10), reused 23 (delta 9), pack-reused 11455
Receiving objects: 100% (11484/11484), 12.24 MiB | 1.06 MiB/s, done.
Resolving deltas: 100% (5982/5982), done.
[root@host131 ~]#
Change into the deploy/1.8+ directory and adjust the deployment manifest (metrics-server-deployment.yaml) as needed:
[root@host131 ~]# cd metrics-server/deploy/1.8+/
[root@host131 1.8+]# ls
aggregated-metrics-reader.yaml auth-reader.yaml metrics-server-deployment.yaml resource-reader.yaml
auth-delegator.yaml metrics-apiservice.yaml metrics-server-service.yaml
[root@host131 1.8+]# ls -l metrics-server-deployment.yaml
-rw-r--r-- 1 root root 1183 Jan 31 17:50 metrics-server-deployment.yaml
[root@host131 1.8+]#
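Make the following change (imagePullPolicy from Always to IfNotPresent, so that a locally present image is used):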
[root@host131 1.8+]# cp -p metrics-server-deployment.yaml metrics-server-deployment.yaml.org
[root@host131 1.8+]# vi metrics-server-deployment.yaml
[root@host131 1.8+]# diff metrics-server-deployment.yaml metrics-server-deployment.yaml.org
44c44
< imagePullPolicy: IfNotPresent
---
> imagePullPolicy: Always
[root@host131 1.8+]#
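The manual edit above can also be scripted. The sketch below keeps the same .org backup convention; the function name use_local_image is hypothetical, and it assumes the manifest contains the literal text imagePullPolicy: Always, as in the diff.

```shell
# Hypothetical helper: back up the manifest as <file>.org, then rewrite
# imagePullPolicy from Always to IfNotPresent in the original file.
use_local_image() {
  manifest="$1"
  cp -p "$manifest" "$manifest.org"
  sed 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/' \
    "$manifest.org" > "$manifest"
}

# Example usage in deploy/1.8+ (left as comments so the block runs anywhere):
#   use_local_image metrics-server-deployment.yaml
#   diff metrics-server-deployment.yaml metrics-server-deployment.yaml.org
```

Writing from the backup into the original avoids sed's in-place option, whose syntax differs between implementations.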
Make sure the metrics-server image is present locally:
[root@host131 1.8+]# grep image: metrics-server-deployment.yaml
image: k8s.gcr.io/metrics-server-amd64:v0.3.6
[root@host131 1.8+]# docker images |grep metrics-server
k8s.gcr.io/metrics-server-amd64 v0.3.6 9dd718864ce6 3 months ago 39.9MB
[root@host131 1.8+]#
Note: if the node can pull the image directly, this change and the local image check can be skipped.
Run: kubectl create -f .
[root@host131 1.8+]# pwd
/root/metrics-server/deploy/1.8+
[root@host131 1.8+]# kubectl create -f .
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
[root@host131 1.8+]#
[root@host131 1.8+]# kubectl get pod -A |grep metrics
kube-system metrics-server-5cc8d5c4df-ms2ts 1/1 Running 0 31s
[root@host131 1.8+]# kubectl get deployment -A |grep metrics
kube-system metrics-server 1/1 1 1 40s
[root@host131 1.8+]# kubectl get service -A |grep metrics
kube-system metrics-server ClusterIP 10.254.197.79 <none> 443/TCP 47s
[root@host131 1.8+]#
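In addition to the pod, deployment, and service, it can be useful to check that the v1beta1.metrics.k8s.io APIService created above reports itself as available, since kubectl top depends on it. A minimal sketch follows; the function names are hypothetical, while the APIService name comes from the kubectl create output.

```shell
# Hypothetical helper: print the status of the "Available" condition of the
# Metrics API registration ("True" once the aggregation layer can reach it).
metrics_apiservice_status() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Hypothetical helper: succeed only when the APIService is available.
metrics_api_ready() {
  [ "$(metrics_apiservice_status 2>/dev/null)" = "True" ]
}

# Example usage against a cluster (left as a comment so the block runs anywhere):
#   metrics_api_ready && echo "Metrics API is available"
```

If the condition stays False, kubectl describe apiservice v1beta1.metrics.k8s.io shows the reason and message reported by the aggregation layer.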
[root@host131 1.8+]# kubectl top pod
[root@host131 1.8+]# kubectl top pod metrics-server-5cc8d5c4df-ms2ts -n kube-system
NAME CPU(cores) MEMORY(bytes)
metrics-server-5cc8d5c4df-ms2ts 1m 12Mi
[root@host131 1.8+]#
[root@host131 1.8+]# kubectl top node
error: metrics not available yet
[root@host131 1.8+]#
After a short wait, node resource usage is shown:
[root@host131 1.8+]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
192.168.163.131 96m 9% 2534Mi 65%
[root@host131 1.8+]#
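The wait between "metrics not available yet" and the first successful result can be automated with a small retry loop. This is a sketch only: the function name wait_for_node_metrics and the default of 12 attempts at 5-second intervals are assumptions, not values from Metrics Server itself.

```shell
# Hypothetical helper: retry `kubectl top node` until it succeeds.
# $1 = maximum number of attempts (default 12)
# $2 = seconds to sleep between attempts (default 5)
wait_for_node_metrics() {
  attempts="${1:-12}"
  interval="${2:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if kubectl top node >/dev/null 2>&1; then
      echo "node metrics available after $i attempt(s)"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "node metrics still unavailable after $attempts attempt(s)" >&2
  return 1
}

# Example usage against a cluster (left as a comment so the block runs anywhere):
#   wait_for_node_metrics && kubectl top node
```

A non-zero exit after all attempts suggests looking at the Metrics Server pod logs rather than waiting longer.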
Problems caused by API server permission settings:
https://liumiaocn.blog.csdn.net/article/details/104140697
Image pull problems:
https://liumiaocn.blog.csdn.net/article/details/104140713
https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/