Question:

How do I use HPA autoscaling with metrics-server in Kubernetes?

章哲彦
2023-03-14

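I am very interested in testing Kubernetes autoscaling on an Ubuntu installation. I have already used it in minikube with Heapster, but since Heapster is deprecated, I am trying to use metrics-server instead. On my Ubuntu cluster I have installed metrics-server, as shown below: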

kube-system      kube-apiserver-kmaster                  1/1     Running   1          11d
kube-system      kube-controller-manager-kmaster         1/1     Running   1          11d
kube-system      kube-proxy-47k6b                        1/1     Running   0          11d
kube-system      kube-proxy-q8zdw                        1/1     Running   1          11d
kube-system      kube-scheduler-kmaster                  1/1     Running   1          11d
kube-system      kubernetes-dashboard-5f7b999d65-6wl6k   1/1     Running   1          11d
kube-system      metrics-server-548456b4cd-wxc9b         1/1     Running   0          3d18h
metallb-system   controller-cd8657667-ckpn6              1/1     Running   0          8d
metallb-system   speaker-m9599   

But when I check the HPA, I always see the following:

kubectl get hpa

NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
api-server   Deployment/api-server   <unknown>/50%   1         10        3          3d19h
ngsc         Deployment/ngsc         <unknown>/50%   1         10        3          3d19h

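It seems that metrics-server is not being used to compute utilization.

I went through the Kubernetes documentation site and really could not work out how to configure metrics-server so that Kubernetes can autoscale on utilization.

Here is the description of the autoscaler: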

Name:                                                  api-server
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Fri, 03 May 2019 05:49:07 +0000
Reference:                                             Deployment/api-server
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 50%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       3 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
  Type     Reason                   Age                        From                       Message
  ----     ------                   ----                       ----                       -------
  Warning  FailedGetResourceMetric  4m48s (x22069 over 3d20h)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API

Describing the deployment:

Pod Template:
  Labels:  app=api-server
  Containers:
   api-server:
    Image:      xxxxxx
    Port:       <none>
    Host Port:  <none>
    Limits:
      cpu:  500m
    Requests:
      cpu:        200m
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>

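This shows that the deployment has resource requests configured, but the HPA still shows <unknown>.

After adding memory, describe now shows: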

 Limits:
      cpu:     500m
      memory:  1Gi
    Requests:
      cpu:        500m
      memory:     512Mi

But kubectl get hpa still shows <unknown>.

Checking the metrics-server logs:

 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:kmaster: unable to fetch metrics from Kubelet kmaster (kmaster): Get https://kmaster:10250/stats/summary/: dial tcp: lookup kmaster on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:knode: unable to fetch metrics from Kubelet knode (knode): Get https://knode:10250/stats/summary/: dial tcp: lookup knode on 10.96.0.10:53: no such host]
E0507 05:20:23.797590       1 reststorage.go:148] unable to fetch pod metrics for pod default/api-server-777b78ccf5-mlt94: no metrics known for pod
E0507 05:20:23.797614       1 reststorage.go:148] unable to fetch pod metrics for pod default/api-server-777b78ccf5-r66bw: no metrics known for pod

And when I run

curl -k https://knode:10250/stats/summary/

I get the following error:

Unauthorized

3 answers

沃楷
2023-03-14

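This means the pods do not have any CPU resources assigned to them. Without resource requests, the HPA cannot make scaling decisions. Try adding some resources to the pod, like this: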

spec:
  containers:
  - resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
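The reason requests matter is that a utilization target such as 50% is measured against the container's CPU request, as the "as a percentage of request" wording in the describe output indicates. A minimal Python sketch of that arithmetic follows; the function name and the sample numbers are hypothetical and only illustrate the idea, they are not Kubernetes code.

```python
from typing import Optional


def cpu_utilization_percent(usage_millicores: int,
                            request_millicores: Optional[int]) -> Optional[float]:
    """Return CPU usage as a percentage of the request.

    Returns None when no request is set, because a percentage of
    "nothing" is undefined; this mirrors why the HPA cannot act on a
    utilization target for containers without a CPU request.
    """
    if not request_millicores:
        return None
    return 100.0 * usage_millicores / request_millicores


# Hypothetical usage figures, using the 250m request from the manifest above.
print(cpu_utilization_percent(125, 250))   # usage is half of the request
print(cpu_utilization_percent(125, None))  # no request, so no utilization
```

With the 250m request from the manifest above, a pod using 125m would sit exactly at a 50% target.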

干稳
2023-03-14

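Sometimes the HPA does not show values. Make sure the metrics-server pod is running in the kube-system namespace.

In my case, the HPA sometimes starts showing values only once traffic reaches the site.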

须景胜
2023-03-14

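Based on the information you provided:

Since you have the pod metrics-server-548456b4cd-wxc9b, metrics-server is enabled. Also, since you have 3 replicas, I assume that number was set in the deployment manifest.

The HPA may be unable to scale your deployment for the following reasons:

1) Lack of resources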

$ kubectl describe node
...
 Namespace                  Name                                 CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                 ------------  ----------  ---------------  -------------  ---
  default                    nginx-deployment-5ffb677f99-k5mdj    200m (10%)    500m (25%)  0 (0%)           0 (0%)         6m55s
  default                    nginx-deployment-5ffb677f99-n7t7n    200m (10%)    500m (25%)  0 (0%)           0 (0%)         6m55s
  default                    nginx-deployment-5ffb677f99-pw2g7    200m (10%)    500m (25%)  0 (0%)           0 (0%)         6m55s
  kube-system                etcd-minikube                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         152m
  kube-system                kube-addon-manager-minikube          5m (0%)       0 (0%)      50Mi (0%)        0 (0%)         152m
  kube-system                kube-apiserver-minikube              250m (12%)    0 (0%)      0 (0%)           0 (0%)         152m
  kube-system                kube-controller-manager-minikube     200m (10%)    0 (0%)      0 (0%)           0 (0%)         152m
  kube-system                kube-dns-6bfbdd666c-l74lx            260m (13%)    0 (0%)      110Mi (1%)       170Mi (2%)     32m
  kube-system                kube-proxy-dnh4m                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         153m
  kube-system                kube-scheduler-minikube              100m (5%)     0 (0%)      0 (0%)           0 (0%)         152m
  kube-system                metrics-server-77fddcc57b-mjlf5      0 (0%)        0 (0%)      0 (0%)           0 (0%)         147m
  kube-system                storage-provisioner                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         153m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                1415m (70%)  1500m (75%)
  memory             160Mi (2%)   170Mi (2%)
  ephemeral-storage  0 (0%)       0 (0%)

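As you can see in the example, the minikube system pods and the 3 nginx pods have already requested 70% of the CPU. In your manifest each container requests cpu: 200m, so this deployment can create only 2 more pods. Because of the lack of CPU resources, any further pods will stay in the Pending state.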
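The "only 2 more pods" figure can be checked with a little arithmetic. The Python sketch below assumes the node has 2000m of allocatable CPU, which is inferred from 1415m being reported as 70% and is not stated in the output; the function name is hypothetical. It looks at CPU requests only, whereas the real scheduler also considers memory and other constraints.

```python
def additional_pods_that_fit(allocatable_m: int,
                             requested_m: int,
                             per_pod_request_m: int) -> int:
    """Count how many more pods fit on a node, judged by CPU requests alone."""
    free_m = max(allocatable_m - requested_m, 0)  # unrequested CPU in millicores
    return free_m // per_pod_request_m            # whole pods only


# Assumed 2000m allocatable, 1415m already requested, 200m per new pod.
print(additional_pods_that_fit(2000, 1415, 200))
```

Here 585m remain unrequested, which is room for two pods at 200m each; a third would not fit.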

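2) Insufficient CPU load

An error message such as "the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API" means that metrics-server has not received any metrics, so the pods are not generating any load.

I assume you set up autoscaling for the deployment with the command: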

$ kubectl autoscale deployment api-server --cpu-percent=50 --min=1 --max=10
...
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
  Type     Reason                        Age   From                       Message
  ----     ------                        ----  ----                       -------
  Warning  FailedGetResourceMetric       9s    horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  9s    horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API

Try generating some CPU load by opening a shell in one of the deployment's pods:

$ kubectl exec -ti <yourPodName> -- sh

$ while true; do echo 'IncreaseLoad'; done
IncreaseLoad
IncreaseLoad
IncreaseLoad
...

You can also use the stress command.

After a while, the HPA should receive metrics and show an actual value instead of <unknown>.

Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type     Reason                        Age                From                       Message
  ----     ------                        ----               ----                       -------
  Warning  FailedGetResourceMetric       14m (x6 over 16m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  14m (x6 over 16m)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Normal   SuccessfulRescale             6m54s              horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
  Normal   SuccessfulRescale             50s                horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
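The two SuccessfulRescale events are consistent with the replica formula given in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The Python sketch below is a simplified illustration of that formula, not the controller's actual code. The utilization figures are hypothetical, since the output above does not show them, and the function name is made up for this example.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Simplified version of the documented HPA replica calculation."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the --min / --max values given to kubectl autoscale.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical utilization values against the 50% target:
print(desired_replicas(3, 20, 50))   # idle pods: scale down from 3
print(desired_replicas(2, 100, 50))  # under load: scale up from 2
print(desired_replicas(3, 52, 50))   # close to target: no change
```

Under these assumed numbers, 3 idle replicas shrink to 2, and 2 loaded replicas grow to 4, matching the order of the events above.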

If this does not help, please provide your HPA and deployment manifests.
