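Argo Workflows is a cloud-native workflow engine that is widely used for CI/CD and infrastructure automation.
The official Argo Workflows documentation is generally enough to get through the getting-started experience. However, some of the images Argo Workflows depends on cannot be pulled normally from networks inside mainland China. This article records the pitfalls I hit during installation and first use, so that others who run into the same problems can resolve them quickly.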
1 Prerequisites
To install Argo Workflows you need a Kubernetes (k8s) environment. A full cluster works; if you only want to practice locally, consider installing minikube.
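2 Installation steps
2.1 Create the namespace
Kubernetes uses namespaces to isolate resources and services and make them easier to manage. We deploy everything related to Argo Workflows into the argo namespace, which simplifies later operation, lookup, and maintenance.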
kubectl create ns argo
2.2 Create the Argo Workflows base services
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo-workflows/master/manifests/quick-start-postgres.yaml
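Downloading this YAML file may time out. If your network is unreliable (for example, behind the firewall in mainland China), I suggest downloading the file first and saving it locally as quick-start-postgres.yaml,
then creating the base services from the local file: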
kubectl apply -n argo -f quick-start-postgres.yaml
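The download-then-apply approach can be scripted roughly as follows. This is only a sketch: the retry count and timeout are arbitrary values I picked, and the run helper and DRY_RUN variable are hypothetical conveniences so that the script prints the commands by default instead of executing them.

```shell
#!/bin/sh
# Sketch: download the quick-start manifest with retries, then apply the local copy.
# MANIFEST_URL is the URL used in this article; retry and timeout values are assumptions.
MANIFEST_URL="https://raw.githubusercontent.com/argoproj/argo-workflows/master/manifests/quick-start-postgres.yaml"
MANIFEST_FILE="quick-start-postgres.yaml"

# Print a command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

install_argo() {
  # -f: fail on HTTP errors, -L: follow redirects, --retry: retry transient failures
  run curl -fL --retry 5 --connect-timeout 10 -o "$MANIFEST_FILE" "$MANIFEST_URL" &&
    run kubectl apply -n argo -f "$MANIFEST_FILE"
}

install_argo
```

Run it once as-is to review the commands, then again with DRY_RUN=0 to actually download and apply.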
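Applying this manifest deploys four main Argo Workflows base services:
argo-server (the main Argo Workflows service)
workflow-controller (the controller that Argo Workflows relies on)
postgres (PostgreSQL, the metadata store for Argo Workflows)
minio (the artifact repository for Argo Workflows)
For users behind the firewall, startup will be slow and may fail. Use the following command to check the status of the base services.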
[root@localhost ~]# kubectl get pods -n argo
NAME                                   READY   STATUS             RESTARTS   AGE
argo-server-5854fd8bf9-mh4zp           0/1     Error              0          13d
minio-79b96ccfb8-w6nvk                 0/1     Error              5          13d
postgres-6b5c55f477-w7vxm              1/1     Running            1          13d
workflow-controller-66fd66b857-slv9g   0/1     ImagePullBackOff   2          13d
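When the list gets longer, it can help to filter out only the pods that need attention. The following is a minimal sketch; unhealthy_pods is a hypothetical helper name, and it assumes the default column layout of kubectl get pods (NAME, READY, STATUS in the first three columns).

```shell
#!/bin/sh
# Read `kubectl get pods` output on stdin and print "NAME STATUS" for every pod
# that is not healthy: any status other than Running/Completed, or a Running pod
# whose READY column (e.g. 0/1) shows fewer ready containers than expected.
unhealthy_pods() {
  awk 'NR > 1 {
    if ($3 == "Completed") next          # finished pods are fine
    split($2, ready, "/")                # READY looks like "ready/total"
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $3
  }'
}
```

Usage: kubectl get pods -n argo | unhealthy_pods. For the listing above it would print the argo-server, minio, and workflow-controller pods and skip postgres.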
If a pod's status looks wrong, run kubectl describe pod to see its details. In my case, argo-server was in the Error state.
[root@localhost ~]# kubectl describe pod argo-server-5854fd8bf9-mh4zp -n argo
~~~~~ Log start ~~~~~
Name:           argo-server-5854fd8bf9-mh4zp
Namespace:      argo
Priority:       0
Node:           localhost.localdomain/192.168.126.129
Start Time:     Sun, 24 Apr 2022 21:31:07 -0400
Labels:         app=argo-server
                pod-template-hash=5854fd8bf9
Status:         Running
Controlled By:  ReplicaSet/argo-server-5854fd8bf9
Containers:
  argo-server:
    Container ID:   docker://4ab4dd785ad3ec7804d601ca629a4cf1334ff7c74b5f648ba02717b4438be5f7
    Image:          quay.io/argoproj/argocli:latest
    Image ID:       docker-pullable://quay.io/argoproj/argocli@sha256:7d1bfcc03c8ee2d12e4e4a18c89d9f59975a411f2054d4c49c28748bbafee5a3
    Port:           2746/TCP
    Host Port:      0/TCP
    State:          Terminated
      Reason:       Error
      Exit Code:    255
    Ready:          False
    Restart Count:  0
    Readiness:      http-get https://:2746/ delay=10s timeout=1s period=20s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from argo-server-token-5828x (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  argo-server-token-5828x:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  argo-server-token-5828x
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                    From     Message
  ----     ------          ----                   ----     -------
  Warning  Unhealthy       5d18h (x2 over 6d19h)  kubelet  Readiness probe failed: Get "https://172.17.0.13:2746/": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Warning  FailedMount     7m11s                  kubelet  MountVolume.SetUp failed for volume "argo-server-token-5828x" : failed to sync secret cache: timed out waiting for the condition
  Normal   SandboxChanged  7m10s                  kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling         6m29s                  kubelet  Pulling image "quay.io/argoproj/argocli:latest"
~~~~~ Log end ~~~~~
The last event shows that the image is still being pulled, which is why the pod is in a failed state.
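Before fixing image pulls one pod at a time, it is useful to know every image the manifest needs. Here is a rough sketch; list_images is a hypothetical helper name, and it only looks at image: fields, so an image passed as a command-line argument to a container would not be caught.

```shell
#!/bin/sh
# Print the distinct container images referenced by a Kubernetes manifest.
# Matches both "image: foo" and "- image: foo" lines, strips optional quotes,
# and ignores similar keys such as imagePullPolicy.
list_images() {
  sed -n 's/^[[:space:]-]*image:[[:space:]]*//p' "$1" | tr -d "\"'" | sort -u
}
```

Usage: list_images quick-start-postgres.yaml. Each printed image is one that the cluster nodes must be able to pull.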
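Images hosted on quay.io and Google registries cannot be reached directly from inside mainland China. You can work around this with either a proxy/VPN or a domestic mirror site. I had no VPN, so I chose a domestic mirror, specifically the USTC (University of Science and Technology of China) mirror.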
docker pull quay.io/argoproj/argocli:latest => docker pull docker.mirrors.ustc.edu.cn/library/argocli:latest
Once all the images have been downloaded, the services start up.
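One detail worth spelling out: an image pulled under the mirror's name is not automatically usable under the name the pod asks for, so it usually needs to be retagged. The sketch below shows that step under a few assumptions: the mirror reference is the one quoted in this article and may need to be replaced with whatever mirror actually serves the image for you, and the run helper and DRY_RUN variable are hypothetical conveniences that make the script print commands by default.

```shell
#!/bin/sh
# Sketch: pull an image through a mirror, then tag it with the name the pod expects.
# MIRROR_IMAGE is taken from this article; treat it as an assumption and adjust it.
SOURCE_IMAGE="quay.io/argoproj/argocli:latest"
MIRROR_IMAGE="docker.mirrors.ustc.edu.cn/library/argocli:latest"

# Print a command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

# $1: image name the pod references, $2: where to actually pull it from
pull_via_mirror() {
  run docker pull "$2" &&
    run docker tag "$2" "$1"
}

pull_via_mirror "$SOURCE_IMAGE" "$MIRROR_IMAGE"
```

The pull has to happen on the node that runs the pod (for minikube, inside minikube's own Docker daemon). Also note that for a :latest tag the default imagePullPolicy is Always, so the kubelet may still try to contact the original registry even when a local copy exists.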
2.3 Run hello world to verify the services
Just follow the official documentation for this step.
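For reference, the step looks roughly like this. The manifest below is a minimal sketch modeled on the upstream hello-world example, not a copy of it; the file name hello-world.yaml, the template name, and the submit_hello helper are my own choices, and the argo CLI is assumed to be installed already.

```shell
#!/bin/sh
# Write a minimal Workflow manifest that runs one container and prints a message.
cat > hello-world.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
spec:
  entrypoint: hello
  templates:
    - name: hello
      container:
        image: busybox
        command: [echo]
        args: ["hello world"]
EOF

# Submit the workflow to the argo namespace and watch it until it finishes.
# Not called automatically; run `submit_hello` once the argo CLI is available.
submit_hello() {
  argo submit -n argo --watch hello-world.yaml
}
```

After calling submit_hello, argo list -n argo shows the workflow, and argo logs -n argo @latest prints its output. If the workflow stays pending, check whether the busybox image can be pulled, as with the base services.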
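2.4 Expose the Argo service through an Ingress
The official documentation appears to be inaccurate here.
I exposed the service through the NGINX Ingress controller.
Two points to stress:
1 The namespace must be argo
2 The backend protocol must be https, because argo-server itself serves HTTPS on port 2746 (note the https readiness probe in the describe output above)
The full manifest: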
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argo-ingress
  namespace: argo
  annotations:
    ingress.kubernetes.io/rewrite-target: /
    ingress.kubernetes.io/protocol: https # Traefik
    nginx.ingress.kubernetes.io/backend-protocol: https # ingress-nginx
spec:
  defaultBackend:
    service:
      name: argo-server
      port:
        number: 2746
  rules:
    - http:
        paths:
          - backend:
              service:
                name: argo-server
                port:
                  number: 2746
            path: /
            pathType: Prefix
Once the Ingress has been created, Argo Workflows can be reached over HTTP from outside the host machine.