Environment for this experiment:
KubeMaster 10.4.7.20
KubeNode 10.4.7.21
(1) 8 virtual processors with 2 cores each, and 16 GB of RAM in total
(2) Confirm that the root logical volume (centos_kubeflowmaster-root) has more than 100 GB of disk space, which is sufficient:
[root@KubeflowMaster ~]# fdisk -l
Disk /dev/sda: 161.1 GB, 161061273600 bytes, 314572800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0005254b
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     2099199     1048576   83  Linux
/dev/sda2          2099200   314572799   156236800   8e  Linux LVM
Disk /dev/mapper/centos_kubeflowmaster-root: 139.6 GB, 139594825728 bytes, 272646144 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/centos_kubeflowmaster-swap: 9059 MB, 9059696640 bytes, 17694720 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/centos_kubeflowmaster-home: 11.3 GB, 11324620800 bytes, 22118400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
(3) The cluster runs Kubernetes 1.18, which pairs with Kubeflow v1.0.2; if unsure, check the official compatibility documentation.
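A quick version sanity check before proceeding (kubectl and kubeadm are assumed to already be on the PATH from the cluster setup):
# confirm the cluster is on 1.18 before picking the Kubeflow release
kubectl version --short
kubeadm version -o short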
(1) Download the kfctl binary for v1.0.2, kfctl_v1.0.2-0-ga476281_linux.tar.gz, from https://github.com/kubeflow/kfctl/releases/
Unpack the archive and add the binary to the executable path:
tar -xvf kfctl_v1.0.2-0-ga476281_linux.tar.gz
sudo cp kfctl /usr/bin
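To confirm the binary is on the PATH and runs:
kfctl version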
(2) Because the production server's access to the public internet is restricted, download the manifests package on a personal machine and place it under /root:
wget https://github.com/kubeflow/manifests/archive/v1.0.2.tar.gz
(3) Copy the YAML file from https://github.com/kubeflow/manifests/blob/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.2.yaml to the local /root path.
Edit the end of the YAML file so that manifests are read not from the remote repo but from the local path holding v1.0.2.tar.gz:
repos:
- name: manifests
  uri: file:/root/v1.0.2.tar.gz
  version: v1.0.2
status:
  reposCache:
  - localPath: '"/root/.cache/manifests/manifests-1.0.2"'
    name: manifests
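A quick sanity check (assuming the paths above) that the file: URI points at a real archive:
ls -lh /root/v1.0.2.tar.gz
tar -tzf /root/v1.0.2.tar.gz | head -n 3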
(4) Manually pull the images Kubeflow needs and tag them properly; they must be pulled on both machines.
docker pull istio/sidecar_injector:1.1.6
docker pull istio/proxyv2:1.1.6
docker pull istio/proxy_init:1.1.6
docker pull istio/pilot:1.1.6
docker pull istio/mixer:1.1.6
docker pull istio/galley:1.1.6
docker pull istio/citadel:1.1.6
docker pull sw1136562366/viewer-crd-controller:0.2.5
docker pull sw1136562366/api-server:0.2.5
docker pull sw1136562366/frontend:0.2.5
docker pull sw1136562366/visualization-server:0.2.5
docker pull sw1136562366/scheduledworkflow:0.2.5
docker pull sw1136562366/persistenceagent:0.2.5
docker pull sw1136562366/envoy:metadata-grpc
docker pull sw1136562366/profile-controller:v1.0.0-ge50a8531
docker pull sw1136562366/notebook-controller:v1.0.0-gcd65ce25
docker pull sw1136562366/katib-ui:v0.8.0
docker pull sw1136562366/katib-controller:v0.8.0
docker pull sw1136562366/katib-db-manager:v0.8.0
docker pull sw1136562366/jupyter-web-app:v1.0.0-g2bd63238
docker pull sw1136562366/centraldashboard:v1.0.0-g3ec0de71
docker pull sw1136562366/tf_operator:v1.0.0-g92389064
docker pull sw1136562366/pytorch-operator:v1.0.0-g047cf0f
docker pull sw1136562366/kfam:v1.0.0-gf3e09203
docker pull sw1136562366/admission-webhook:v1.0.0-gaf96e4e3
docker pull sw1136562366/metadata:v0.1.11
docker pull sw1136562366/metadata-frontend:v0.1.8
docker pull sw1136562366/application:1.0-beta
docker pull sw1136562366/ingress-setup:latest
docker pull sw1136562366/activator:latest
docker pull sw1136562366/webhook:latest
docker pull sw1136562366/controller:latest
docker pull sw1136562366/istio:latest
docker pull sw1136562366/autoscaler-hpa:latest
docker pull sw1136562366/autoscaler:latest
docker pull sw1136562366/kfserving-controller:0.2.2
docker pull sw1136562366/ml_metadata_store_server:v0.21.1
docker pull sw1136562366/spark-operator:v1beta2-1.0.0-2.4.4
docker pull sw1136562366/kube-rbac-proxy:v0.4.0
docker pull sw1136562366/spartakus-amd64:v1.1.0
docker pull argoproj/workflow-controller:v2.3.0
docker pull argoproj/argoui:v2.3.0
docker pull mysql:5.6
docker pull minio/minio:RELEASE.2018-02-09T22-40-05Z
docker pull mysql:8.0.3
docker pull mysql:8
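If pulling these one by one is tedious, a small loop works as well; images.txt here is a hypothetical file holding the list above, one repo:tag per line:
# pull every image listed in images.txt, logging any failures
while read -r img; do
  docker pull "$img" || echo "failed: $img" >> pull-failures.txt
done < images.txt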
# tag ml-pipeline images
docker tag docker.io/sw1136562366/viewer-crd-controller:0.2.5 gcr.io/ml-pipeline/viewer-crd-controller:0.2.5
docker tag docker.io/sw1136562366/api-server:0.2.5 gcr.io/ml-pipeline/api-server:0.2.5
docker tag docker.io/sw1136562366/frontend:0.2.5 gcr.io/ml-pipeline/frontend:0.2.5
docker tag docker.io/sw1136562366/visualization-server:0.2.5 gcr.io/ml-pipeline/visualization-server:0.2.5
docker tag docker.io/sw1136562366/scheduledworkflow:0.2.5 gcr.io/ml-pipeline/scheduledworkflow:0.2.5
docker tag docker.io/sw1136562366/persistenceagent:0.2.5 gcr.io/ml-pipeline/persistenceagent:0.2.5
docker tag docker.io/sw1136562366/envoy:metadata-grpc gcr.io/ml-pipeline/envoy:metadata-grpc
# tag kubeflow-images-public images
docker tag docker.io/sw1136562366/profile-controller:v1.0.0-ge50a8531 gcr.io/kubeflow-images-public/profile-controller:v1.0.0-ge50a8531
docker tag docker.io/sw1136562366/notebook-controller:v1.0.0-gcd65ce25 gcr.io/kubeflow-images-public/notebook-controller:v1.0.0-gcd65ce25
docker tag docker.io/sw1136562366/katib-ui:v0.8.0 gcr.io/kubeflow-images-public/katib/v1alpha3/katib-ui:v0.8.0
docker tag docker.io/sw1136562366/katib-controller:v0.8.0 gcr.io/kubeflow-images-public/katib/v1alpha3/katib-controller:v0.8.0
docker tag docker.io/sw1136562366/katib-db-manager:v0.8.0 gcr.io/kubeflow-images-public/katib/v1alpha3/katib-db-manager:v0.8.0
docker tag docker.io/sw1136562366/jupyter-web-app:v1.0.0-g2bd63238 gcr.io/kubeflow-images-public/jupyter-web-app:v1.0.0-g2bd63238
docker tag docker.io/sw1136562366/centraldashboard:v1.0.0-g3ec0de71 gcr.io/kubeflow-images-public/centraldashboard:v1.0.0-g3ec0de71
docker tag docker.io/sw1136562366/tf_operator:v1.0.0-g92389064 gcr.io/kubeflow-images-public/tf_operator:v1.0.0-g92389064
docker tag docker.io/sw1136562366/pytorch-operator:v1.0.0-g047cf0f gcr.io/kubeflow-images-public/pytorch-operator:v1.0.0-g047cf0f
docker tag docker.io/sw1136562366/kfam:v1.0.0-gf3e09203 gcr.io/kubeflow-images-public/kfam:v1.0.0-gf3e09203
docker tag docker.io/sw1136562366/admission-webhook:v1.0.0-gaf96e4e3 gcr.io/kubeflow-images-public/admission-webhook:v1.0.0-gaf96e4e3
docker tag docker.io/sw1136562366/metadata:v0.1.11 gcr.io/kubeflow-images-public/metadata:v0.1.11
docker tag docker.io/sw1136562366/metadata-frontend:v0.1.8 gcr.io/kubeflow-images-public/metadata-frontend:v0.1.8
docker tag docker.io/sw1136562366/application:1.0-beta gcr.io/kubeflow-images-public/kubernetes-sigs/application:1.0-beta
docker tag docker.io/sw1136562366/ingress-setup:latest gcr.io/kubeflow-images-public/ingress-setup:latest
# tag knative-releases images
docker tag docker.io/sw1136562366/activator:latest gcr.io/knative-releases/knative.dev/serving/cmd/activator:latest
docker tag docker.io/sw1136562366/webhook:latest gcr.io/knative-releases/knative.dev/serving/cmd/webhook:latest
docker tag docker.io/sw1136562366/controller:latest gcr.io/knative-releases/knative.dev/serving/cmd/controller:latest
docker tag docker.io/sw1136562366/istio:latest gcr.io/knative-releases/knative.dev/serving/cmd/networking/istio:latest
docker tag docker.io/sw1136562366/autoscaler-hpa:latest gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:latest
docker tag docker.io/sw1136562366/autoscaler:latest gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:latest
# tag the remaining gcr.io images (kfserving, tfx-oss-public, spark-operator, kubebuilder, google_containers)
docker tag docker.io/sw1136562366/kfserving-controller:0.2.2 gcr.io/kfserving/kfserving-controller:0.2.2
docker tag docker.io/sw1136562366/ml_metadata_store_server:v0.21.1 gcr.io/tfx-oss-public/ml_metadata_store_server:v0.21.1
docker tag docker.io/sw1136562366/spark-operator:v1beta2-1.0.0-2.4.4 gcr.io/spark-operator/spark-operator:v1beta2-1.0.0-2.4.4
docker tag docker.io/sw1136562366/kube-rbac-proxy:v0.4.0 gcr.io/kubebuilder/kube-rbac-proxy:v0.4.0
docker tag docker.io/sw1136562366/spartakus-amd64:v1.1.0 gcr.io/google_containers/spartakus-amd64:v1.1.0
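Once tagging is done, a quick check on each machine that the retagged images are present:
docker images | grep gcr.io | sort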
(5) On the master, set up the environment and deploy Kubeflow
export BASE_DIR=/data/
export KF_NAME=my-kubeflow
export KF_DIR=${BASE_DIR}/${KF_NAME}
export CONFIG_FILE="/root/kfctl_k8s_istio.v1.0.2.yaml"
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl build -V -f ${CONFIG_FILE}
kfctl apply -V -f ${CONFIG_FILE}
If warnings are printed, just keep waiting.
If errors are printed, it is usually because the cert-manager and istio-system images take time to come up; keep waiting.
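While waiting, progress can be watched with:
# watch pods come up across all namespaces (Ctrl-C to stop)
kubectl get pods --all-namespaces -w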
(6) If kfctl apply stops with an error:
Edit /etc/kubernetes/manifests/kube-apiserver.yaml,
adding --enable-aggregator-routing=true to the kube-apiserver command.
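A sketch of where the flag lands in the static pod manifest (existing flags omitted); kubelet restarts the apiserver automatically once the file is saved:
# /etc/kubernetes/manifests/kube-apiserver.yaml (fragment)
spec:
  containers:
  - command:
    - kube-apiserver
    - --enable-aggregator-routing=true   # add this line alongside the existing flags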
Re-run kfctl apply -V -f ${CONFIG_FILE}
Once the logs finish printing normally, the manifests have been applied; next, wait for the pods to come up.
However, only some of these pods reach Running, so the following changes are needed.
(7) Adjust the pod configuration
xxx stands for the name of the corresponding Deployment or StatefulSet. There are several, so they have to be edited one by one; they are generally the ones listed below that sit in ImagePullBackOff. After editing, delete the corresponding pods.
# edit commands
kubectl edit deploy -n kubeflow xxx
kubectl edit sts -n kubeflow xxx
# delete command; the deleted pod is recreated automatically
kubectl delete pods -n kubeflow xxx
In each one, change imagePullPolicy: Always to imagePullPolicy: IfNotPresent (a non-interactive patch sketch follows the list below).
sts) admission-webhook-bootstrap-stateful-set
sts) kfserving-controller-manager
deploy) jupyter-web-app-deployment
deploy) ml-pipeline-viewer-controller-deployment
deploy) notebook-controller-deployment
deploy) profiles-deployment
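The same change can be made non-interactively; a sketch using one of the Deployments above (it assumes the container to fix is the first in the pod spec, so adjust the index per workload):
kubectl patch deploy -n kubeflow jupyter-web-app-deployment --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/containers/0/imagePullPolicy","value":"IfNotPresent"}]'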
(8) Pods stuck in Pending are waiting on the databases, which cannot start because no PersistentVolumes exist yet. Create the PVs, backed by NFS: enable the NFS service and create the shared directories first (a setup sketch follows).
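A minimal NFS setup sketch on the master (10.4.7.20); the package and service names assume CentOS 7, and the export options are permissive, suitable only for a lab:
yum install -y nfs-utils        # the node needs nfs-utils too, to mount the shares
mkdir -p /nfs/data/v1 /nfs/data/v2 /nfs/data/v3 /nfs/data/v4
echo '/nfs/data *(rw,sync,no_root_squash)' >> /etc/exports
systemctl enable --now rpcbind nfs-server
exportfs -arv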
# check the PVCs
[root@KubeflowMaster ~]# kubectl get pvc --all-namespaces -o wide
NAMESPACE   NAME             STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
kubeflow    katib-mysql
kubeflow    metadata-mysql
kubeflow    minio-pv-claim
kubeflow    mysql-pv-claim
# create the PVs
[root@KubeflowMaster ~]# vi pv-damo.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv001
  labels:
    name: pv001
spec:
  nfs:
    path: /nfs/data/v1
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 15Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv002
  labels:
    name: pv002
spec:
  nfs:
    path: /nfs/data/v2
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 15Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv003
  labels:
    name: pv003
spec:
  nfs:
    path: /nfs/data/v3
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 25Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv004
  labels:
    name: pv004
spec:
  nfs:
    path: /nfs/data/v4
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 25Gi
# apply the file
kubectl apply -f pv-damo.yaml
# verify each PVC is bound to a PV
[root@KubeflowMaster ~]# kubectl get pvc --all-namespaces -o wide
NAMESPACE   NAME             STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
kubeflow    katib-mysql      Bound    pv002    15Gi       RWO                           23h   Filesystem
kubeflow    metadata-mysql   Bound    pv001    15Gi       RWO,RWX                       23h   Filesystem
kubeflow    minio-pv-claim   Bound    pv004    25Gi       RWO,RWX                       23h   Filesystem
kubeflow    mysql-pv-claim   Bound    pv003    25Gi       RWO,RWX                       23h   Filesystem
At this point the database-related pods move to ContainerCreating; wait for them to reach Running.
If ContainerCreating turns into CrashLoopBackOff, delete the affected pod so it is recreated; in this build the pods came up Running after being deleted.
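Before deleting, it helps to see why a pod crashed; a short diagnosis sketch, with <pod-name> as a placeholder:
# inspect events and the logs of the previous (crashed) container
kubectl describe pod -n kubeflow <pod-name>
kubectl logs -n kubeflow <pod-name> --previous
# then recreate it
kubectl delete pod -n kubeflow <pod-name>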
(9) Pods whose status is Completed need no handling; those pods are fine.
(10) Check pod status
[root@KubeflowMaster ~]# kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cert-manager cert-manager-bcc4f4644-92csg 1/1 Running 6 23h 10.244.1.19 kubeflownode <none> <none>
cert-manager cert-manager-cainjector-6c6cb7b87f-hsnbz 1/1 Running 29 23h 10.244.1.17 kubeflownode <none> <none>
cert-manager cert-manager-webhook-5c94b9f68b-7ws6z 1/1 Running 0 23h 10.244.1.24 kubeflownode <none> <none>
istio-system cluster-local-gateway-d7df4cdc8-l4kgb 1/1 Running 0 23h 10.244.1.15 kubeflownode <none> <none>
istio-system grafana-6875655fd8-t7zzf 1/1 Running 0 23h 10.244.1.4 kubeflownode <none> <none>
istio-system istio-citadel-7578b8f8f8-4s4cs 1/1 Running 0 23h 10.244.1.14 kubeflownode <none> <none>
istio-system istio-cleanup-secrets-1.1.6-tdrmb 0/1 Completed 0 23h 10.244.1.10 kubeflownode <none> <none>
istio-system istio-egressgateway-594f7cc859-lnzz7 1/1 Running 0 23h 10.244.1.3 kubeflownode <none> <none>
istio-system istio-galley-86b744bd79-k85ps 1/1 Running 0 23h 10.244.1.22 kubeflownode <none> <none>
istio-system istio-grafana-post-install-1.1.6-g6jb8 0/1 Completed 0 23h 10.244.1.12 kubeflownode <none> <none>
istio-system istio-ingressgateway-5dbccf544-cpbp6 1/1 Running 0 23h 10.244.1.5 kubeflownode <none> <none>
istio-system istio-pilot-7564c48698-hzrh5 2/2 Running 3 23h 10.244.1.6 kubeflownode <none> <none>
istio-system istio-policy-58988894cd-h7nc5 2/2 Running 12 23h 10.244.1.8 kubeflownode <none> <none>
istio-system istio-security-post-install-1.1.6-v8kr2 0/1 Completed 0 23h 10.244.1.11 kubeflownode <none> <none>
istio-system istio-sidecar-injector-867f97b975-9xv4g 1/1 Running 0 23h 10.244.1.23 kubeflownode <none> <none>
istio-system istio-telemetry-68dddbcd46-ts5kn 2/2 Running 22 23h 10.244.1.9 kubeflownode <none> <none>
istio-system istio-tracing-75fcff7c44-jcxc9 1/1 Running 1 23h 10.244.1.7 kubeflownode <none> <none>
istio-system kfserving-ingressgateway-66dbdc5b98-t7lvl 1/1 Running 0 23h 10.244.1.18 kubeflownode <none> <none>
istio-system kiali-79dbdbb6d-kltsg 1/1 Running 0 23h 10.244.1.13 kubeflownode <none> <none>
istio-system prometheus-5c958dc796-rjw64 1/1 Running 1 23h 10.244.1.21 kubeflownode <none> <none>
knative-serving activator-78c895d774-2rq7z 1/2 ImagePullBackOff 0 5h47m 10.244.1.57 kubeflownode <none> <none>
knative-serving autoscaler-7cfc48cf7-sxhvn 1/2 ImagePullBackOff 0 5h47m 10.244.1.58 kubeflownode <none> <none>
knative-serving autoscaler-hpa-867dddf-4gb8k 0/1 ImagePullBackOff 0 5h47m 10.244.1.52 kubeflownode <none> <none>
knative-serving controller-986646d76-l48bn 0/1 ImagePullBackOff 0 5h47m 10.244.1.61 kubeflownode <none> <none>
knative-serving networking-istio-79b5d9cf79-4cqwb 0/1 ImagePullBackOff 0 5h47m 10.244.1.49 kubeflownode <none> <none>
knative-serving webhook-5dbc56cccf-kdbnj 0/1 ImagePullBackOff 0 5h47m 10.244.1.64 kubeflownode <none> <none>
kube-system coredns-7ff77c879f-pcnl6 1/1 Running 0 7h42m 10.244.1.29 kubeflownode <none> <none>
kube-system coredns-7ff77c879f-znlhl 1/1 Running 0 7h42m 10.244.1.28 kubeflownode <none> <none>
kube-system etcd-kubeflowmaster 1/1 Running 2 26h 10.4.7.20 kubeflowmaster <none> <none>
kube-system kube-apiserver-kubeflowmaster 1/1 Running 0 5h49m 10.4.7.20 kubeflowmaster <none> <none>
kube-system kube-controller-manager-kubeflowmaster 1/1 Running 23 26h 10.4.7.20 kubeflowmaster <none> <none>
kube-system kube-flannel-ds-cvwrg 1/1 Running 0 26h 10.4.7.21 kubeflownode <none> <none>
kube-system kube-flannel-ds-xqdrf 1/1 Running 0 26h 10.4.7.20 kubeflowmaster <none> <none>
kube-system kube-proxy-bq9j7 1/1 Running 0 26h 10.4.7.21 kubeflownode <none> <none>
kube-system kube-proxy-n27fg 1/1 Running 0 26h 10.4.7.20 kubeflowmaster <none> <none>
kube-system kube-scheduler-kubeflowmaster 1/1 Running 19 26h 10.4.7.20 kubeflowmaster <none> <none>
kubeflow admission-webhook-bootstrap-stateful-set-0 1/1 Running 0 149m 10.244.1.75 kubeflownode <none> <none>
kubeflow admission-webhook-deployment-8545586776-d6nx9 1/1 Running 0 149m 10.244.1.76 kubeflownode <none> <none>
kubeflow application-controller-stateful-set-0 1/1 Running 0 139m 10.244.1.100 kubeflownode <none> <none>
kubeflow argo-ui-59f8d49b9-647pj 1/1 Running 0 7h45m 10.244.1.26 kubeflownode <none> <none>
kubeflow centraldashboard-686dc58fcf-n8n6n 1/1 Running 0 148m 10.244.1.78 kubeflownode <none> <none>
kubeflow jupyter-web-app-deployment-7cdfd95648-sb7jl 1/1 Running 0 137m 10.244.1.101 kubeflownode <none> <none>
kubeflow katib-controller-5c976769d8-7dcdd 1/1 Running 0 148m 10.244.1.80 kubeflownode <none> <none>
kubeflow katib-db-manager-bf77df6d6-lpmrr 1/1 Running 0 12m 10.244.1.118 kubeflownode <none> <none>
kubeflow katib-mysql-7db488768f-jztqg 1/1 Running 1 5h47m 10.244.1.113 kubeflownode <none> <none>
kubeflow katib-ui-6d7fbfffcb-wdhpz 1/1 Running 0 148m 10.244.1.82 kubeflownode <none> <none>
kubeflow kfserving-controller-manager-0 2/2 Running 1 127m 10.244.1.105 kubeflownode <none> <none>
kubeflow metacontroller-0 1/1 Running 0 7h45m 10.244.1.27 kubeflownode <none> <none>
kubeflow metadata-db-5d56786648-k74ms 1/1 Running 0 5h48m 10.244.1.111 kubeflownode <none> <none>
kubeflow metadata-deployment-5c7df888b9-28n8q 1/1 Running 6 99m 10.244.1.110 kubeflownode <none> <none>
kubeflow metadata-envoy-deployment-7cc78946c9-9k5k2 1/1 Running 0 147m 10.244.1.85 kubeflownode <none> <none>
kubeflow metadata-grpc-deployment-5c8545f76f-wxkvv 1/1 Running 29 147m 10.244.1.86 kubeflownode <none> <none>
kubeflow metadata-ui-665dff6f55-zb98f 1/1 Running 0 147m 10.244.1.87 kubeflownode <none> <none>
kubeflow minio-657c66cd9-q2pdw 1/1 Running 0 5h47m 10.244.1.114 kubeflownode <none> <none>
kubeflow ml-pipeline-669cdb6bdf-wt2hp 1/1 Running 0 12m 10.244.1.115 kubeflownode <none> <none>
kubeflow ml-pipeline-ml-pipeline-visualizationserver-777d4b4645-pzcgh 1/1 Running 0 147m 10.244.1.90 kubeflownode <none> <none>
kubeflow ml-pipeline-persistenceagent-56467f8856-5vnf4 1/1 Running 21 147m 10.244.1.89 kubeflownode <none> <none>
kubeflow ml-pipeline-scheduledworkflow-548b96d5fc-b6j5k 1/1 Running 0 11m 10.244.1.119 kubeflownode <none> <none>
kubeflow ml-pipeline-ui-6bd4778958-8v4x4 1/1 Running 0 147m 10.244.1.92 kubeflownode <none> <none>
kubeflow ml-pipeline-viewer-controller-deployment-bd64d97f9-cxh85 1/1 Running 0 125m 10.244.1.107 kubeflownode <none> <none>
kubeflow mysql-8558d86476-6hc24 1/1 Running 0 5h47m 10.244.1.112 kubeflownode <none> <none>
kubeflow notebook-controller-deployment-75bb4445c4-65n45 1/1 Running 0 123m 10.244.1.108 kubeflownode <none> <none>
kubeflow profiles-deployment-5f7dd7567f-5nk82 2/2 Running 0 121m 10.244.1.109 kubeflownode <none> <none>
kubeflow pytorch-operator-6bc9c99c5-z5dng 1/1 Running 1 12m 10.244.1.116 kubeflownode <none> <none>
kubeflow seldon-controller-manager-786775d4d9-lr2lg 1/1 Running 1 12m 10.244.1.117 kubeflownode <none> <none>
kubeflow spark-operatorsparkoperator-9c559c997-664l9 1/1 Running 0 146m 10.244.1.97 kubeflownode <none> <none>
kubeflow spartakus-volunteer-5978bf56f-kq2hc 1/1 Running 0 146m 10.244.1.98 kubeflownode <none> <none>
kubeflow tensorboard-9b4c44f45-7qlcv 1/1 Running 0 5h47m 10.244.1.51 kubeflownode <none> <none>
kubeflow tf-job-operator-5d7cc587c5-87nc6 1/1 Running 2 11m 10.244.1.120 kubeflownode <none> <none>
kubeflow workflow-controller-59ff5f7874-6fkgj 1/1 Running 0 7h45m 10.244.1.25 kubeflownode <none> <none>
(11) Start the Kubeflow UI
If everything above went through normally, use Kubernetes port forwarding to bring up the UI:
# keep the port-forward alive for up to one month (720h)
nohup kubectl port-forward -n istio-system svc/istio-ingressgateway 8088:80 --pod-running-timeout=720h --address=0.0.0.0 &
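A quick check sketch that the forward is actually listening before opening a browser:
ss -lntp | grep 8088
curl -sI http://127.0.0.1:8088 | head -n 1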
Then open the UI in a browser at http://hostip:8088, here http://10.4.7.20:8088
(12) If the UI cannot be reached or is very slow
Reference: https://www.cnblogs.com/xing901022/p/13455513.html
If the page still will not load after starting the UI, re-run the port forward:
export NAMESPACE=istio-system
kubectl port-forward --address 0.0.0.0 -n ${NAMESPACE} svc/istio-ingressgateway 8080:80
Then visit the target address, e.g. http://localhost:8080
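A common alternative sketch that avoids port-forwarding altogether: switch the ingress gateway Service to NodePort and browse to the node's mapped port (note this changes the live Service):
kubectl patch svc istio-ingressgateway -n istio-system -p '{"spec":{"type":"NodePort"}}'
kubectl get svc istio-ingressgateway -n istio-system   # find the NodePort mapped to 80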