
Notes on installing Kubeflow

应和光
2023-12-01

1. Install the k8s cluster with kubeadm

In this experiment:
KubeMaster 10.4.7.20
KubeNode 10.4.7.21

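2. Check the machine configuration

(1) 8 processors with 2 cores each, and 16 GB of memory in total
(2) Confirm that the root logical volume centos_kubeflowmaster-root has more than 100 GB of disk space: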

[root@KubeflowMaster ~]# fdisk -l

Disk /dev/sda: 161.1 GB, 161061273600 bytes, 314572800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0005254b

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     2099199     1048576   83  Linux
/dev/sda2         2099200   314572799   156236800   8e  Linux LVM

Disk /dev/mapper/centos_kubeflowmaster-root: 139.6 GB, 139594825728 bytes, 272646144 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/centos_kubeflowmaster-swap: 9059 MB, 9059696640 bytes, 17694720 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/centos_kubeflowmaster-home: 11.3 GB, 11324620800 bytes, 22118400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


(3) The cluster runs k8s 1.18, for which Kubeflow 1.0.2 can be installed; if in doubt, check the corresponding official documentation.
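The checks in this step can be scripted. The following is a minimal sketch rather than part of the original procedure: the helper names check_min and k8s_minor are hypothetical, and the thresholds in the usage comments simply mirror the machine described above, not any official minimum.

```shell
# check_min LABEL ACTUAL REQUIRED
# Prints a one-line verdict and returns non-zero when ACTUAL < REQUIRED.
check_min() {
  if [ "$2" -ge "$3" ]; then
    echo "$1: ok ($2 >= $3)"
  else
    echo "$1: too low ($2 < $3)"
    return 1
  fi
}

# k8s_minor VERSION
# Extracts the minor number from a version string such as v1.18.3.
k8s_minor() {
  echo "$1" | sed 's/^v\{0,1\}[0-9]*\.\([0-9]*\).*/\1/'
}

# Example usage on the master (values mirror this article's machine;
# `kubectl version --short` is assumed to be available, as in kubectl 1.18):
#   check_min "cpu cores" "$(nproc)" 16
#   check_min "root disk GB" "$(df -BG --output=avail / | tail -n1 | tr -dc '0-9')" 100
#   k8s_minor "$(kubectl version --short | awk '/Server/ {print $3}')"
```

Keeping the comparison in one small function makes it easy to add further checks, such as memory, without repeating the reporting logic.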

3. Install Kubeflow step by step

(1) Download the kfctl binary for v1.0.2 from https://github.com/kubeflow/kfctl/releases/:
kfctl_v1.0.2-0-ga476281_linux.tar.gz

Unpack the archive and copy the binary onto the executable path:

tar -xvf kfctl_v1.0.2-0-ga476281_linux.tar.gz
sudo cp kfctl /usr/bin

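(2) Because the production server has limited bandwidth to the public internet, you can download the manifests archive on a personal machine and then place it under /root: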

wget https://github.com/kubeflow/manifests/archive/v1.0.2.tar.gz

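(3) Copy the yaml file from https://github.com/kubeflow/manifests/blob/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.2.yaml to the local /root directory.
At the end of the yaml file, change the manifests source from the remote URL to the local path where v1.0.2.tar.gz is stored: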

  repos:
  - name: manifests
    uri: file:/root/v1.0.2.tar.gz
  version: v1.0.2
status:
  reposCache:
  - localPath: '"/root/.cache/manifests/manifests-1.0.2"'
    name: manifests

(4) Manually pull the images Kubeflow needs and tag them; this has to be done on both machines

docker pull istio/sidecar_injector:1.1.6
docker pull istio/proxyv2:1.1.6
docker pull istio/proxy_init:1.1.6
docker pull istio/pilot:1.1.6
docker pull istio/mixer:1.1.6
docker pull istio/galley:1.1.6
docker pull istio/citadel:1.1.6

docker pull sw1136562366/viewer-crd-controller:0.2.5
docker pull sw1136562366/api-server:0.2.5
docker pull sw1136562366/frontend:0.2.5
docker pull sw1136562366/visualization-server:0.2.5
docker pull sw1136562366/scheduledworkflow:0.2.5
docker pull sw1136562366/persistenceagent:0.2.5
docker pull sw1136562366/envoy:metadata-grpc

docker pull sw1136562366/profile-controller:v1.0.0-ge50a8531
docker pull sw1136562366/notebook-controller:v1.0.0-gcd65ce25
docker pull sw1136562366/katib-ui:v0.8.0
docker pull sw1136562366/katib-controller:v0.8.0
docker pull sw1136562366/katib-db-manager:v0.8.0
docker pull sw1136562366/jupyter-web-app:v1.0.0-g2bd63238
docker pull sw1136562366/centraldashboard:v1.0.0-g3ec0de71
docker pull sw1136562366/tf_operator:v1.0.0-g92389064
docker pull sw1136562366/pytorch-operator:v1.0.0-g047cf0f
docker pull sw1136562366/kfam:v1.0.0-gf3e09203
docker pull sw1136562366/admission-webhook:v1.0.0-gaf96e4e3
docker pull sw1136562366/metadata:v0.1.11
docker pull sw1136562366/metadata-frontend:v0.1.8
docker pull sw1136562366/application:1.0-beta
docker pull sw1136562366/ingress-setup:latest

docker pull sw1136562366/activator:latest
docker pull sw1136562366/webhook:latest
docker pull sw1136562366/controller:latest
docker pull sw1136562366/istio:latest
docker pull sw1136562366/autoscaler-hpa:latest
docker pull sw1136562366/autoscaler:latest

docker pull sw1136562366/kfserving-controller:0.2.2
docker pull sw1136562366/ml_metadata_store_server:v0.21.1
docker pull sw1136562366/spark-operator:v1beta2-1.0.0-2.4.4
docker pull sw1136562366/kube-rbac-proxy:v0.4.0
docker pull sw1136562366/spartakus-amd64:v1.1.0

docker pull argoproj/workflow-controller:v2.3.0
docker pull argoproj/argoui:v2.3.0

docker pull mysql:5.6
docker pull minio/minio:RELEASE.2018-02-09T22-40-05Z
docker pull mysql:8.0.3
docker pull mysql:8


# tag ml-pipeline images
docker tag docker.io/sw1136562366/viewer-crd-controller:0.2.5 gcr.io/ml-pipeline/viewer-crd-controller:0.2.5
docker tag docker.io/sw1136562366/api-server:0.2.5 gcr.io/ml-pipeline/api-server:0.2.5
docker tag docker.io/sw1136562366/frontend:0.2.5 gcr.io/ml-pipeline/frontend:0.2.5
docker tag docker.io/sw1136562366/visualization-server:0.2.5 gcr.io/ml-pipeline/visualization-server:0.2.5
docker tag docker.io/sw1136562366/scheduledworkflow:0.2.5 gcr.io/ml-pipeline/scheduledworkflow:0.2.5
docker tag docker.io/sw1136562366/persistenceagent:0.2.5 gcr.io/ml-pipeline/persistenceagent:0.2.5
docker tag docker.io/sw1136562366/envoy:metadata-grpc gcr.io/ml-pipeline/envoy:metadata-grpc

# tag kubeflow-images-public images
docker tag docker.io/sw1136562366/profile-controller:v1.0.0-ge50a8531 gcr.io/kubeflow-images-public/profile-controller:v1.0.0-ge50a8531
docker tag docker.io/sw1136562366/notebook-controller:v1.0.0-gcd65ce25 gcr.io/kubeflow-images-public/notebook-controller:v1.0.0-gcd65ce25
docker tag docker.io/sw1136562366/katib-ui:v0.8.0 gcr.io/kubeflow-images-public/katib/v1alpha3/katib-ui:v0.8.0
docker tag docker.io/sw1136562366/katib-controller:v0.8.0 gcr.io/kubeflow-images-public/katib/v1alpha3/katib-controller:v0.8.0
docker tag docker.io/sw1136562366/katib-db-manager:v0.8.0 gcr.io/kubeflow-images-public/katib/v1alpha3/katib-db-manager:v0.8.0
docker tag docker.io/sw1136562366/jupyter-web-app:v1.0.0-g2bd63238 gcr.io/kubeflow-images-public/jupyter-web-app:v1.0.0-g2bd63238
docker tag docker.io/sw1136562366/centraldashboard:v1.0.0-g3ec0de71 gcr.io/kubeflow-images-public/centraldashboard:v1.0.0-g3ec0de71
docker tag docker.io/sw1136562366/tf_operator:v1.0.0-g92389064 gcr.io/kubeflow-images-public/tf_operator:v1.0.0-g92389064
docker tag docker.io/sw1136562366/pytorch-operator:v1.0.0-g047cf0f gcr.io/kubeflow-images-public/pytorch-operator:v1.0.0-g047cf0f
docker tag docker.io/sw1136562366/kfam:v1.0.0-gf3e09203 gcr.io/kubeflow-images-public/kfam:v1.0.0-gf3e09203
docker tag docker.io/sw1136562366/admission-webhook:v1.0.0-gaf96e4e3 gcr.io/kubeflow-images-public/admission-webhook:v1.0.0-gaf96e4e3
docker tag docker.io/sw1136562366/metadata:v0.1.11 gcr.io/kubeflow-images-public/metadata:v0.1.11
docker tag docker.io/sw1136562366/metadata-frontend:v0.1.8 gcr.io/kubeflow-images-public/metadata-frontend:v0.1.8
docker tag docker.io/sw1136562366/application:1.0-beta gcr.io/kubeflow-images-public/kubernetes-sigs/application:1.0-beta
docker tag docker.io/sw1136562366/ingress-setup:latest gcr.io/kubeflow-images-public/ingress-setup:latest

# tag knative-releases images
docker tag docker.io/sw1136562366/activator:latest gcr.io/knative-releases/knative.dev/serving/cmd/activator:latest
docker tag docker.io/sw1136562366/webhook:latest gcr.io/knative-releases/knative.dev/serving/cmd/webhook:latest
docker tag docker.io/sw1136562366/controller:latest gcr.io/knative-releases/knative.dev/serving/cmd/controller:latest
docker tag docker.io/sw1136562366/istio:latest gcr.io/knative-releases/knative.dev/serving/cmd/networking/istio:latest
docker tag docker.io/sw1136562366/autoscaler-hpa:latest gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:latest
docker tag docker.io/sw1136562366/autoscaler:latest gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:latest

# tag the remaining gcr.io images (kfserving, tfx-oss-public, spark-operator, kubebuilder, google_containers)
docker tag docker.io/sw1136562366/kfserving-controller:0.2.2 gcr.io/kfserving/kfserving-controller:0.2.2
docker tag docker.io/sw1136562366/ml_metadata_store_server:v0.21.1 gcr.io/tfx-oss-public/ml_metadata_store_server:v0.21.1
docker tag docker.io/sw1136562366/spark-operator:v1beta2-1.0.0-2.4.4 gcr.io/spark-operator/spark-operator:v1beta2-1.0.0-2.4.4
docker tag docker.io/sw1136562366/kube-rbac-proxy:v0.4.0 gcr.io/kubebuilder/kube-rbac-proxy:v0.4.0
docker tag docker.io/sw1136562366/spartakus-amd64:v1.1.0 gcr.io/google_containers/spartakus-amd64:v1.1.0
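Each mirrored image above follows the same pull-then-tag pattern, so the pair of commands can be wrapped in a helper. This is a sketch under assumptions: the names run, pull_and_tag, and the DRY_RUN switch are hypothetical, and by default the helper only prints the docker commands so they can be reviewed first.

```shell
MIRROR="sw1136562366"   # Docker Hub account used in the commands above

# run CMD...
# Prints the command when DRY_RUN=1 (the default), otherwise executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi
}

# pull_and_tag IMAGE:TAG TARGET_REPO
# Pulls IMAGE:TAG from the mirror and tags it as TARGET_REPO/IMAGE:TAG.
pull_and_tag() {
  run docker pull "${MIRROR}/$1"
  run docker tag "docker.io/${MIRROR}/$1" "$2/$1"
}

# Example usage (set DRY_RUN=0 to actually run docker):
#   pull_and_tag api-server:0.2.5 gcr.io/ml-pipeline
#   pull_and_tag katib-ui:v0.8.0 gcr.io/kubeflow-images-public/katib/v1alpha3
```

Because the target repository is passed explicitly, the same helper covers images whose gcr.io path differs from the plain project name, such as the katib and knative ones.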

(5) On the master, set up the environment and deploy Kubeflow

export BASE_DIR=/data
export KF_NAME=my-kubeflow
export KF_DIR=${BASE_DIR}/${KF_NAME}
export CONFIG_FILE="/root/kfctl_k8s_istio.v1.0.2.yaml"

mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl build -V -f ${CONFIG_FILE}
kfctl apply -V -f ${CONFIG_FILE}

If warnings are shown, keep waiting.
If errors are shown, they occur because the cert-manager and istio-system images take time to install; keep waiting.
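If a command exits with an error only because components are still starting, re-running it after a pause is one option. The wrapper below is a generic sketch, not something the original procedure uses: the name retry and the attempt count and delay in the usage comment are assumptions.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS CMD...
# Runs CMD until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example usage:
#   retry 5 60 kfctl apply -V -f "${CONFIG_FILE}"
```

A bounded number of attempts matters here: an error with a real cause, such as a missing API server flag, will not go away by waiting.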

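(6) If apply stops with an error
Edit /etc/kubernetes/manifests/kube-apiserver.yaml
and add --enable-aggregator-routing=true to the kube-apiserver arguments.
Then run kfctl apply -V -f ${CONFIG_FILE} again.
When the log output finishes normally, the images have been installed; next, simply wait for the pods to finish starting.
However, only some of these pods reach Running; the others need the changes in step (7).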
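The API server flag can be added without opening an editor. This is a minimal sketch with a hypothetical function name, add_apiserver_flag; it assumes the manifest lists its arguments as YAML list items directly under a "- kube-apiserver" line, which is how kubeadm normally writes the file.

```shell
# add_apiserver_flag MANIFEST FLAG
# Adds FLAG as a new argument right after the "- kube-apiserver" line,
# reusing that line's indentation. Does nothing if FLAG is already present.
add_apiserver_flag() {
  local manifest="$1" flag="$2" tmp status
  if grep -qF -- "$flag" "$manifest"; then
    return 0
  fi
  # Build the new content outside the manifests directory: the kubelet
  # reads the files in that directory as static pod definitions.
  tmp="$(mktemp)"
  awk -v flag="$flag" '
    { print }
    /^[ \t]*- kube-apiserver$/ {
      indent = $0
      sub(/- kube-apiserver$/, "", indent)
      print indent "- " flag
    }
  ' "$manifest" > "$tmp" && cat "$tmp" > "$manifest"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example usage (as root on the master):
#   add_apiserver_flag /etc/kubernetes/manifests/kube-apiserver.yaml \
#     --enable-aggregator-routing=true
```

The early return makes the helper safe to run more than once, which is useful when the whole installation is repeated.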

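(7) Modify the pod configuration

xxx stands for the name of the Deployment or StatefulSet. There are many of them, so each has to be edited one at a time; usually they are the ones whose pods are in ImagePullBackOff. After editing, delete the corresponding pod.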

# edit commands
kubectl edit deploy -n kubeflow xxx
kubectl edit sts -n kubeflow xxx

# delete command; once deleted, the pod is recreated automatically
kubectl delete pods -n kubeflow xxx

Change imagePullPolicy: Always
----> imagePullPolicy: IfNotPresent
in the following:
sts) admission-webhook-bootstrap-stateful-set
sts) kfserving-controller-manager
deploy) jupyter-web-app-deployment
deploy) ml-pipeline-viewer-controller-deployment
deploy) notebook-controller-deployment
deploy) profiles-deployment
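Finding the affected pods and changing the policy can be done from the command line instead of kubectl edit. This is a sketch: backoff_pods and set_pull_policy are hypothetical helper names, and the JSON patch only touches the first container of the pod template, so workloads with several containers may still need a manual edit.

```shell
# backoff_pods
# Reads `kubectl get pods -n kubeflow` output on stdin and prints the names
# of pods whose STATUS column is ImagePullBackOff.
backoff_pods() {
  awk 'NR > 1 && $3 == "ImagePullBackOff" { print $1 }'
}

# set_pull_policy KIND NAME
# Sets imagePullPolicy to IfNotPresent on the FIRST container only.
set_pull_policy() {
  kubectl patch "$1" "$2" -n kubeflow --type=json \
    -p='[{"op":"replace","path":"/spec/template/spec/containers/0/imagePullPolicy","value":"IfNotPresent"}]'
}

# Example usage:
#   kubectl get pods -n kubeflow | backoff_pods
#   set_pull_policy deploy jupyter-web-app-deployment
#   set_pull_policy sts kfserving-controller-manager
```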

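(8) Pods in Pending state are stuck because the databases have not come up. PVs have to be created for them, which requires the NFS service to be running and shared directories to be created.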
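The NFS side is not shown in these notes, so the following is only a sketch of one way to prepare the shares. The helper name nfs_export_lines, the client range 10.4.7.0/24, and the export options are assumptions; the directory layout /nfs/data/v1 to /nfs/data/v4 matches the PV definitions used in this step, and an NFS server package is assumed to be installed already.

```shell
# nfs_export_lines BASE_DIR COUNT CLIENTS
# Prints one /etc/exports line per shared directory BASE_DIR/v1..vCOUNT.
nfs_export_lines() {
  local i=1
  while [ "$i" -le "$2" ]; do
    echo "$1/v$i $3(rw,sync,no_root_squash)"
    i=$((i + 1))
  done
}

# Example usage on the NFS server (10.4.7.20), as root:
#   mkdir -p /nfs/data/v1 /nfs/data/v2 /nfs/data/v3 /nfs/data/v4
#   nfs_export_lines /nfs/data 4 '10.4.7.0/24' >> /etc/exports
#   exportfs -r
```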

# list the PVCs
[root@KubeflowMaster ~]#  kubectl get pvc --all-namespaces -o wide
NAMESPACE   NAME             STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
kubeflow    katib-mysql                         
kubeflow    metadata-mysql  
kubeflow    minio-pv-claim  
kubeflow    mysql-pv-claim 

# create the PVs
[root@KubeflowMaster ~]# vi pv-damo.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv001
  labels:
    name: pv001
spec:
  nfs:
    path: /nfs/data/v1
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 15Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv002
  labels:
    name: pv002
spec:
  nfs:
    path: /nfs/data/v2
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 15Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv003
  labels:
    name: pv003
spec:
  nfs:
    path: /nfs/data/v3
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 25Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv004
  labels:
    name: pv004
spec:
  nfs:
    path: /nfs/data/v4
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 25Gi

# apply the file
kubectl apply -f pv-damo.yaml
# check whether the PVCs are bound to the PVs
[root@KubeflowMaster ~]#  kubectl get pvc --all-namespaces -o wide
NAMESPACE   NAME             STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE   VOLUMEMODE
kubeflow    katib-mysql      Bound    pv002    15Gi       RWO                           23h   Filesystem
kubeflow    metadata-mysql   Bound    pv001    15Gi       RWO,RWX                       23h   Filesystem
kubeflow    minio-pv-claim   Bound    pv004    25Gi       RWO,RWX                       23h   Filesystem
kubeflow    mysql-pv-claim   Bound    pv003    25Gi       RWO,RWX                       23h   Filesystem

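At this point the database-related pods change to ContainerCreating; wait for them to become Running.
If a pod goes ContainerCreating --> CrashLoopBackOff, delete it so that it is recreated; in this setup the pod became Running after being deleted.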
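The four PV definitions in this step differ only in their index and size, so the file can be generated instead of typed. This is a sketch with hypothetical helper names, gen_pv and gen_all_pvs; the server address, paths, access modes, and sizes are copied from the definitions above.

```shell
# gen_pv INDEX SIZE
# Prints one NFS-backed PersistentVolume named pv00INDEX.
gen_pv() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv00$1
  labels:
    name: pv00$1
spec:
  nfs:
    path: /nfs/data/v$1
    server: 10.4.7.20
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: $2
EOF
}

# gen_all_pvs
# Prints pv001..pv004 (15Gi, 15Gi, 25Gi, 25Gi) separated by "---".
gen_all_pvs() {
  local i=1 size
  for size in 15Gi 15Gi 25Gi 25Gi; do
    if [ "$i" -gt 1 ]; then echo "---"; fi
    gen_pv "$i" "$size"
    i=$((i + 1))
  done
}

# Example usage:
#   gen_all_pvs > pv-damo.yaml
#   kubectl apply -f pv-damo.yaml
```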

(9) Pods in Completed state need no action; they are fine.

(10) Check the pod status

[root@KubeflowMaster ~]#  kubectl get pods --all-namespaces -o wide
NAMESPACE         NAME                                                           READY   STATUS             RESTARTS   AGE     IP             NODE             NOMINATED NODE   READINESS GATES
cert-manager      cert-manager-bcc4f4644-92csg                                   1/1     Running            6          23h     10.244.1.19    kubeflownode     <none>           <none>
cert-manager      cert-manager-cainjector-6c6cb7b87f-hsnbz                       1/1     Running            29         23h     10.244.1.17    kubeflownode     <none>           <none>
cert-manager      cert-manager-webhook-5c94b9f68b-7ws6z                          1/1     Running            0          23h     10.244.1.24    kubeflownode     <none>           <none>
istio-system      cluster-local-gateway-d7df4cdc8-l4kgb                          1/1     Running            0          23h     10.244.1.15    kubeflownode     <none>           <none>
istio-system      grafana-6875655fd8-t7zzf                                       1/1     Running            0          23h     10.244.1.4     kubeflownode     <none>           <none>
istio-system      istio-citadel-7578b8f8f8-4s4cs                                 1/1     Running            0          23h     10.244.1.14    kubeflownode     <none>           <none>
istio-system      istio-cleanup-secrets-1.1.6-tdrmb                              0/1     Completed          0          23h     10.244.1.10    kubeflownode     <none>           <none>
istio-system      istio-egressgateway-594f7cc859-lnzz7                           1/1     Running            0          23h     10.244.1.3     kubeflownode     <none>           <none>
istio-system      istio-galley-86b744bd79-k85ps                                  1/1     Running            0          23h     10.244.1.22    kubeflownode     <none>           <none>
istio-system      istio-grafana-post-install-1.1.6-g6jb8                         0/1     Completed          0          23h     10.244.1.12    kubeflownode     <none>           <none>
istio-system      istio-ingressgateway-5dbccf544-cpbp6                           1/1     Running            0          23h     10.244.1.5     kubeflownode     <none>           <none>
istio-system      istio-pilot-7564c48698-hzrh5                                   2/2     Running            3          23h     10.244.1.6     kubeflownode     <none>           <none>
istio-system      istio-policy-58988894cd-h7nc5                                  2/2     Running            12         23h     10.244.1.8     kubeflownode     <none>           <none>
istio-system      istio-security-post-install-1.1.6-v8kr2                        0/1     Completed          0          23h     10.244.1.11    kubeflownode     <none>           <none>
istio-system      istio-sidecar-injector-867f97b975-9xv4g                        1/1     Running            0          23h     10.244.1.23    kubeflownode     <none>           <none>
istio-system      istio-telemetry-68dddbcd46-ts5kn                               2/2     Running            22         23h     10.244.1.9     kubeflownode     <none>           <none>
istio-system      istio-tracing-75fcff7c44-jcxc9                                 1/1     Running            1          23h     10.244.1.7     kubeflownode     <none>           <none>
istio-system      kfserving-ingressgateway-66dbdc5b98-t7lvl                      1/1     Running            0          23h     10.244.1.18    kubeflownode     <none>           <none>
istio-system      kiali-79dbdbb6d-kltsg                                          1/1     Running            0          23h     10.244.1.13    kubeflownode     <none>           <none>
istio-system      prometheus-5c958dc796-rjw64                                    1/1     Running            1          23h     10.244.1.21    kubeflownode     <none>           <none>
knative-serving   activator-78c895d774-2rq7z                                     1/2     ImagePullBackOff   0          5h47m   10.244.1.57    kubeflownode     <none>           <none>
knative-serving   autoscaler-7cfc48cf7-sxhvn                                     1/2     ImagePullBackOff   0          5h47m   10.244.1.58    kubeflownode     <none>           <none>
knative-serving   autoscaler-hpa-867dddf-4gb8k                                   0/1     ImagePullBackOff   0          5h47m   10.244.1.52    kubeflownode     <none>           <none>
knative-serving   controller-986646d76-l48bn                                     0/1     ImagePullBackOff   0          5h47m   10.244.1.61    kubeflownode     <none>           <none>
knative-serving   networking-istio-79b5d9cf79-4cqwb                              0/1     ImagePullBackOff   0          5h47m   10.244.1.49    kubeflownode     <none>           <none>
knative-serving   webhook-5dbc56cccf-kdbnj                                       0/1     ImagePullBackOff   0          5h47m   10.244.1.64    kubeflownode     <none>           <none>
kube-system       coredns-7ff77c879f-pcnl6                                       1/1     Running            0          7h42m   10.244.1.29    kubeflownode     <none>           <none>
kube-system       coredns-7ff77c879f-znlhl                                       1/1     Running            0          7h42m   10.244.1.28    kubeflownode     <none>           <none>
kube-system       etcd-kubeflowmaster                                            1/1     Running            2          26h     10.4.7.20      kubeflowmaster   <none>           <none>
kube-system       kube-apiserver-kubeflowmaster                                  1/1     Running            0          5h49m   10.4.7.20      kubeflowmaster   <none>           <none>
kube-system       kube-controller-manager-kubeflowmaster                         1/1     Running            23         26h     10.4.7.20      kubeflowmaster   <none>           <none>
kube-system       kube-flannel-ds-cvwrg                                          1/1     Running            0          26h     10.4.7.21      kubeflownode     <none>           <none>
kube-system       kube-flannel-ds-xqdrf                                          1/1     Running            0          26h     10.4.7.20      kubeflowmaster   <none>           <none>
kube-system       kube-proxy-bq9j7                                               1/1     Running            0          26h     10.4.7.21      kubeflownode     <none>           <none>
kube-system       kube-proxy-n27fg                                               1/1     Running            0          26h     10.4.7.20      kubeflowmaster   <none>           <none>
kube-system       kube-scheduler-kubeflowmaster                                  1/1     Running            19         26h     10.4.7.20      kubeflowmaster   <none>           <none>
kubeflow          admission-webhook-bootstrap-stateful-set-0                     1/1     Running            0          149m    10.244.1.75    kubeflownode     <none>           <none>
kubeflow          admission-webhook-deployment-8545586776-d6nx9                  1/1     Running            0          149m    10.244.1.76    kubeflownode     <none>           <none>
kubeflow          application-controller-stateful-set-0                          1/1     Running            0          139m    10.244.1.100   kubeflownode     <none>           <none>
kubeflow          argo-ui-59f8d49b9-647pj                                        1/1     Running            0          7h45m   10.244.1.26    kubeflownode     <none>           <none>
kubeflow          centraldashboard-686dc58fcf-n8n6n                              1/1     Running            0          148m    10.244.1.78    kubeflownode     <none>           <none>
kubeflow          jupyter-web-app-deployment-7cdfd95648-sb7jl                    1/1     Running            0          137m    10.244.1.101   kubeflownode     <none>           <none>
kubeflow          katib-controller-5c976769d8-7dcdd                              1/1     Running            0          148m    10.244.1.80    kubeflownode     <none>           <none>
kubeflow          katib-db-manager-bf77df6d6-lpmrr                               1/1     Running            0          12m     10.244.1.118   kubeflownode     <none>           <none>
kubeflow          katib-mysql-7db488768f-jztqg                                   1/1     Running            1          5h47m   10.244.1.113   kubeflownode     <none>           <none>
kubeflow          katib-ui-6d7fbfffcb-wdhpz                                      1/1     Running            0          148m    10.244.1.82    kubeflownode     <none>           <none>
kubeflow          kfserving-controller-manager-0                                 2/2     Running            1          127m    10.244.1.105   kubeflownode     <none>           <none>
kubeflow          metacontroller-0                                               1/1     Running            0          7h45m   10.244.1.27    kubeflownode     <none>           <none>
kubeflow          metadata-db-5d56786648-k74ms                                   1/1     Running            0          5h48m   10.244.1.111   kubeflownode     <none>           <none>
kubeflow          metadata-deployment-5c7df888b9-28n8q                           1/1     Running            6          99m     10.244.1.110   kubeflownode     <none>           <none>
kubeflow          metadata-envoy-deployment-7cc78946c9-9k5k2                     1/1     Running            0          147m    10.244.1.85    kubeflownode     <none>           <none>
kubeflow          metadata-grpc-deployment-5c8545f76f-wxkvv                      1/1     Running            29         147m    10.244.1.86    kubeflownode     <none>           <none>
kubeflow          metadata-ui-665dff6f55-zb98f                                   1/1     Running            0          147m    10.244.1.87    kubeflownode     <none>           <none>
kubeflow          minio-657c66cd9-q2pdw                                          1/1     Running            0          5h47m   10.244.1.114   kubeflownode     <none>           <none>
kubeflow          ml-pipeline-669cdb6bdf-wt2hp                                   1/1     Running            0          12m     10.244.1.115   kubeflownode     <none>           <none>
kubeflow          ml-pipeline-ml-pipeline-visualizationserver-777d4b4645-pzcgh   1/1     Running            0          147m    10.244.1.90    kubeflownode     <none>           <none>
kubeflow          ml-pipeline-persistenceagent-56467f8856-5vnf4                  1/1     Running            21         147m    10.244.1.89    kubeflownode     <none>           <none>
kubeflow          ml-pipeline-scheduledworkflow-548b96d5fc-b6j5k                 1/1     Running            0          11m     10.244.1.119   kubeflownode     <none>           <none>
kubeflow          ml-pipeline-ui-6bd4778958-8v4x4                                1/1     Running            0          147m    10.244.1.92    kubeflownode     <none>           <none>
kubeflow          ml-pipeline-viewer-controller-deployment-bd64d97f9-cxh85       1/1     Running            0          125m    10.244.1.107   kubeflownode     <none>           <none>
kubeflow          mysql-8558d86476-6hc24                                         1/1     Running            0          5h47m   10.244.1.112   kubeflownode     <none>           <none>
kubeflow          notebook-controller-deployment-75bb4445c4-65n45                1/1     Running            0          123m    10.244.1.108   kubeflownode     <none>           <none>
kubeflow          profiles-deployment-5f7dd7567f-5nk82                           2/2     Running            0          121m    10.244.1.109   kubeflownode     <none>           <none>
kubeflow          pytorch-operator-6bc9c99c5-z5dng                               1/1     Running            1          12m     10.244.1.116   kubeflownode     <none>           <none>
kubeflow          seldon-controller-manager-786775d4d9-lr2lg                     1/1     Running            1          12m     10.244.1.117   kubeflownode     <none>           <none>
kubeflow          spark-operatorsparkoperator-9c559c997-664l9                    1/1     Running            0          146m    10.244.1.97    kubeflownode     <none>           <none>
kubeflow          spartakus-volunteer-5978bf56f-kq2hc                            1/1     Running            0          146m    10.244.1.98    kubeflownode     <none>           <none>
kubeflow          tensorboard-9b4c44f45-7qlcv                                    1/1     Running            0          5h47m   10.244.1.51    kubeflownode     <none>           <none>
kubeflow          tf-job-operator-5d7cc587c5-87nc6                               1/1     Running            2          11m     10.244.1.120   kubeflownode     <none>           <none>
kubeflow          workflow-controller-59ff5f7874-6fkgj                           1/1     Running            0          7h45m   10.244.1.25    kubeflownode     <none>           <none>
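With this many pods, the ones that still need attention are easy to miss. The filter below is a small sketch (the name not_ready_pods is hypothetical) that reads the same listing and keeps only pods that are neither Completed nor fully ready. Applied to the listing above, it would report the six knative-serving pods in ImagePullBackOff.

```shell
# not_ready_pods
# Reads `kubectl get pods --all-namespaces` output on stdin and prints
# "NAMESPACE/NAME STATUS" for pods that are not Completed and whose READY
# column (for example 1/2) shows fewer ready containers than expected.
not_ready_pods() {
  awk 'NR > 1 && $4 != "Completed" {
    split($3, ready, "/")
    if (ready[1] != ready[2]) print $1 "/" $2, $4
  }'
}

# Example usage:
#   kubectl get pods --all-namespaces | not_ready_pods
```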

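(11) Start the Kubeflow UI
If everything above completed normally, use k8s port forwarding to expose the UI:

# port forwarding; --pod-running-timeout waits up to 720h (about one month) for a running pod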
nohup kubectl port-forward -n istio-system svc/istio-ingressgateway 8088:80 --pod-running-timeout=720h --address=0.0.0.0 &

Then open the UI at http://hostip:8088, which in this setup is http://10.4.7.20:8088
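A process started with nohup and & leaves no obvious handle for stopping it later. One option, sketched here with hypothetical helper names (start_bg, stop_bg) and hypothetical file paths, is to record the PID and the log in files.

```shell
# start_bg LOGFILE PIDFILE CMD...
# Starts CMD in the background, detached from the terminal, and records its PID.
start_bg() {
  local log="$1" pidfile="$2"
  shift 2
  nohup "$@" >"$log" 2>&1 &
  echo $! >"$pidfile"
}

# stop_bg PIDFILE
# Stops the process recorded in PIDFILE, if any, and removes the file.
stop_bg() {
  local pidfile="$1"
  if [ -f "$pidfile" ]; then
    kill "$(cat "$pidfile")" 2>/dev/null || true
    rm -f "$pidfile"
  fi
}

# Example usage:
#   start_bg /tmp/kubeflow-ui.log /tmp/kubeflow-ui.pid \
#     kubectl port-forward -n istio-system svc/istio-ingressgateway 8088:80 --address=0.0.0.0
#   stop_bg /tmp/kubeflow-ui.pid
```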

(12) If the UI is unreachable or slow

Tutorial: https://www.cnblogs.com/xing901022/p/13455513.html
If the UI has started but the page cannot be reached, forward the port again:

export NAMESPACE=istio-system
kubectl port-forward --address 0.0.0.0 -n ${NAMESPACE} svc/istio-ingressgateway 8080:80

Then open the target address, for example localhost:8080
