Spark Pi demo with spark-on-k8s-operator

段干弘毅
2023-12-01

Install spark-on-k8s-operator

helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm install incubator/sparkoperator --namespace spark-operator

NAME:   eponymous-flee
LAST DEPLOYED: Tue Apr  9 15:22:24 2019
NAMESPACE: spark-operator
STATUS: DEPLOYED

RESOURCES:
==> v1/ClusterRole
NAME                             AGE
eponymous-flee-sparkoperator-cr  1s

==> v1/ClusterRoleBinding
NAME                              AGE
eponymous-flee-sparkoperator-crb  1s

==> v1/Deployment
NAME                          READY  UP-TO-DATE  AVAILABLE  AGE
eponymous-flee-sparkoperator  0/1    1           0          1s

==> v1/Pod(related)
NAME                                             READY  STATUS             RESTARTS  AGE
eponymous-flee-sparkoperator-549fd5fbb5-hq8b6    0/1    ContainerCreating  0         1s
singing-mandrill-sparkoperator-6f6cc7f5bf-zm8ql  1/1    Running            0         90m

==> v1/ServiceAccount
NAME                          SECRETS  AGE
eponymous-flee-spark          1        1s
eponymous-flee-sparkoperator  1        1s
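
Right after installation the operator Deployment still reports 0/1 ready, so it can help to wait until it is up before submitting a job. The following is a minimal sketch, not part of the original walkthrough: `all_ready` is a hypothetical helper name. It reads a `kubectl get deployment` style table from standard input and succeeds only when every row reports READY as n/n.

```shell
#!/bin/sh
# all_ready (hypothetical helper): exit 0 only if every data row of a
# `kubectl get deployment` style table on stdin has READY equal to n/n, n > 0.
all_ready() {
  awk '
    NR > 1 {                       # skip the header row
      split($2, r, "/")            # READY column looks like "0/1"
      if (r[1] != r[2] || r[2] == 0) bad = 1
      rows++
    }
    END { exit ((rows > 0 && !bad) ? 0 : 1) }
  '
}

# Assumed usage against a live cluster (namespace taken from the install above):
#   until kubectl get deployment -n spark-operator | all_ready; do sleep 2; done
```

An empty table (no deployments found) is treated as "not ready", so the loop keeps polling until the Deployment actually exists and is available.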

Run spark-pi

kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default

clusterrolebinding.rbac.authorization.k8s.io/spark-role-de created

Change into the spark-on-k8s-operator-1beta1-0.8.1-2.4.0 directory
and apply the configuration:
kubectl apply -f examples/spark-pi.yaml

Inspect the SparkApplication:
kubectl describe sparkapplication spark-pi

Events:
  Type    Reason                     Age                From            Message
  ----    ------                     ----               ----            -------
  Normal  SparkApplicationAdded      30m                spark-operator  SparkApplication spark-pi was added, enqueuing it for submission
  Normal  SparkApplicationAdded      30m                spark-operator  SparkApplication spark-pi was added, enqueuing it for submission
  Normal  SparkApplicationSubmitted  30m                spark-operator  SparkApplication spark-pi was submitted successfully
  Normal  SparkDriverRunning         30m                spark-operator  Driver spark-pi-driver is running
  Normal  SparkDriverRunning         30m                spark-operator  Driver spark-pi-driver is running
  Normal  SparkExecutorRunning       28m                spark-operator  Executor spark-pi-1554795261890-exec-1 is running
  Normal  SparkExecutorRunning       28m                spark-operator  Executor spark-pi-1554795261890-exec-1 is running
  Normal  SparkExecutorPending       28m (x3 over 28m)  spark-operator  Executor spark-pi-1554795261890-exec-1 is pending
  Normal  SparkExecutorPending       28m (x3 over 28m)  spark-operator  Executor spark-pi-1554795261890-exec-1 is pending
  Normal  SparkDriverCompleted       28m (x2 over 28m)  spark-operator  Driver spark-pi-driver completed
  Normal  SparkDriverCompleted       28m                spark-operator  Driver spark-pi-driver completed
  Normal  SparkApplicationCompleted  28m (x2 over 28m)  spark-operator  SparkApplication spark-pi completed
  Normal  SparkApplicationCompleted  28m                spark-operator  SparkApplication spark-pi completed

Check the run status

kubectl get pods

NAME                                               READY   STATUS      RESTARTS   AGE
spark-pi-2b608f780e8b396492abc330cf6dc2a6-driver   0/1     Completed   0          4h36m
spark-pi-ba6a0d78f88437738ff357a93b2c4ae1-driver   0/1     Completed   0          4h19m
spark-pi-d9139d8e335933c7891a817de670193e-driver   0/1     Completed   0          4h18m
spark-pi-driver                                    1/1     Running     0          61s
spark-pi-f3a5946156cb3128a579dd0b50e9d528-driver   0/1     Error       0          4h41m
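
The listing above mixes completed, running, and failed driver pods from several runs. As a small sketch (the helper name `pods_with_status` is an assumption for illustration, not part of kubectl), the STATUS column can be filtered with awk to pick out the drivers that need attention:

```shell
#!/bin/sh
# pods_with_status (hypothetical helper): print the names of pods whose STATUS
# column equals the given value. Reads `kubectl get pods` output on stdin.
pods_with_status() {
  awk -v want="$1" 'NR > 1 && $3 == want { print $1 }'
}

# Assumed usage against a live cluster:
#   kubectl get pods | pods_with_status Error
# then run `kubectl logs <name>` on each printed pod to see why it failed.
```

Applied to the listing shown here, filtering on Error would print only spark-pi-f3a5946156cb3128a579dd0b50e9d528-driver.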

kubectl logs -f spark-pi-driver

2019-04-09 07:36:16 INFO  SparkContext:54 - Starting job: reduce at SparkPi.scala:38
2019-04-09 07:36:16 INFO  DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
2019-04-09 07:36:16 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
2019-04-09 07:36:16 INFO  DAGScheduler:54 - Parents of final stage: List()
2019-04-09 07:36:16 INFO  DAGScheduler:54 - Missing parents: List()
2019-04-09 07:36:16 INFO  DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
2019-04-09 07:36:19 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 117.0 MB)
2019-04-09 07:36:20 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 117.0 MB)
2019-04-09 07:36:20 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on spark-pi-1554795261890-driver-svc.default.svc:7079 (size: 1256.0 B, free: 117.0 MB)
2019-04-09 07:36:20 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1161
2019-04-09 07:36:21 INFO  DAGScheduler:54 - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1))
2019-04-09 07:36:21 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 2 tasks
2019-04-09 07:36:21 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, 172.17.0.8, executor 1, partition 0, PROCESS_LOCAL, 7878 bytes)
2019-04-09 07:36:24 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 172.17.0.8:33775 (size: 1256.0 B, free: 117.0 MB)
2019-04-09 07:36:24 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, 172.17.0.8, executor 1, partition 1, PROCESS_LOCAL, 7878 bytes)
2019-04-09 07:36:24 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 3290 ms on 172.17.0.8 (executor 1) (1/2)
2019-04-09 07:36:25 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 501 ms on 172.17.0.8 (executor 1) (2/2)
2019-04-09 07:36:25 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2019-04-09 07:36:25 INFO  DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 7.700 s
2019-04-09 07:36:25 INFO  DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 9.276735 s
Pi is roughly 3.1375956879784397
2019-04-09 07:36:25 INFO  AbstractConnector:318 - Stopped Spark@5e76a2bb{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-04-09 07:36:26 INFO  SparkUI:54 - Stopped Spark web UI at http://spark-pi-1554795261890-driver-svc.default.svc:4040
2019-04-09 07:36:26 INFO  KubernetesClusterSchedulerBackend:54 - Shutting down all executors
2019-04-09 07:36:26 INFO  KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint:54 - Asking each executor to shut down
2019-04-09 07:36:26 WARN  ExecutorPodsWatchSnapshotSource:87 - Kubernetes client has been closed (this is expected if the application is shutting down.)
2019-04-09 07:36:27 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-04-09 07:36:28 INFO  MemoryStore:54 - MemoryStore cleared
2019-04-09 07:36:28 INFO  BlockManager:54 - BlockManager stopped
2019-04-09 07:36:28 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2019-04-09 07:36:28 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2019-04-09 07:36:28 INFO  SparkContext:54 - Successfully stopped SparkContext
2019-04-09 07:36:28 INFO  ShutdownHookManager:54 - Shutdown hook called
2019-04-09 07:36:28 INFO  ShutdownHookManager:54 - Deleting directory /var/data/spark-c4f52332-6c5b-46a6-9b33-e358d8a2e19c/spark-4bb919f7-a20b-4536-a66e-436339ede1c0
2019-04-09 07:36:28 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-54f3e7c3-e5d6-4630-970c-334a9bc6a7be
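
The driver log is long, and the only line that carries the result is the one starting with "Pi is roughly". A minimal sketch for pulling out just that value (the function name `pi_estimate` is hypothetical):

```shell
#!/bin/sh
# pi_estimate (hypothetical helper): print the number from the first
# "Pi is roughly <value>" line found on stdin.
pi_estimate() {
  awk '/^Pi is roughly / { print $4; exit }'
}

# Assumed usage against a live cluster:
#   kubectl logs spark-pi-driver | pi_estimate
```

If the job failed before printing a result, the function prints nothing, which makes it easy to tell a finished run from a broken one in a script.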

Everything is OK: the driver printed its Pi estimate and the SparkContext stopped cleanly.
