Question:

Network timeout when initializing the Spark context on Kubernetes (standalone driver)

堵睿范
2023-03-14
20/04/29 02:14:46 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://sparkrunner-0.sparkrunner:4040
20/04/29 02:14:46 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
20/04/29 02:14:46 DEBUG Config: Trying to configure client from Kubernetes config...
20/04/29 02:14:46 DEBUG Config: Did not find Kubernetes config at: [/root/.kube/config]. Ignoring.
20/04/29 02:14:46 DEBUG Config: Trying to configure client from service account...
20/04/29 02:14:46 DEBUG Config: Found service account host and port: 10.96.0.1:443
20/04/29 02:14:46 DEBUG Config: Found service account ca cert at: [/var/run/secrets/kubernetes.io/serviceaccount/ca.crt].
20/04/29 02:14:46 DEBUG Config: Found service account token at: [/var/run/secrets/kubernetes.io/serviceaccount/token].
20/04/29 02:14:46 DEBUG Config: Trying to configure client namespace from Kubernetes service account namespace path...
20/04/29 02:14:46 DEBUG Config: Found service account namespace at: [/var/run/secrets/kubernetes.io/serviceaccount/namespace].
20/04/29 02:14:46 DEBUG Config: Trying to configure client from Kubernetes config...
20/04/29 02:14:46 DEBUG Config: Did not find Kubernetes config at: [/root/.kube/config]. Ignoring.
20/04/29 02:14:46 DEBUG Config: Trying to configure client from service account...
20/04/29 02:14:46 DEBUG Config: Found service account host and port: 10.96.0.1:443
20/04/29 02:14:46 DEBUG Config: Found service account ca cert at: [/var/run/secrets/kubernetes.io/serviceaccount/ca.crt].
20/04/29 02:14:46 DEBUG Config: Found service account token at: [/var/run/secrets/kubernetes.io/serviceaccount/token].
20/04/29 02:14:46 DEBUG Config: Trying to configure client namespace from Kubernetes service account namespace path...
20/04/29 02:14:46 DEBUG Config: Found service account namespace at: [/var/run/secrets/kubernetes.io/serviceaccount/namespace].
20/04/29 02:14:57 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: External scheduler cannot be instantiated
        at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2934)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:548)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2578)
        at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$5(SparkSession.scala:896)
        at scala.Option.getOrElse(Option.scala:138)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:887)
        at sparkrunner.sparklibs.SparkSystem$.<init>(SparkSystem.scala:22)
        at sparkrunner.sparklibs.SparkSystem$.<clinit>(SparkSystem.scala)
        at sparkrunner.actors.RecipeManager$$anonfun$receive$1.applyOrElse(RecipeManager.scala:41)
        at akka.actor.Actor.aroundReceive(Actor.scala:534)
        at akka.actor.Actor.aroundReceive$(Actor.scala:532)
        at sparkrunner.actors.RecipeManager.aroundReceive(RecipeManager.scala:20)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:573)
        at akka.actor.ActorCell.invoke(ActorCell.scala:543)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:269)
        at akka.dispatch.Mailbox.run(Mailbox.scala:230)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:242)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]  for kind: [Pod]  with name: [sparkrunner-0]  in namespace: [default]  failed.
        at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
        at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:237)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:170)
        at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:59)
        at scala.Option.map(Option.scala:163)
        at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:58)
        at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:113)
        at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2928)
        ... 20 more
Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at okhttp3.internal.platform.Platform.connectSocket(Platform.java:129)
        at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:247)
        at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:167)
        at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:258)
        at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
        at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:127)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:119)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:111)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
        at okhttp3.RealCall.execute(RealCall.java:93)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:411)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:337)
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:318)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:833)
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:226)
        ... 26 more
20/04/29 02:14:57 DEBUG AbstractLifeCycle: stopping Server@68d79eec{STARTED}[9.4.z-SNAPSHOT]
20/04/29 02:14:57 DEBUG Server: doStop Server@68d79eec{STOPPING}[9.4.z-SNAPSHOT]
20/04/29 02:14:57 DEBUG QueuedThreadPool: ran SparkUI-59-acceptor-0@2b94b939-ServerConnector@79ce3216{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/04/29 02:14:57 DEBUG AbstractHandlerContainer: Graceful shutdown Server@68d79eec{STOPPING}[9.4.z-SNAPSHOT] by 
20/04/29 02:14:57 DEBUG AbstractLifeCycle: stopping Spark@79ce3216{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/04/29 02:14:57 DEBUG AbstractLifeCycle: stopping SelectorManager@Spark@79ce3216{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/04/29 02:14:57 DEBUG AbstractLifeCycle: stopping ManagedSelector@8993a98{STARTED} id=3 keys=0 selected=0 updates=0
➜  kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-26T06:16:15Z", GoVersion:"go1.14", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:50:46Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

I have set up the cluster as described here:

https://spark.apache.org/docs/latest/running-on-kubernetes.html

The Kubernetes client doesn't seem to be able to talk to the API server, and I'm trying to figure out why.

Any other ideas for debugging this?
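For context, the driver creates the session via SparkSession.builder().getOrCreate() (see SparkSystem.scala:22 in the stack trace above). Below is only a minimal sketch of an in-cluster, client-mode session setup; the object name and every literal value (master URL, pod name, namespace, service account, image) are illustrative assumptions, not my exact configuration:

import org.apache.spark.sql.SparkSession

// Minimal sketch of a driver-side (client-mode) session running inside a pod.
// Every literal below is an illustrative assumption, not the real setup.
object SparkSystemSketch {
  val spark: SparkSession = SparkSession.builder()
    .appName("sparkrunner")
    // API server address, typically taken from `kubectl cluster-info`
    .master("k8s://https://kubernetes.default.svc:443")
    // Pod hosting this driver, so the scheduler backend can look it up
    .config("spark.kubernetes.driver.pod.name", "sparkrunner-0")
    .config("spark.kubernetes.namespace", "default")
    // Service account the fabric8 client uses for API calls
    .config("spark.kubernetes.authenticate.driver.serviceAccountName", "spark")
    .config("spark.kubernetes.container.image", "spark:3.0.0")
    .getOrCreate()
}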

1 Answer

诸葛品
2023-03-14

The problem here seems to be with the k8s API address, as reported by:

kubectl cluster-info

That command will give you the following address:

k8s://https://kubernetes.default.svc:32768
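If the address the driver is using does not match what kubectl cluster-info reports, one way to avoid the mismatch is to build the k8s:// master URL from the environment variables Kubernetes injects into every pod. A minimal sketch (the fallback values are assumptions):

// Sketch: derive the master URL from the in-cluster environment variables
// KUBERNETES_SERVICE_HOST / KUBERNETES_SERVICE_PORT that Kubernetes sets in
// every pod; the fallback values here are illustrative assumptions.
val apiHost   = sys.env.getOrElse("KUBERNETES_SERVICE_HOST", "kubernetes.default.svc")
val apiPort   = sys.env.getOrElse("KUBERNETES_SERVICE_PORT", "443")
val masterUrl = s"k8s://https://$apiHost:$apiPort"  // e.g. k8s://https://10.96.0.1:443

This matches the service-account host and port (10.96.0.1:443) that the fabric8 client already discovered in the DEBUG log above, so it is also a quick way to confirm which address the timeout is actually hitting.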
