Question:

Apache Flink job cluster: jobmanager.rpc.address binds to localhost on Kubernetes

相旭
2023-03-14

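I am trying to run a Flink job cluster (1.8.1) in a Kubernetes environment. I built a Docker image containing my job JAR by following this document.

Following the kube files, I created the job, the job manager, and the task managers. The problem is that the task managers cannot connect to the job manager and keep crashing.

While debugging the job manager logs, I found that jobmanager.rpc.address is bound to "localhost".

However, I have passed the argument in the kube file as the document describes.

I also tried setting jobmanager.rpc.address through an environment variable (FLINK_ENV_JAVA_OPTS).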

  env:
    - name: FLINK_ENV_JAVA_OPTS
      value: "-Djobmanager.rpc.address=flink-job-cluster"

Job manager console log:

Starting the job-cluster
Starting standalonejob as a console application on host flink-job-cluster-bbxrn.
2019-07-16 17:31:10,759 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------
2019-07-16 17:31:10,760 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Starting StandaloneJobClusterEntryPoint (Version: <unknown>, Rev:4caec0d, Date:03.04.2019 @ 13:25:54 PDT)
2019-07-16 17:31:10,760 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  OS current user: flink
2019-07-16 17:31:10,761 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Current Hadoop/Kerberos user: <no hadoop dependency found>
2019-07-16 17:31:10,761 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JVM: OpenJDK 64-Bit Server VM - IcedTea - 1.8/25.212-b04
2019-07-16 17:31:10,761 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Maximum heap size: 989 MiBytes
2019-07-16 17:31:10,761 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JAVA_HOME: /usr/lib/jvm/java-1.8-openjdk/jre
2019-07-16 17:31:10,761 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  No Hadoop Dependency available
2019-07-16 17:31:10,761 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JVM Options:
2019-07-16 17:31:10,761 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Xms1024m
2019-07-16 17:31:10,761 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Xmx1024m
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Djobmanager.rpc.address=flink-job-cluster
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlog4j.configuration=file:/opt/flink-1.8.1/conf/log4j-console.properties
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlogback.configurationFile=file:/opt/flink-1.8.1/conf/logback-console.xml
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Program Arguments:
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     --configDir
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     /opt/flink-1.8.1/conf
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     --job-classname
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     wikiedits.WikipediaAnalysis
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     --host
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     flink-job-cluster
2019-07-16 17:31:10,762 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Djobmanager.rpc.address=flink-job-cluster
2019-07-16 17:31:10,763 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dparallelism.default=2
2019-07-16 17:31:10,763 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dblob.server.port=6124
2019-07-16 17:31:10,763 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dqueryable-state.server.ports=6125
2019-07-16 17:31:10,763 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Classpath: /opt/flink-1.8.1/lib/log4j-1.2.17.jar:/opt/flink-1.8.1/lib/slf4j-log4j12-1.7.15.jar:/opt/flink-1.8.1/lib/wiki-edits-0.1.jar:/opt/flink-1.8.1/lib/flink-dist_2.11-1.8.1.jar:::
2019-07-16 17:31:10,763 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------
2019-07-16 17:31:10,764 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Registered UNIX signal handlers for [TERM, HUP, INT]
2019-07-16 17:31:10,850 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2019-07-16 17:31:10,851 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2019-07-16 17:31:10,851 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2019-07-16 17:31:10,851 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2019-07-16 17:31:10,851 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-07-16 17:31:10,851 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1

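The log above shows that jobmanager.rpc.address is loaded as localhost rather than flink-job-cluster.

I assume the task managers' messages are dropped by Akka RPC because the job manager is bound to localhost:6123.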

2019-07-16 17:31:12,546 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 38190f2570cd5f0a0a47f65ddf7aae1f with allocation id 97af00eae7e3dfb31a79232077ea7ee6.
2019-07-16 17:31:14,043 ERROR akka.remote.EndpointWriter                                    - dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@flink-job-cluster:6123/]] arriving at [akka.tcp://flink@flink-job-cluster:6123] inbound addresses are [akka.tcp://flink@localhost:6123]
2019-07-16 17:31:26,564 ERROR akka.remote.EndpointWriter                                    - dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@flink-job-cluster:6123/]] arriving at [akka.tcp://flink@flink-job-cluster:6123] inbound addresses are [akka.tcp://flink@localhost:6123]

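I don't know why the job manager binds to localhost.

Note: the task manager pods can resolve the flink-job-cluster host. The hostname resolves to the service IP address.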


1 answer

昝光临
2023-03-14

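The root cause was that the jobmanager.rpc.address arg value was not applied. For some reason, args written inline were not set correctly in the Flink global configuration, whereas args passed as a multi-line list worked fine.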
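
The multi-line form mentioned in this answer can be sketched as a container spec fragment. This is only an illustration under stated assumptions, not the asker's actual manifest: the container name and image are hypothetical placeholders, the argument values are copied from the "Program Arguments" section of the job manager log in the question, and the leading `job-cluster` entry is an assumption about how the image's entrypoint selects job-cluster mode.

```yaml
containers:
  - name: flink-job-cluster          # hypothetical container name
    image: my-flink-job:latest       # hypothetical image containing the job JAR
    # Each argument is its own list item, instead of an inline `args: [...]` value.
    args:
      - "job-cluster"                # assumption: mode keyword read by the image entrypoint
      - "--job-classname"
      - "wikiedits.WikipediaAnalysis"
      - "--host"
      - "flink-job-cluster"
      - "-Djobmanager.rpc.address=flink-job-cluster"
      - "-Dparallelism.default=2"
      - "-Dblob.server.port=6124"
      - "-Dqueryable-state.server.ports=6125"
```

If the arguments are picked up, the Akka "inbound addresses" reported in the job manager log would be expected to show flink-job-cluster instead of localhost.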
