当前位置: 首页 > 知识库问答 >
问题:

在库伯内特斯的远程Flink集群上运行Apache Beam作业的问题

太叔坚
2023-03-14

我有一个部署在远程Kubernetes集群上的Flink SessionCluster(根据文档),可在

但是,每次我收到错误——看起来我无法成功提交作业以供执行。错误日志和Beam的Pipeline选项如下;我将非常感谢有关如何解决此问题的一些提示(我不是经验丰富的Flink/Beam用户,因此请原谅任何明显的错误)。

管道选项:

PipelineOptions(
    "--runner=FlinkRunner",
    "--flink_master=<FLINK_MASTER_URL>:8081"
)

错误日志(已截断):

WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
INFO:root:Using Python SDK docker image: apache/beam_python3.7_sdk:2.21.0. If the image is not available at local, we will try to pull from hub.docker.com
INFO:apache_beam.runners.portability.fn_api_runner.translations:==================== <function lift_combiners at 0x7f954007e710> ====================
INFO:apache_beam.runners.portability.flink_runner:Adding HTTP protocol scheme to flink_master parameter: http://<FLINK_MASTER_URL>:8081
INFO:apache_beam.utils.subprocess_server:Using cached job server jar from https://repo.maven.apache.org/maven2/org/apache/beam/beam-runners-flink-1.10-job-server/2.21.0/beam-runners-flink-1.10-job-server-2.21.0.jar
INFO:apache_beam.utils.subprocess_server:Starting service with ['java' '-jar' '/home/rjurczak/.apache_beam/cache/jars/beam-runners-flink-1.10-job-server-2.21.0.jar' '--flink-master' 'http://<FLINK_MASTER_URL>:8081' '--artifacts-dir' '/tmp/beam-tempht7lpipz/artifactsotk2otzl' '--job-port' '48375' '--artifact-port' '0' '--expansion-port' '0']
INFO:apache_beam.utils.subprocess_server:b'[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - LegacyArtifactStagingService started on localhost:37645'
INFO:apache_beam.utils.subprocess_server:b'[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - Java ExpansionService started on localhost:35547'
INFO:apache_beam.utilradars.subprocess_server:b'[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - JobService started on localhost:48375'
INFO:apache_beam.utils.subprocess_server:b'[grpc-default-executor-0] INFO org.apache.beam.runners.flink.FlinkJobInvoker - Invoking job BeamApp-rjurczak-0713164027-2a729669_d00db59c-cda9-46be-9bd8-1b8406d155a5 with pipeline runner org.apache.beam.runners.flink.FlinkPipelineRunner@25fa0e2d'
INFO:apache_beam.utils.subprocess_server:b'[grpc-default-executor-0] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation - Starting job invocation BeamApp-rjurczak-0713164027-2a729669_d00db59c-cda9-46be-9bd8-1b8406d155a5'
INFO:apache_beam.runners.portability.portable_runner:Job state changed to STOPPED
INFO:apache_beam.runners.portability.portable_runner:Job state changed to STARTING
INFO:apache_beam.runners.portability.portable_runner:Job state changed to RUNNING
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.beam.runners.flink.FlinkPipelineRunner - Translating pipeline to Flink program.'
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.beam.runners.flink.FlinkExecutionEnvironments - Creating a Batch Execution Environment.'
INFO:apache_beam.utilradars.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.beam.runners.flink.FlinkExecutionEnvironments - Using Flink Master URL 10.70.227.141:8081.'
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.flink.api.java.ExecutionEnvironment - The job has 0 registered types and 0 default Kryo serializers'
INFO:apache_beam.utils.subprocess_server:b'[Flink-RestClusterClient-IO-thread-4] WARN org.apache.flink.util.ExecutorUtils - ExecutorService did not terminate in time. Shutting it down now.'
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] ERROR org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation - Error during job invocation BeamApp-rjurczak-0713164027-2a729669_d00db59c-cda9-46be-9bd8-1b8406d155a5.'
INFO:apache_beam.utils.subprocess_server:b'java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.flink.utilradar.ExceptionUtils.rethrow(ExceptionUtils.java:199)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:952)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:860)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:194)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:116)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:83)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:83)'radarradar
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.lang.Thread.run(Thread.java:834)'
INFO:apache_beam.utils.subprocess_server:b'Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:947)'
INFO:apache_beam.utils.subprocess_server:b'\t... 11 more'
INFO:apache_beam.utils.subprocess_server:b'Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$7(RestClusterClient.java:359)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$8(FutureUtils.java:274)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1085)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)'
INFO:apache_beam.utils.subprocess_server:b'\t... 3 more'
INFO:apache_beam.utils.subprocess_server:b'Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Failed to deserialize JobGraph.]'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:390)'
INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:374)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)'
INFO:apache_beam.utils.subprocess_server:b'\t... 4 more'
ERROR:root:org.apache.flink.runtime.rest.util.RestClientException: [Failed to deserialize JobGraph.]
INFO:apache_beam.utils.subprocess_server:b'[flink-runner-job-invoker] INFO org.apache.beam.runners.fnexecution.artifact.AbstractLegacyArtifactRetrievalService - Manifest at /tmp/beam-tempht7lpipz/artifactsotk2otzl/job_364b1df0-7e66-4759-997f-91f87179932b/MANIFEST has 1 artifact locations'
INFO:apache_beam.runners.portability.portable_runner:Job state changed to FAILED
Traceback (most recent call last):
  File "examples/wordcount.py", line 152, in <module>
    run()
  File "examples/wordcount.py", line 132, in run
    result.wait_until_finish()
  File "/home/rjurczak/envs/env/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 550, in wait_until_finish
    (self._job_id, self._state, self._last_error_message()))
RuntimeError: Pipeline BeamApp-rjurczak-0713164027-2a729669_d00db59c-cda9-46be-9bd8-1b8406d155a5 failed in state FAILED: org.apache.flink.runtime.rest.util.RestClientException: [Failed to deserialize JobGraph.]

共有1个答案

司徒正信
2023-03-14

这是一个相当简短的回答。看起来与这里的错误相同。

确保CLI Flink版本与在Kubernetes上运行的Flink master的版本相匹配。

 类似资料:
  • 我正在尝试在Kubernetes集群(Azure AKS)中部署Flink作业。作业群集在启动后立即中止,但任务管理器运行正常。 docker镜像创建成功,没有任何异常。我可以运行docker镜像,也可以SSHdocker镜像。 我已经按照以下链接中提到的步骤: https://github.com/apache/flink/tree/release-1.9/flink-container/kub

  • 我假设没有愚蠢的问题,所以这里有一个我找不到直接答案的问题。 现在的情况 我目前有一个运行1.15的Kubernetes集群。AKS上的x,通过Terraform部署和管理。AKS最近宣布Azure将在AKS上停用Kubernetes的1.15版本,我需要将集群升级到1.16或更高版本。现在,据我所知,直接在Azure中升级集群不会对集群的内容产生任何影响,即节点、豆荚、秘密和当前在那里的所有其他

  • 我正在尝试让cadence在kubernetes集群上运行。然而,我注意到Cadence服务器初始化中有一个bug,它阻止Cassandra脚本正确初始化模式。https://github.com/uber/cadence/issues/1713:所以我想我会手动完成这一步。我执行了以下步骤- < li >在docker compose上从https://raw . githubuserconte

  • 目前,我为我的Kubernetes资源创建了Helm图表,并试图从配置Helm客户端和kubectl的本地机器部署到远程Kubernetes集群上。我用下面的命令创建了舵图, 创建后,我编辑了my-图表/values.yaml.中的图像值现在我需要在我的远程库伯内特斯集群上部署这个docker图像 我的困惑 这里我的困惑是,当我部署时,我只需要使用“helm install”命令吗?它会部署在我的

  • 我一直在努力让DNS插件在CentOS 7.2集群上工作。我使用以下说明安装了群集:http://severalnines.com/blog/installing-kubernetes-cluster-minions-centos7-manage-pods-services 在此配置中,主服务器正在运行:etcd、库贝-调度器、库贝-apiserver和库贝-控制器-管理器。这些节点正在运行:do