当前位置: 首页 > 知识库问答 >
问题:

Flink工作部署在库伯内特斯

富勇军
2023-03-14

我正在尝试在Kubernetes集群(Azure AKS)中部署Flink作业。作业群集在启动后立即中止,但任务管理器运行正常。

docker镜像创建成功,没有任何异常。我可以运行docker镜像,也可以SSHdocker镜像。

我已经按照以下链接中提到的步骤:

https://github.com/apache/flink/tree/release-1.9/flink-container/kubernetes

在创建映像时,我提供了作业jar,它已被复制到映像内的“/opt/工件”上。但仍然不明白为什么在作业集群pod日志中低于异常。

Caused by: org.apache.flink.util.FlinkException: Failed to find job JAR on class path. Please provide the job class name explicitly.

我是库伯内特斯新来的,你能给我一些调试这个问题的线索吗?

请查看以下完整日志:

A、 flink作业群集吊舱日志

develk@ACIDLAELKV01:~/cntx_eng$ kubectl logs flink-job-cluster-kszwf
Starting the job-cluster
Starting standalonejob as a console application on host flink-job-cluster-kszwf.
2019-12-12 10:37:17,170 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------
2019-12-12 10:37:17,172 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Starting StandaloneJobClusterEntryPoint (Version: 1.8.0, Rev:4caec0d, Date:03.04.2019 @ 13:25:54 PDT)
2019-12-12 10:37:17,172 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  OS current user: flink
2019-12-12 10:37:17,173 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Current Hadoop/Kerberos user: <no hadoop dependency found>
2019-12-12 10:37:17,173 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JVM: OpenJDK 64-Bit Server VM - IcedTea - 1.8/25.212-b04
2019-12-12 10:37:17,173 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Maximum heap size: 989 MiBytes
2019-12-12 10:37:17,173 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JAVA_HOME: /usr/lib/jvm/java-1.8-openjdk/jre
2019-12-12 10:37:17,174 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  No Hadoop Dependency available
2019-12-12 10:37:17,174 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JVM Options:
2019-12-12 10:37:17,174 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Xms1024m
2019-12-12 10:37:17,174 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Xmx1024m
2019-12-12 10:37:17,174 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlog4j.configuration=file:/opt/flink-1.8.0/conf/log4j-console.properties
2019-12-12 10:37:17,175 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlogback.configurationFile=file:/opt/flink-1.8.0/conf/logback-console.xml
2019-12-12 10:37:17,175 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Program Arguments:
2019-12-12 10:37:17,175 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     --configDir
2019-12-12 10:37:17,175 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     /opt/flink-1.8.0/conf
2019-12-12 10:37:17,175 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Djobmanager.rpc.address=flink-job-cluster
2019-12-12 10:37:17,175 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dparallelism.default=1
2019-12-12 10:37:17,176 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dblob.server.port=6124
2019-12-12 10:37:17,176 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dqueryable-state.server.ports=6125
2019-12-12 10:37:17,176 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Classpath: /opt/flink-1.8.0/lib/log4j-1.2.17.jar:/opt/flink-1.8.0/lib/slf4j-log4j12-1.7.15.jar:/opt/flink-1.8.0/lib/flink-dist_2.11-1.8.0.jar:::
2019-12-12 10:37:17,176 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------
2019-12-12 10:37:17,178 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Registered UNIX signal handlers for [TERM, HUP, INT]
2019-12-12 10:37:17,306 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2019-12-12 10:37:17,306 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2019-12-12 10:37:17,307 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2019-12-12 10:37:17,307 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2019-12-12 10:37:17,307 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-12-12 10:37:17,307 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2019-12-12 10:37:17,336 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Starting StandaloneJobClusterEntryPoint.
2019-12-12 10:37:17,336 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Install default filesystem.
2019-12-12 10:37:17,343 INFO  org.apache.flink.core.fs.FileSystem                           - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available.
2019-12-12 10:37:17,352 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Install security context.
2019-12-12 10:37:17,362 INFO  org.apache.flink.runtime.security.modules.HadoopModuleFactory  - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
2019-12-12 10:37:17,381 INFO  org.apache.flink.runtime.security.SecurityUtils               - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath.
2019-12-12 10:37:17,382 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Initializing cluster services.
2019-12-12 10:37:17,638 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Trying to start actor system at flink-job-cluster:6123
2019-12-12 10:37:18,163 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2019-12-12 10:37:18,237 INFO  akka.remote.Remoting                                          - Starting remoting
2019-12-12 10:37:18,366 INFO  akka.remote.Remoting                                          - Remoting started; listening on addresses :[akka.tcp://flink@flink-job-cluster:6123]
2019-12-12 10:37:18,375 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Actor system started at akka.tcp://flink@flink-job-cluster:6123
2019-12-12 10:37:18,398 INFO  org.apache.flink.configuration.Configuration                  - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2019-12-12 10:37:18,407 INFO  org.apache.flink.runtime.blob.BlobServer                      - Created BLOB server storage directory /tmp/blobStore-63338044-67c1-4872-a3d9-c94563b3a7c3
2019-12-12 10:37:18,412 INFO  org.apache.flink.runtime.blob.BlobServer                      - Started BLOB server at 0.0.0.0:6124 - max concurrent requests: 50 - max backlog: 1000
2019-12-12 10:37:18,428 INFO  org.apache.flink.runtime.metrics.MetricRegistryImpl           - No metrics reporter configured, no metrics will be exposed/reported.
2019-12-12 10:37:18,430 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Trying to start actor system at flink-job-cluster:0
2019-12-12 10:37:18,464 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2019-12-12 10:37:18,472 INFO  akka.remote.Remoting                                          - Starting remoting
2019-12-12 10:37:18,480 INFO  akka.remote.Remoting                                          - Remoting started; listening on addresses :[akka.tcp://flink-metrics@flink-job-cluster:33529]
2019-12-12 10:37:18,482 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Actor system started at akka.tcp://flink-metrics@flink-job-cluster:33529
2019-12-12 10:37:18,490 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Created BLOB cache storage directory /tmp/blobStore-ba64dcdb-5095-41fc-9c98-0f1528d95c40
2019-12-12 10:37:18,514 INFO  org.apache.flink.configuration.Configuration                  - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2019-12-12 10:37:18,515 WARN  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Upload directory /tmp/flink-web-f6be0c2d-5099-4bd6-bc72-a0ae1fc6448e/flink-web-upload does not exist, or has been deleted externally. Previously uploaded files are no longer available.
2019-12-12 10:37:18,516 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Created directory /tmp/flink-web-f6be0c2d-5099-4bd6-bc72-a0ae1fc6448e/flink-web-upload for file uploads.
2019-12-12 10:37:18,603 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Starting rest endpoint.
2019-12-12 10:37:18,872 WARN  org.apache.flink.runtime.webmonitor.WebMonitorUtils           - Log file environment variable 'log.file' is not set.
2019-12-12 10:37:18,872 WARN  org.apache.flink.runtime.webmonitor.WebMonitorUtils           - JobManager log files are unavailable in the web dashboard. Log file location not found in environment variable 'log.file' or configuration key 'Key: 'web.log.path' , default: null (fallback keys: [{key=jobmanager.web.log.path, isDeprecated=true}])'.
2019-12-12 10:37:19,115 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Rest endpoint listening at flink-job-cluster:8081
2019-12-12 10:37:19,116 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - http://flink-job-cluster:8081 was granted leadership with leaderSessionID=00000000-0000-0000-0000-000000000000
2019-12-12 10:37:19,116 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Web frontend listening at http://flink-job-cluster:8081.
2019-12-12 10:37:19,239 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Starting RPC endpoint for org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at akka://flink/user/resourcemanager .
2019-12-12 10:37:19,262 INFO  org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever  - Scanning class path for job JAR
2019-12-12 10:37:19,270 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Shutting down rest endpoint.
2019-12-12 10:37:19,295 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Removing cache directory /tmp/flink-web-f6be0c2d-5099-4bd6-bc72-a0ae1fc6448e/flink-web-ui
2019-12-12 10:37:19,299 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - http://flink-job-cluster:8081 lost leadership
2019-12-12 10:37:19,299 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Shut down complete.
2019-12-12 10:37:19,302 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Shutting StandaloneJobClusterEntryPoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
        at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:172)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:171)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:535)
        at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:105)
Caused by: org.apache.flink.util.FlinkException: Failed to find job JAR on class path. Please provide the job class name explicitly.
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.getJobClassNameOrScanClassPath(ClassPathJobGraphRetriever.java:131)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:114)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:96)
        at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
        at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
        at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
        ... 6 more
Caused by: java.util.NoSuchElementException: No JAR with manifest attribute for entry class
        at org.apache.flink.container.entrypoint.JarManifestParser.findOnlyEntryClass(JarManifestParser.java:80)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.scanClassPathForJobJar(ClassPathJobGraphRetriever.java:137)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.getJobClassNameOrScanClassPath(ClassPathJobGraphRetriever.java:129)
        ... 11 more
.
2019-12-12 10:37:19,305 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:6124
2019-12-12 10:37:19,305 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Shutting down BLOB cache
2019-12-12 10:37:19,315 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2019-12-12 10:37:19,320 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2019-12-12 10:37:19,321 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2019-12-12 10:37:19,323 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-12 10:37:19,325 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-12 10:37:19,354 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2019-12-12 10:37:19,356 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2019-12-12 10:37:19,378 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopped Akka RPC service.
2019-12-12 10:37:19,382 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Could not start cluster entrypoint StandaloneJobClusterEntryPoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneJobClusterEntryPoint.
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:535)
        at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:105)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
        at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:172)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:171)
        ... 2 more
Caused by: org.apache.flink.util.FlinkException: Failed to find job JAR on class path. Please provide the job class name explicitly.
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.getJobClassNameOrScanClassPath(ClassPathJobGraphRetriever.java:131)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:114)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:96)
        at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
        at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
        at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
        ... 6 more
Caused by: java.util.NoSuchElementException: No JAR with manifest attribute for entry class
        at org.apache.flink.container.entrypoint.JarManifestParser.findOnlyEntryClass(JarManifestParser.java:80)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.scanClassPathForJobJar(ClassPathJobGraphRetriever.java:137)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.getJobClassNameOrScanClassPath(ClassPathJobGraphRetriever.java:129)
        ... 11 more
develk@ACIDLAELKV01:~/cntx_eng$

现在,我已经在“Job cluster Job.yaml.template”文件的参数部分添加了作业类名。

如下所示:

args: ["job-cluster", 
               "--job-classname", "com.flink.wordCountSimple",
               "-Djobmanager.rpc.address=flink-job-cluster",

但在那之后,我得到了以下例外:

Caused by: org.apache.flink.util.FlinkException: Could not load the provided entrypoint class.

请参阅下面的详细日志。

2019-12-13 19:08:34,323 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Shut down complete.
2019-12-13 19:08:34,329 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Shutting StandaloneJobClusterEntryPoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
        at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:172)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:171)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:535)
        at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:105)
Caused by: org.apache.flink.util.FlinkException: Could not load the provided entrypoint class.
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:119)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:96)
        at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
        at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
        at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
        ... 6 more
Caused by: java.lang.ClassNotFoundException: com.flink.wordCountSimple
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:116)
        ... 10 more
.
2019-12-13 19:08:34,337 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:6124
2019-12-13 19:08:34,338 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Shutting down BLOB cache
2019-12-13 19:08:34,364 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2019-12-13 19:08:34,368 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2019-12-13 19:08:34,372 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-13 19:08:34,392 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2019-12-13 19:08:34,392 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-13 19:08:34,406 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2019-12-13 19:08:34,410 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2019-12-13 19:08:34,434 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopped Akka RPC service.
2019-12-13 19:08:34,443 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Could not start cluster entrypoint StandaloneJobClusterEntryPoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneJobClusterEntryPoint.
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:535)
        at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:105)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
        at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:172)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:171)
        ... 2 more
Caused by: org.apache.flink.util.FlinkException: Could not load the provided entrypoint class.
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:119)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:96)
        at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
        at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
        at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
        ... 6 more
Caused by: java.lang.ClassNotFoundException: com.flink.wordCountSimple
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:116)
        ... 10 more

共有2个答案

易博文
2023-03-14
version: "2.1"
services:
  jobmanager:
    build:
      context: ./
      args: 
        JAR_FILE: flink-event-tracker-bundled-1.6.0.jar
    image: test/flink-event-tracker
    expose:
      - "6123"
    ports:
      - "8081:8081"
      - "6123:6123"
    command: job-cluster --job-classname com.company.test.flink.pipelines.KafkaPipelineConsumer -Djobmanager.rpc.address=jobmanager --runner=FlinkRunner --streaming=true --checkpointingInterval=30000
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
      - JOB_MANAGER=jobmanager
    volumes:
      - data-volume:/docker/volumes

  taskmanager:
    image: test/flink-event-tracker
    expose:
      - "6121"
      - "6122"
    depends_on:
      - jobmanager
    command: task-manager -Djobmanager.rpc.address=jobmanager
    links:
      - "jobmanager:jobmanager"
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
      - JOB_MANAGER=jobmanager
    volumes:
      - data-volume:/docker/volumes

volumes:
  data-volume:
    driver: local
    driver_opts:
        o: bind
        type: none
        device: /Users/home/Development/docker/volumes/flink

Docker文件

FROM flink:1.9

ARG JAR_FILE=""

ENV APP_OPTS ""
ENV JAVA_OPTS ""
ENV JOB_MANAGER=""

# Build arg allows passing the version at runtime
ARG VERSION=unset-version

COPY flink-conf.yml $FLINK_HOME/conf/flink-conf.yaml

COPY  target/$JAR_FILE $FLINK_HOME/lib/event-tracker.jar

COPY docker-cluster-entrypoint.sh /docker-cluster-entrypoint.sh

RUN apt-get update && apt-get install procps -y && apt-get install curl -y
RUN echo "root:root" | chpasswd

RUN chmod 777 /docker-cluster-entrypoint.sh
RUN chmod 777 $FLINK_HOME/lib/event-tracker.jar

ENTRYPOINT [ "bash","/docker-cluster-entrypoint.sh" ]

docker群集入口点。上海

FLINK_HOME=${FLINK_HOME:-"/opt/flink/bin"}

JOB_CLUSTER="job-cluster"
TASK_MANAGER="task-manager"

CMD="$1"
shift;

if [ "${CMD}" = "--help" -o "${CMD}" = "-h" ]; then
    echo "Usage: $(basename $0) (${JOB_CLUSTER}|${TASK_MANAGER})"
    exit 0
elif [ "${CMD}" = "${JOB_CLUSTER}" -o "${CMD}" = "${TASK_MANAGER}" ]; then
    echo "Starting the ${CMD}"

    if [ "${CMD}" = "${TASK_MANAGER}" ]; then
        exec $FLINK_HOME/bin/taskmanager.sh start-foreground "$@"
    else
        exec $FLINK_HOME/bin/standalone-job.sh start-foreground "$@"
    fi
fi
mvn clean install

docker-compose -f docker-compose.local.yml up --scale taskmanager=2 >  exceptionlog.log

docker-compose -f docker-compose.local.yml build

这是运行docker的整个conf。但如果您想在kube中运行,只需将docker compose文件转换为其相应的kube文件。。。剩余部分可以保持不变。。可能是这样做的,库贝维护更好。

注意:-我们使用apache beam对作业进行编码

郎魁
2023-03-14

在kubernetes上有一个创建和运行flink作业集群的完整工作示例https://github.com/alpinegizmo/flink-containers-example.也许这会有帮助。另请参见https://www.youtube.com/watch?v=ceZtUDgh2TE.

 类似资料:
  • 据我所知,作业对象应该在一定时间后收获豆荚。但是在我的GKE集群(库伯内特斯1.1.8)上,“kubectl get pods-a”似乎可以列出几天前的豆荚。 所有这些都是使用乔布斯API创建的。 我确实注意到在使用 kubectl 删除作业后,pod 也被删除了。 我在这里主要担心的是,我将在批量作业中在集群上运行成千上万个pod,并且不想让内部待办系统过载。

  • 我有以下代码: 我创建了一个包含上述Python代码的映像的部署。 当我使用my Python代码不会创建sig文件指示,也不会打印“完成”消息。 点击此链接:https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-best-practices-terminating-with-grace我看到k8s发送SIG

  • 我确实部署了单吊舱,自定义docker映像如下: 在开发过程中,我希望推送新的最新版本并更新部署。如果不明确定义标记/版本并为每个构建增加它,就找不到如何做到这一点,并且

  • 我在windows 10中创建了两个在我的minikube环境中运行的POD。一个POD带有Spring boot应用程序容器,另一个POD带有mysql容器。对于Spring boot应用程序,服务类型为nodePort,对于MYSQL pod,服务类型为club sterIP。这意味着Mysql pod只需要在集群内部进行通信。但是对于Spring boot应用程序,需要从浏览器访问,所以我配

  • 在了解了可以传递给Java 8虚拟机以使其具有容器感知能力的参数(即-XX:UnlockExperimentalVMOptions-XX:UseCGroupMemoryLimitForHeap)之后,我试图将这些参数添加到我的Kubernetes部署中,用于Spring Boot服务。 在部署YAML文件的容器部分,我有以下内容: 在我的Dockerfile中,我有: 我似乎不明白为什么最大堆大小

  • 我希望能够使用 Jenkins 执行的 Helm 图表(作为构建周期的一部分)自动部署到我的 Kubernetes 集群。詹金斯机器位于与 Kubernetes 集群不同的网络上(而不是许多博客中记录的它的一部分)。 我有一个托管在私人GitHub帐户中的图表存储库。我遵循以下过程:https://hackernoon.com/using-a-private-github-repo-as-helm