Question:

Flink: Cannot create an HDFS savepoint on another machine

路金鑫
2023-03-14

I am trying to create a savepoint in HDFS with Apache Flink 1.2. Flink runs in a local cluster on my machine; HDFS runs in a virtual machine. I have managed to write to HDFS from a Flink streaming job, but savepoints do not work. My savepoint path is hdfs://hadoop:54310/savepoint/testpoint, which I specified in the UI before submitting the job.

It gives me the following error message (invalid path):

org.apache.flink.client.program.ProgramInvocationException: Failed to submit the job to the job manager
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleJsonRequest(JarRunHandler.java:64)
    at org.apache.flink.runtime.webmonitor.handlers.AbstractJsonRequestHandler.handleRequest(AbstractJsonRequestHandler.java:41)
    at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:98)
    at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:90)
    at org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:44)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
    at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
    at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
    at io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
    at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:159)
    at org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.client.JobExecutionException: JobSubmission failed: Invalid path 'hdfs://hadoop:54310/savepoint/testpoint'.
    at org.apache.flink.runtime.client.JobClient.submitJobDetached(JobClient.java:453)
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleJsonRequest(JarRunHandler.java:62)
    ... 34 more
Caused by: java.lang.IllegalArgumentException: Invalid path 'hdfs://hadoop:54310/savepoint/testpoint'.
    at org.apache.flink.runtime.checkpoint.savepoint.SavepointStore.createFsInputStream(SavepointStore.java:182)
    at org.apache.flink.runtime.checkpoint.savepoint.SavepointStore.loadSavepoint(SavepointStore.java:131)
    at org.apache.flink.runtime.checkpoint.savepoint.SavepointLoader.loadAndValidateSavepoint(SavepointLoader.java:64)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1359)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1341)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1341)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Configuration inside the job:

import java.util.concurrent.TimeUnit

import org.apache.flink.api.common.restartstrategy.RestartStrategies
import org.apache.flink.api.common.time.Time
import org.apache.flink.runtime.state.filesystem.FsStateBackend
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

val env = StreamExecutionEnvironment.getExecutionEnvironment

env.enableCheckpointing(60 * 1000)

// Set up the state backend; lioncubConf.hdfsCheckpoint is
// hdfs://hadoop:54310/checkpoint
env.setStateBackend(new FsStateBackend(lioncubConf.hdfsCheckpoint))

// Allow at most 3 failures within a 5-minute interval,
// waiting 10 seconds between restart attempts
env.setRestartStrategy(RestartStrategies.failureRateRestart(
    3, Time.of(5, TimeUnit.MINUTES), Time.of(10, TimeUnit.SECONDS)))

I don't know what I am doing wrong. Inside the job I can write to HDFS just fine (without savepoints), so HDFS itself is not the problem. I also tried pointing at the directory hdfs://hadoop:54310/savepoint, but that did not work either. Any ideas? What is wrong with that path?

1 answer

白信鸿
2023-03-14

The problem is that the savepoint path you passed when submitting the job does not exist. That path is used to restore the job from an existing savepoint, so check whether a savepoint is actually stored at hdfs://hadoop:54310/savepoint/testpoint.
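One quick way to check is to list the path with the HDFS shell (a sketch; the hostname and port are taken from the question, so adjust them to your cluster):

```shell
# List the savepoint directory; an empty or missing path means
# no savepoint has been created there yet.
hdfs dfs -ls hdfs://hadoop:54310/savepoint/
hdfs dfs -ls hdfs://hadoop:54310/savepoint/testpoint
```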

In order to trigger a savepoint, you have to use the CLI and call bin/flink savepoint <jobId> [targetDirectory], where targetDirectory is an optional parameter. See the savepoints guide for details.
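A minimal sketch of the two steps, assuming a running job whose ID you get from bin/flink list (the job ID and the resulting savepoint file name below are placeholders):

```shell
# 1. Trigger a savepoint for a running job, writing it into HDFS:
bin/flink savepoint <jobId> hdfs://hadoop:54310/savepoint/

# 2. Later, resume a job from that savepoint with the -s flag,
#    using the path printed by the savepoint command:
bin/flink run -s hdfs://hadoop:54310/savepoint/savepoint-xxxx myJob.jar
```

The path you enter at submission time only tells Flink where to restore from; it never creates a savepoint, which is why submitting with a non-existent path fails with "Invalid path".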
