Question:

Flume HDFS sink keeps creating .tmp files

鄂和璧
2023-03-14

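Some of the files written by the HDFS sink are never closed.

Some people say that if the sink process fails because of a problem such as a timeout, it does not try to close the file again.

I have checked the Flume log files, but there are no errors. However, the logs show that in every cycle Flume creates two .tmp files and closes only one of them.

Any suggestions about the configuration would be appreciated. Thanks!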

#Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

#Configure the Kafka Source
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 1000
#a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = 150.2.237.16:6667,150.2.237.17:6667
a1.sources.r1.kafka.topics = 1-sysmaster1-thread
a1.sources.r1.kafka.consumer.group.id = flume_hdfs_consumer

#Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /user/flume/kafka-data/1-sysmaster1-thread/%y%m%d
a1.sinks.k1.hdfs.filePrefix = 1-sysmaster1-thread-%H%M

#Describing sink with the problem of Encoding
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text

#Describing sink with the problem of many hdfs files
### Roll a file after certain amount of events occurs  ###
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 10000
a1.sinks.k1.hdfs.batchSize = 1000

#Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

#Use File channel
#a1.channels.c1.type = file
#a1.channels.c1.checkpointDir = /home/bigdata/flume/checkpoint
#a1.channels.c1.dataDirs = /home/bigdata/flume/data

#Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Flume log:
23 4월 2019 11:47:04,105 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:246)  - Creating /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1147.1555987622865.tmp
23 4월 2019 11:48:03,382 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSDataStream.configure:57)  - Serializer = TEXT, UseRawLocalFileSystem = false
23 4월 2019 11:48:03,457 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:246)  - Creating /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1148.1555987683383.tmp
23 4월 2019 11:48:08,664 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.doClose:438)  - Closing /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1148.1555987683383.tmp
23 4월 2019 11:48:08,689 INFO  [hdfs-k1-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter$7.call:681)  - Renaming /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1148.1555987683383.tmp to /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1148.1555987683383
23 4월 2019 11:48:08,712 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:246)  - Creating /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1148.1555987683384.tmp
23 4월 2019 11:49:03,711 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSDataStream.configure:57)  - Serializer = TEXT, UseRawLocalFileSystem = false
23 4월 2019 11:49:03,806 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:246)  - Creating /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1149.1555987743712.tmp
23 4월 2019 11:49:05,439 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.doClose:438)  - Closing /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1149.1555987743712.tmp
23 4월 2019 11:49:05,460 INFO  [hdfs-k1-call-runner-5] (org.apache.flume.sink.hdfs.BucketWriter$7.call:681)  - Renaming /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1149.1555987743712.tmp to /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1149.1555987743712
23 4월 2019 11:49:05,480 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:246)  - Creating /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1149.1555987743713.tmp
23 4월 2019 11:50:02,354 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSDataStream.configure:57)  - Serializer = TEXT, UseRawLocalFileSystem = false
23 4월 2019 11:50:02,387 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:246)  - Creating /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1150.1555987802355.tmp
23 4월 2019 11:50:03,015 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.doClose:438)  - Closing /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1150.1555987802355.tmp
23 4월 2019 11:50:03,032 INFO  [hdfs-k1-call-runner-4] (org.apache.flume.sink.hdfs.BucketWriter$7.call:681)  - Renaming /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1150.1555987802355.tmp to /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1150.1555987802355
[root@sd-mds-01 logs]# hdfs dfs -ls /user/flume/kafka-data/1-sysmaster1-thread/190423/
Found 163 items
-rw-r--r--   3 root hdfs    1781109 2019-04-23 11:20 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1120.1555986001199
-rw-r--r--   3 root hdfs     212118 2019-04-23 11:20 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1120.1555986001200.tmp
-rw-r--r--   3 root hdfs    1777270 2019-04-23 11:21 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1121.1555986062575
-rw-r--r--   3 root hdfs      54451 2019-04-23 11:21 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1121.1555986062576.tmp
-rw-r--r--   3 root hdfs    1781741 2019-04-23 11:22 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1122.1555986123181
-rw-r--r--   3 root hdfs      34735 2019-04-23 11:22 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1122.1555986123182.tmp
-rw-r--r--   3 root hdfs    1782315 2019-04-23 11:23 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1123.1555986183768
-rw-r--r--   3 root hdfs      28682 2019-04-23 11:23 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1123.1555986183769.tmp
-rw-r--r--   3 root hdfs    1782437 2019-04-23 11:24 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1124.1555986244304
-rw-r--r--   3 root hdfs     211547 2019-04-23 11:24 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1124.1555986244305.tmp
-rw-r--r--   3 root hdfs    1782775 2019-04-23 11:25 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1125.1555986302891
-rw-r--r--   3 root hdfs      35918 2019-04-23 11:25 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1125.1555986302892.tmp
-rw-r--r--   3 root hdfs    1781180 2019-04-23 11:26 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1126.1555986362097
-rw-r--r--   3 root hdfs      30967 2019-04-23 11:26 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1126.1555986362098.tmp
-rw-r--r--   3 root hdfs    1781682 2019-04-23 11:27 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1127.1555986423432
-rw-r--r--   3 root hdfs      41381 2019-04-23 11:27 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1127.1555986423433.tmp
-rw-r--r--   3 root hdfs    1781710 2019-04-23 11:28 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1128.1555986483928
-rw-r--r--   3 root hdfs     211240 2019-04-23 11:28 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1128.1555986483929.tmp
-rw-r--r--   3 root hdfs    1785456 2019-04-23 11:29 /user/flume/kafka-data/1-sysmaster1-thread/190423/1-sysmaster1-thread-1129.1555986542442

1 answer

暨弘毅
2023-03-14

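I found the problem.

I had set the configuration to roll files using a time-based prefix:

a1.sinks.k1.hdfs.filePrefix = 1-sysmaster1-thread-%H%M

As the resulting paths show, the file that is still open at the end of each minute is never rolled properly.

After I removed that line from the configuration file, it worked fine.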
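
A minimal sketch of the sink section after this fix, under stated assumptions. With %H%M in hdfs.filePrefix the resolved file name changes every minute, so the sink opens a new file each minute. Because rollInterval and rollSize are 0, the previous file can only be closed by rollCount, which it never reaches once events stop arriving for it. The answer removed the filePrefix line entirely, which makes the sink fall back to its default prefix (FlumeData); the static prefix below is an alternative chosen here for illustration, not the author's exact configuration. The hdfs.idleTimeout line is also an addition that the answer does not mention, and its value of 60 seconds is arbitrary.

```properties
# Sketch only: sink settings from the question with the time escapes
# removed from the file prefix.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /user/flume/kafka-data/1-sysmaster1-thread/%y%m%d

# Assumption: a static prefix instead of deleting the line.
# Deleting it (as the answer did) gives the default prefix "FlumeData".
a1.sinks.k1.hdfs.filePrefix = 1-sysmaster1-thread

# Unchanged roll settings: roll only after 10000 events.
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 10000
a1.sinks.k1.hdfs.batchSize = 1000

# Assumption, not part of the answer: close a file that has received
# no events for 60 seconds (the default 0 disables idle closing).
a1.sinks.k1.hdfs.idleTimeout = 60
```

The %y%m%d escape in hdfs.path still switches to a new file at each day change, so without an idle timeout one .tmp file per day may stay open until the agent stops. That is why the sketch adds hdfs.idleTimeout.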
