我试图配置水槽与HDFS作为汇。
这是我的flume.conf文件:
agent1.channels.ch1.type = memory
agent1.sources.avro-source1.channels = ch1
agent1.sources.avro-source1.type = avro
agent1.sources.avro-source1.bind = 0.0.0.0
agent1.sources.avro-source1.port = 41414
agent1.sinks.log-sink1.type = logger
agent1.sinks.hdfs-sink.channel=ch1
agent1.sinks.hdfs-sink.type=hdfs
agent1.sinks.hdfs-sink.hdfs.path=hdfs://localhost:9000/flume/flumehdfs/
agent1.sinks.hdfs-sink.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink.hdfs.writeFormat = Text
agent1.sinks.hdfs-sink.hdfs.batchSize = 1000
agent1.sinks.hdfs-sink.hdfs.rollSize = 0
agent1.sinks.hdfs-sink.hdfs.rollCount = 10000
agent1.sinks.hdfs-sink.hdfs.rollInterval = 600
agent1.channels = ch1
agent1.sources = avro-source1
agent1.sinks = log-sink1 hdfs-sink
我的hadoop版本是:
Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
水槽版本是:
apache-flume-1.4.0
我已将这两个jar文件放在flume/lib目录中
hadoop-0.20.2-core
hadoop-common-0.22.0
我将hadoop common jar放在那里,因为在启动flume代理时出现以下错误:
Unhandled error
java.lang.NoSuchMethodError: org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled()Z
at org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:491)
at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:240)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:418)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:103)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
现在代理开始了。这是启动日志:
logger=DEBUG,console Info: Including Hadoop libraries found via (/home/user/Downloads/hadoop-0.20.2/bin/hadoop) for HDFS access Exception in thread "main" java.lang.NoClassDefFoundError: classpath Caused by: java.lang.ClassNotFoundException: classpath at java.net.URLClassLoader$1.run(URLClassLoader.java:217) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:205) at java.lang.ClassLoader.loadClass(ClassLoader.java:321) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) at java.lang.ClassLoader.loadClass(ClassLoader.java:266) Could not find the main class: classpath. Program will exit. + exec /usr/lib/jvm/default-java/bin/java -Xmx20m -Dflume.root.logger=DEBUG,console -cp '/home/user/Downloads/apache-flume-1.4.0-bin/conf:/home/user/Downloads/apache-flume-1.4.0-bin/lib/*' -Djava.library.path=:/home/user/Downloads/hadoop-0.20.2/bin/../lib/native/Linux-amd64-64 org.apache.flume.node.Application -n agent1 -f ./conf/flume.conf 2013-09-04 07:55:22,634 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting 2013-09-04 07:55:22,639 (lifecycleSupervisor-1-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:78)] Configuration provider started 2013-09-04 07:55:22,640 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:./conf/flume.conf for changes 2013-09-04 07:55:22,642 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:./conf/flume.conf 2013-09-04 07:55:22,648 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,648 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1020)] Created context for hdfs-sink: hdfs.fileType 2013-09-04 07:55:22,649 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:loggerSink 2013-09-04 07:55:22,650 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1020)] Created context for loggerSink: type 2013-09-04 07:55:22,650 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,650 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,650 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,650 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,651 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,651 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:log-sink1 2013-09-04 07:55:22,651 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1020)] Created context for log-sink1: type 2013-09-04 07:55:22,651 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:930)] Added sinks: loggerSink Agent: agent 2013-09-04 07:55:22,654 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:930)] Added sinks: log-sink1 hdfs-sink Agent: agent1 2013-09-04 07:55:22,654 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,654 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,654 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:loggerSink 2013-09-04 07:55:22,654 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:hdfs-sink 2013-09-04 07:55:22,655 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:log-sink1 2013-09-04 07:55:22,655 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:313)] Starting validation of configuration for agent: agent, initial-configuration: AgentConfiguration[agent] SOURCES: {seqGenSrc={ parameters:{channels=memoryChannel, type=seq} }} CHANNELS: {memoryChannel={ parameters:{capacity=100, type=memory} }} SINKS: {loggerSink={ parameters:{type=logger, channel=memoryChannel} }} 2013-09-04 07:55:22,661 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateChannels(FlumeConfiguration.java:468)] Created channel memoryChannel 2013-09-04 07:55:22,671 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:674)] Creating sink: loggerSink using LOGGER 2013-09-04 07:55:22,673 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:371)] Post validation configuration for agent AgentConfiguration created without Configuration stubs for which only basic syntactical validation was performed[agent] SOURCES: {seqGenSrc={ parameters:{channels=memoryChannel, type=seq} }} CHANNELS: {memoryChannel={ parameters:{capacity=100, type=memory} }} AgentConfiguration created with Configuration stubs for which full validation was performed[agent] SINKS: {loggerSink=ComponentConfiguration[loggerSink] CONFIG: CHANNEL:memoryChannel } 2013-09-04 07:55:22,673 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:135)] Channels:memoryChannel 2013-09-04 07:55:22,673 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:136)] Sinks loggerSink 2013-09-04 07:55:22,674 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:137)] Sources seqGenSrc 2013-09-04 07:55:22,674 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:313)] Starting validation of configuration for agent: agent1, initial-configuration: AgentConfiguration[agent1] SOURCES: {avro-source1={ parameters:{port=41414, channels=ch1, type=avro, bind=0.0.0.0} }} CHANNELS: {ch1={ parameters:{type=memory} }} SINKS: {hdfs-sink={ parameters:{hdfs.fileType=DataStream, hdfs.path=hdfs://localhost:9000/flume/flumehdfs/, hdfs.batchSize=1000, hdfs.rollInterval=600, hdfs.rollSize=0, hdfs.writeFormat=Text, type=hdfs, hdfs.rollCount=10000, channel=ch1} }, log-sink1={ parameters:{type=logger, channel=ch1} }} 2013-09-04 07:55:22,675 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateChannels(FlumeConfiguration.java:468)] Created channel ch1 2013-09-04 07:55:22,677 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:674)] Creating sink: hdfs-sink using HDFS 2013-09-04 07:55:22,678 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:674)] Creating sink: log-sink1 using LOGGER 2013-09-04 07:55:22,679 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:371)] Post validation configuration for agent1 AgentConfiguration created without Configuration stubs for which only basic syntactical validation was performed[agent1] SOURCES: {avro-source1={ parameters:{port=41414, channels=ch1, type=avro, bind=0.0.0.0} }} CHANNELS: {ch1={ parameters:{type=memory} }} SINKS: {hdfs-sink={ parameters:{hdfs.fileType=DataStream, hdfs.path=hdfs://localhost:9000/flume/flumehdfs/, hdfs.batchSize=1000, hdfs.rollInterval=600, hdfs.rollSize=0, hdfs.writeFormat=Text, type=hdfs, hdfs.rollCount=10000, channel=ch1} }} AgentConfiguration created with Configuration stubs for which full validation was performed[agent1] SINKS: {log-sink1=ComponentConfiguration[log-sink1] CONFIG: CHANNEL:ch1 } 2013-09-04 07:55:22,679 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:135)] Channels:ch1 2013-09-04 07:55:22,679 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:136)] Sinks hdfs-sink log-sink1 2013-09-04 07:55:22,679 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:137)] Sources avro-source1 2013-09-04 07:55:22,680 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:140)] Post-validation flume configuration contains configuration for agents: [agent, agent1] 2013-09-04 07:55:22,680 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:150)] Creating channels 2013-09-04 07:55:22,691 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:40)] Creating instance of channel ch1 type memory 2013-09-04 07:55:22,699 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:205)] Created channel ch1 2013-09-04 07:55:22,700 (conf-file-poller-0) [INFO - org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:39)] Creating instance of source avro-source1, type avro 2013-09-04 07:55:22,733 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:40)] Creating instance of sink: log-sink1, type: logger 2013-09-04 07:55:22,736 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:40)] Creating instance of sink: hdfs-sink, type: hdfs 2013-09-04 07:55:22,985 (conf-file-poller-0) [INFO - org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:493)] Hadoop Security enabled: false 2013-09-04 07:55:22,989 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:119)] Channel ch1 connected to [avro-source1, log-sink1, hdfs-sink] 2013-09-04 07:55:22,996 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:Avro source avro-source1: { bindAddress: 0.0.0.0, port: 41414 } }} sinkRunners:{hdfs-sink=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@709446e4 counterGroup:{ name:null counters:{} } }, log-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@16ba5c7a counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} } 2013-09-04 07:55:23,011 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel ch1 2013-09-04 07:55:23,064 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:110)] Monitoried counter group for type: CHANNEL, name: ch1, registered successfully. 2013-09-04 07:55:23,064 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:94)] Component type: CHANNEL, name: ch1 started 2013-09-04 07:55:23,065 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink hdfs-sink 2013-09-04 07:55:23,066 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink log-sink1 2013-09-04 07:55:23,068 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:184)] Starting Source avro-source1 2013-09-04 07:55:23,069 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.source.AvroSource.start(AvroSource.java:192)] Starting Avro source avro-source1: { bindAddress: 0.0.0.0, port: 41414 }... 2013-09-04 07:55:23,069 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:110)] Monitoried counter group for type: SINK, name: hdfs-sink, registered successfully. 2013-09-04 07:55:23,069 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:94)] Component type: SINK, name: hdfs-sink started 2013-09-04 07:55:23,078 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] Polling sink runner starting 2013-09-04 07:55:23,079 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] Polling sink runner starting 2013-09-04 07:55:23,458 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:110)] Monitoried counter group for type: SOURCE, name: avro-source1, registered successfully. 2013-09-04 07:55:23,462 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:94)] Component type: SOURCE, name: avro-source1 started 2013-09-04 07:55:23,464 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.source.AvroSource.start(AvroSource.java:217)] Avro source avro-source1 started.
但是当一些事件发生时,下面的错误出现在水槽日志中,并且没有任何东西被写入hdfs。
ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:422)] process failed java.lang.NoSuchMethodError: org.apache.hadoop.util.Shell.getGROUPS_COMMAND()[Ljava/lang/String; at org.apache.hadoop.security.UnixUserGroupInformation.getUnixGroups(UnixUserGroupInformation.java:345) at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:264) at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:300) at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:192) at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:170) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1792) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:76) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1826) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1808) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:265) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:190) at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:226) at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:220) at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:536) at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:160) at org.apache.flume.sink.hdfs.BucketWriter.access$1000(BucketWriter.java:56) at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:533) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679)
我缺少一些配置或jar文件?
我最近正在对一个类似的(虽然不是完全相同的)问题进行故障排除,我想起了两个可能对您有所帮助的解决方案。
首先,您运行的是一个看起来非常旧的Hadoop版本。它目前是(稳定的)版本1.2.1,0.20.2于2010年2月发布。我记得我遇到过一个线程,其中一个用户在0.2x.x上遇到了类似的问题,建议他升级。
通过安装正确的java版本,我的问题最终得到了解决。我认为JDK1.6或更高版本是必需的(至少对于更新版本的Hadoop是如此)。对于CentOS,“yum安装java-1.6.0-openjdk-devel”完美地修复了我的问题。
很抱歉我现在不能提供更多。我会试着找到我之前读过的相关文章,然后再回复,但同时也许这会给你一个开始的地方。如果没有别的,请回复您的java版本,也许这将有助于进一步的故障排除。
我正在尝试使用hdfs水槽运行水槽。hdfs在不同的机器上正常运行,我甚至可以与水槽机器上的hdfs交互,但是当我运行水槽并向其发送事件时,我收到以下错误: 同样,一致性不是问题,因为我可以使用hadoop命令行与hdfs交互(水槽机不是datanode)。最奇怪的是,在杀死水槽后,我可以看到tmp文件是在hdfs中创建的,但它是空的(扩展名仍然是. tmp)。 关于为什么会发生这种情况的任何想法
我想使用 flume 将数据从 hdfs 目录传输到 hdfs 中的目录,在此传输中,我想应用处理形态线。 例如:我的来源是 我的水槽是 有水槽可能吗? 如果是,源水槽的类型是什么?
我遇到了Flume的问题(Cloudera CDH 5.3上的1.5): 我想做的是:每5分钟,大约20个文件被推送到假脱机目录(从远程存储中抓取)。每个文件包含多行,每行是一个日志(在JSON中)。文件大小在10KB到1MB之间。 当我启动代理时,所有文件都被成功推送到HDFS。1分钟后(这是我在flume.conf中设置的),文件被滚动(删除. tmp后缀并关闭)。 但是,当在假脱机目录中找到
我们可以为HDFS Sink添加分隔符吗?写入文件时,我们如何添加记录分隔符? 以下是配置:-
我试图将FLUME与HDFS集成,我的FLUME配置文件是 我的核心站点文件是 当我尝试运行flume代理时,它正在启动,并且能够从nc命令中读取,但是在写入hdfs时,我得到了下面的异常。我尝试使用< code > Hadoop DFS admin-safe mode leave 在安全模式下启动,但仍然出现以下异常。 如果在任何属性文件中配置了错误,请告诉我,以便它可以工作。 另外,如果我为此
https://cwiki.apache.org/confluence/display/FLUME/Getting 开始的页面说 HDFS sink 支持追加,但我无法找到有关如何启用它的任何信息,每个示例都在滚动文件上。因此,如果可能的话,我将不胜感激有关如何将水槽附加到现有文件的任何信息) 使现代化 可以将所有滚动属性设置为0,这将使flume写入单个文件,但它不会关闭文件,新记录对其他进程不