Question:

How can I change the Apache Flume configuration file from Java code?

蒋曾笑

2023-03-14

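I am currently working on a big data project that performs sentiment analysis on Twitter's trending topics. I followed Cloudera's tutorial and learned how to send tweets to Hadoop through Flume.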

http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data-with-hadoop/

flume.conf:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'TwitterAgent'

TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey =
TwitterAgent.sources.Twitter.consumerSecret =
TwitterAgent.sources.Twitter.accessToken =
TwitterAgent.sources.Twitter.accessTokenSecret =
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientiest, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing

TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://hadoop1:8020/user/flume/tweets/%Y/%m/%d/%H/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000

TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100

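Now, to extend this to my application, I need the keywords section of the Flume configuration file to contain the trending topics. I have written Java code that fetches the trending topics, but I don't know how to connect that code to the Flume configuration file, or how to generate a new file whose keywords section holds the real-time trending topics. I have searched the web a lot. Since I am a beginner in this field, any information, or at least some alternatives, would be a great help.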

1 answer

蒋寒

2023-03-14

A very interesting question!

I agree with @cricket_007's comment: editing the configuration without restarting the Flume agent is not possible.

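I can't say much, because I haven't seen the Java code you use to fetch the trending-topic keywords. Based on the information you have given, though, I can think of an alternative (or perhaps I should say a workaround), although I haven't tried it myself.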

You could modify the TwitterSource.java class like this:

public void configure(Context context) {
  consumerKey = context.getString(TwitterSourceConstants.CONSUMER_KEY_KEY);
  consumerSecret = context.getString(TwitterSourceConstants.CONSUMER_SECRET_KEY);
  accessToken = context.getString(TwitterSourceConstants.ACCESS_TOKEN_KEY);
  accessTokenSecret = context.getString(TwitterSourceConstants.ACCESS_TOKEN_SECRET_KEY);

  // MODIFY THE FOLLOWING PORTION
  String keywordString = context.getString(TwitterSourceConstants.KEYWORDS_KEY, "");
  if (keywordString.trim().length() == 0) {
    keywords = new String[0];
  } else {
    keywords = keywordString.split(",");
    for (int i = 0; i < keywords.length; i++) {
      keywords[i] = keywords[i].trim();
    }
  }
  // UNTIL THIS POINT

  ConfigurationBuilder cb = new ConfigurationBuilder();
  cb.setOAuthConsumerKey(consumerKey);
  cb.setOAuthConsumerSecret(consumerSecret);
  cb.setOAuthAccessToken(accessToken);
  cb.setOAuthAccessTokenSecret(accessTokenSecret);
  cb.setJSONStoreEnabled(true);
  cb.setIncludeEntitiesEnabled(true);

  twitterStream = new TwitterStreamFactory(cb.build()).getInstance();
}

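In the portion marked by the comments above, where the keywordString variable is initialized, you can call your own Java code (I assume it is a method that returns a comma-separated string of keywords) instead of reading the value from the context, which comes from flume.conf. Just remove the context.getString() part.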
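To make the marked portion concrete, here is a minimal standalone sketch of the keyword handling with the source of the string swapped out. The class name KeywordResolver and the method fetchTrendingTopics() are hypothetical: the latter is only a stand-in for your own trending-topics code, which I have not seen. Unlike the original loop, this version also drops empty entries and tolerates null.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: builds the keyword array from a method call instead of flume.conf. */
final class KeywordResolver {

    /** Splits a comma-separated string into trimmed, non-empty keywords. */
    static String[] parseKeywords(String keywordString) {
        if (keywordString == null || keywordString.trim().isEmpty()) {
            return new String[0];
        }
        List<String> result = new ArrayList<>();
        for (String part : keywordString.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result.toArray(new String[0]);
    }

    /** Hypothetical placeholder for the asker's trending-topics code. */
    static String fetchTrendingTopics() {
        return "hadoop, big data, flume";
    }

    public static void main(String[] args) {
        // Inside configure(), this would replace the context.getString(...) lookup.
        String[] keywords = parseKeywords(fetchTrendingTopics());
        for (String keyword : keywords) {
            System.out.println(keyword);
        }
    }
}
```

In TwitterSource.configure(), the marked portion would then shrink to a single assignment along the lines of keywords = KeywordResolver.parseKeywords(fetchTrendingTopics()). Note that the keywords would be fetched only once, when the source is configured.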

Apart from that, just remove the following line from flume.conf:

TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientiest, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
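If you would rather keep TwitterSource.java untouched and regenerate the keywords line instead, the file can be rewritten from Java and the agent restarted afterwards. This is a rough sketch, not something I have run against a live agent: the class name FlumeConfKeywordUpdater and its methods are hypothetical, and it works line by line so that comments and property order in the file are preserved.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Sketch only: replaces the keywords property in a flume.conf-style file. */
final class FlumeConfKeywordUpdater {

    // Property name taken from the flume.conf in the question.
    static final String KEYWORDS_KEY = "TwitterAgent.sources.Twitter.keywords";

    /** Returns a copy of the lines with the first keywords property replaced, or appended if absent. */
    static List<String> replaceKeywords(List<String> lines, List<String> keywords) {
        String newLine = KEYWORDS_KEY + " = " + String.join(", ", keywords);
        List<String> out = new ArrayList<>();
        boolean replaced = false;
        for (String line : lines) {
            String trimmed = line.trim();
            // Match "key =" or "key=", but not a longer key that merely shares the prefix.
            boolean isKeywordsLine = trimmed.startsWith(KEYWORDS_KEY)
                    && trimmed.substring(KEYWORDS_KEY.length()).trim().startsWith("=");
            if (!replaced && isKeywordsLine) {
                out.add(newLine);
                replaced = true;
            } else {
                out.add(line);
            }
        }
        if (!replaced) {
            out.add(newLine);
        }
        return out;
    }

    /** Reads the file, swaps the keywords line, and writes the result back in place. */
    static void updateFile(Path conf, List<String> keywords) throws IOException {
        List<String> updated = replaceKeywords(Files.readAllLines(conf), keywords);
        Files.write(conf, updated);
    }
}
```

A call such as FlumeConfKeywordUpdater.updateFile(Paths.get("flume.conf"), trendingTopics) would run each time your trending-topics code produces a new list. The path and the trendingTopics variable are placeholders.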

I hope this helps.
