I am using Ubuntu 14.04, and my configuration file is as follows:
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = Q5JF4gVmrahNk93C913GjgJgB
TwitterAgent.sources.Twitter.consumerSecret = GFM6F0QuqEHn1eKpL1k4CHwdecEp626xLepajp9CAbtRBxEVCC
TwitterAgent.sources.Twitter.accessToken = 152956374-hTFXO9g1RBSn1yikmi2mQClilZe2PqnyqphFQh9t
TwitterAgent.sources.Twitter.accessTokenSecret = SODGEbkQvHYzZMtPsWoI2k9ZKiAd7q21ebtG3SNMu3Y0a
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientiest, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:9000/user/flume/tweets/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
#Number of events written to file before it is flushed to HDFS (default: 100)
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
#File size to trigger roll, in bytes (0: never roll based on file size)
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
#Number of events written to file before it is rolled (0 = never roll based on number of events)
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.channels.MemChannel.type = memory
#The maximum number of events stored in the channel
TwitterAgent.channels.MemChannel.capacity = 10000
#The maximum number of events the channel will take from a source or give to a sink per transaction
TwitterAgent.channels.MemChannel.transactionCapacity = 100
I am running the following command in my terminal:
hadoopuser@Hotshot:/usr/lib/flume-ng/apache-flume-1.4.0-bin/bin$ ./flume-ng agent --conf ./conf/ -f /usr/lib/flume-ng/apache-flume-1.4.0-bin/conf/flume.conf -Dflume.root.logger=DEBUG,console -n TwitterAgent
I am getting the following error:
14/10/10 17:24:12 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS started
14/10/10 17:24:12 INFO twitter4j.TwitterStreamImpl: Establishing connection.
14/10/10 17:24:22 INFO twitter4j.TwitterStreamImpl: Connection established.
14/10/10 17:24:22 INFO twitter4j.TwitterStreamImpl: Receiving status stream.
14/10/10 17:24:22 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
14/10/10 17:24:22 INFO hdfs.BucketWriter: Creating hdfs://localhost:9000/user/flume/tweets//FlumeData.1412942062375.tmp
14/10/10 17:24:22 ERROR hdfs.HDFSEventSink: process failed
java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RecoverLeaseRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2570)
at java.lang.Class.privateGetPublicMethods(Class.java:2690)
at java.lang.Class.privateGetPublicMethods(Class.java:2700)
at java.lang.Class.getMethods(Class.java:1467)
at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:426)
at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:323)
at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:672)
at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:592)
at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:244)
at java.lang.reflect.WeakCache.get(WeakCache.java:141)
at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:455)
at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:738)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:92)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:366)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:262)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:153)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:602)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:547)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:139)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:226)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:220)
at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:536)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:160)
at org.apache.flume.sink.hdfs.BucketWriter.access$1000(BucketWriter.java:56)
at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:533)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Is there a compatibility issue between these versions of Apache Flume and Apache Hadoop? I have not found any good resource to help me install Apache Flume 1.5.1. If there is no compatibility issue, how should I get the tweets into my HDFS?
Hadoop uses protobuf 2.5:
hadoop-project/pom.xml: <protobuf.version>2.5.0</protobuf.version>
Code generated with protobuf 2.5 is binary-incompatible with older protobuf libraries. Unfortunately, the current stable Flume release, 1.4, ships protobuf 2.4.1. You can work around this by moving the protobuf and Guava jars out of Flume's lib directory, as sketched below.
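A minimal sketch of that workaround, assuming Flume lives under /usr/lib/flume-ng/apache-flume-1.4.0-bin as in the command above; the exact jar file names vary by distribution, so list them first, and the lib.removed directory name is just an arbitrary choice:

# list the protobuf and guava jars that Flume ships (exact versions vary)
ls /usr/lib/flume-ng/apache-flume-1.4.0-bin/lib | grep -E 'protobuf|guava'
# park them outside the lib directory rather than deleting them
mkdir /usr/lib/flume-ng/apache-flume-1.4.0-bin/lib.removed
mv /usr/lib/flume-ng/apache-flume-1.4.0-bin/lib/protobuf-java-*.jar \
   /usr/lib/flume-ng/apache-flume-1.4.0-bin/lib/guava-*.jar \
   /usr/lib/flume-ng/apache-flume-1.4.0-bin/lib.removed/

With those jars out of the way, the HDFS sink resolves protobuf 2.5 and Guava from the Hadoop classpath instead; restarting the agent with the same flume-ng command should make the VerifyError disappear, and keeping the old jars in lib.removed lets you roll back if needed.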
I keep getting this log message all day long:
2016-10-12 21:32:05,696 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurat
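That DEBUG message is not an error: Flume's PollingPropertiesFileConfigurationProvider runs a FileWatcherRunnable that re-reads the configuration file at a fixed interval (30 seconds by default) and logs each poll at DEBUG level. If it clutters the console, start the agent at a higher log level, for example:

./flume-ng agent --conf ./conf/ -f /usr/lib/flume-ng/apache-flume-1.4.0-bin/conf/flume.conf -Dflume.root.logger=INFO,console -n TwitterAgent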