当前位置: 首页 > 知识库问答 >
问题:

在Azure上使用Blob存储配置hadoop

孟海
2023-03-14

我们正在Linux上运行的AzureV2云上测试hadoop HA集群。我们正在尝试切换到Azure BLOB存储。我们不确定应该如何使用Blob存储配置名称节点。我们收到以下错误:

2015-12-22 13:05:50,193 INFO  ha.StandbyCheckpointer (StandbyCheckpointer.java:start(129)) - Starting standby checkpoint thread...
Checkpointing active NN at http://bd-azure-qa-nn2:50070
Serving checkpoints at http://bd-azure-qa-nn1:50070
2015-12-22 13:07:50,240 INFO  ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(269)) - Triggering log roll on remote NameNode bd-azure-qa-nn2/10.0.0.7:8020
2015-12-22 13:07:51,387 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:52,391 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:53,400 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:54,416 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:55,425 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:56,450 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:57,456 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:58,462 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:07:59,473 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:00,478 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:01,482 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:02,490 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:03,501 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:04,515 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:05,520 INFO  ipc.Client (Client.java:handleConnectionFailure(858)) - Retrying connect to server: bd-azure-qa-nn2/10.0.0.7:8020. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-12-22 13:08:05,966 WARN  ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(274)) - Unable to trigger a roll of the active NN
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category JOURNAL is not supported in state standby
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1719)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1352)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:6339)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:933)
    at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:139)
    at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:11214)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

    at org.apache.hadoop.ipc.Client.call(Client.java:1468)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy14.rollEditLog(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:145)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
   <property>
      <name>fs.AbstractFileSystem.wasb.impl</name>
      <value>org.apache.hadoop.fs.azure.Wasb</value>
    </property>

    <property>
      <name>fs.azure.account.key.OUR_STORAGE_ACCOUNT.blob.core.windows.net</name>
      <value>"OUR_KEY"</value>
    </property>

    <property>
      <name>fs.defaultFS</name>
      <value>wasb://blob-hdfs@OUR_STORAGE_ACCOUNT.blob.core.windows.net</value>
      <final>true</final>
    </property>

    <property>
      <name>fs.azure.page.blob.dir</name>
      <value>/datadir</value>
    </property>

    <property>
      <name>fs.azure.selfthrottling.read.factor</name>
      <value>1.000000</value>
    </property>

    <property>
      <name>fs.azure.selfthrottling.write.factor</name>
      <value>1.000000</value>
   <property>
<!--    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenodeha</value>
  </property> -->

我们不确定名称节点设置。原始设置中的两个name节点可能会过度使用,因为基础BLOB应该处理所有复制等。

谁能澄清一下吗?

共有1个答案

茅涵映
2023-03-14

以下属性是我们在oreder中的hdfs-site.xml中缺少的。

  <property>
     <name>dfs.datanode.https.address</name>
     <value>0.0.0.0:50475</value>
  </property>

  <property>
    <name>dfs.namenode.https-address.namenodeha.nn1</name>
    <value>bd-azure-qa-nn1:50470</value>
  </property>

  <property>
    <name>dfs.namenode.https-address.namenodeha.nn2</name>
    <value>bd-azure-qa-nn2:50470</value>
  </property>


    <property>
    <name>dfs.journalnode.https-address</name>
    <value>0.0.0.0:8481</value>                                 :
  </property>

当然可以使用适当的主机名。

 类似资料:
  • 我在尝试将文件上载到blob存储时遇到此错误。在本地主机上运行和在Azure函数中运行时都会出现错误。 我的连接字符串如下所示:DefaultEndpoint sProtocol=https; AcCountName=xxx; AcCountKey=xxx; Endpoint Suffix=core.windows.net 身份验证信息的格式不正确。检查授权标头的值。时间:2021 10月14日1

  • 我有一个用例,需要以Json格式将调查结果从web应用程序上传到azure blob存储。根据调查问题判断,这些json对象将很小,甚至不会接近1MB。我一直在阅读C#中的azure blob客户端并进行实验。我实现了一个工作单元和存储库设计模式,这意味着每个CRUD操作都会导致与azure存储的连接。我是否应该考虑并行操作或批量调用以降低成本,提高性能和吞吐量?有很多关于并行操作的文章,但他们试

  • 在v11SDK. NET中,我能够使用托管标识令牌来访问Azure blob: 现在我想切换到v12 SDK,但不知道如何对BlobServiceClient进行同样的操作。

  • 我可以使用以下C代码将文件上载到azure blob存储, 这里我提供blob存储连接字符串和容器名称。 现在,我可以看到我在下有URL, 问题,我可以编写C代码,使用上面的URL上传文件,而不是使用连接字符串/容器名称吗?

  • 我想使用Python中的Azure函数将JSON数据作为. json文件上传到Azure存储Blob。 因为我使用的是Azure函数,而不是实际的服务器,所以我不想(也可能无法)在本地内存中创建一个临时文件,并使用Azure blob存储客户端库v2将该文件上载到Azure blob存储。1对于Python(这里有参考链接)。因此,我想为Azure函数使用输出blob存储绑定(这里有参考链接)。