Question:

Apache Beam exception when running the WordCount example

濮阳宏硕
2023-03-14

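I believe I followed the steps in the documentation closely, but I still get this exception. (The only difference is that I am running it from Eclipse J2EE, but I would not expect that to matter, would it?)

Code: (I did not write this; it comes from the Beam project examples.) I assume you have to specify a Google Cloud Platform project and supply the right credentials to access it. However, I could not find anywhere in this example project to configure that.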

  public static void main(String[] args) {
    // Create a PipelineOptions object. This object lets us set various execution
    // options for our pipeline, such as the runner you wish to use. This example
    // will run with the DirectRunner by default, based on the class path configured
    // in its dependencies.
    PipelineOptions options = PipelineOptionsFactory.create();

    // Create the Pipeline object with the options we defined above.
    Pipeline p = Pipeline.create(options);

    // Apply the pipeline's transforms.

    // Concept #1: Apply a root transform to the pipeline; in this case, TextIO.Read to read a set
    // of input text files. TextIO.Read returns a PCollection where each element is one line from
    // the input text (a set of Shakespeare's texts).

    // This example reads a public data set consisting of the complete works of Shakespeare.
    p.apply(TextIO.Read.from("gs://apache-beam-samples/shakespeare/*"))
    .....
  )

Exception:

Exception in thread "main" java.lang.IllegalStateException: Failed to validate gs://apache-beam-samples/shakespeare/*
at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:309)
at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:205)
at org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
at org.apache.beam.runners.direct.DirectRunner.apply(DirectRunner.java:296)
at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:388)
at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:302)
at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:152)
at google.dataflow.beam.example.MinimalWordCount.main(MinimalWordCount.java:77)
Caused by: java.io.IOException: Unable to match files in bucket apache-beam-samples, prefix shakespeare/ against pattern shakespeare/[^/]*
at org.apache.beam.sdk.util.GcsUtil.expand(GcsUtil.java:234)
at org.apache.beam.sdk.util.GcsIOChannelFactory.match(GcsIOChannelFactory.java:53)
at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:304)
... 8 more
Caused by: com.google.api.client.http.HttpResponseException: 400 Bad Request
{
  "error" : "invalid_grant"
}
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1070)
    at com.google.auth.oauth2.UserCredentials.refreshAccessToken(UserCredentials.java:207)
    at com.google.auth.oauth2.OAuth2Credentials.refresh(OAuth2Credentials.java:149)
    at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:135)
    at com.google.auth.http.HttpCredentialsAdapter.initialize(HttpCredentialsAdapter.java:96)
    at com.google.cloud.hadoop.util.ChainingHttpRequestInitializer.initialize(ChainingHttpRequestInitializer.java:52)
    at com.google.api.client.http.HttpRequestFactory.buildRequest(HttpRequestFactory.java:93)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:300)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
    at com.google.cloud.hadoop.util.ResilientOperation$AbstractGoogleClientRequestExecutor.call(ResilientOperation.java:166)
    at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:66)
    at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:103)
    at org.apache.beam.sdk.util.GcsUtil.expand(GcsUtil.java:227)
    ... 10 more

1 answer

羊舌兴德
2023-03-14

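If you are on Windows, try running it from the command prompt. Go to the folder that contains the pom.xml file and open cmd there. Then run the command with the appropriate arguments.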

mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount -Dexec.args=" --output=counts" -Pdirect-runner

If you want to run it with an input file, create a txt file with any name and put it in the folder that contains the pom. Then run the following command.

mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount -Dexec.args="--inputFile=YOURFILENAME.txt --output=counts" -Pdirect-runner

Hope this works. I am looking into your issue.
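
On the exception itself: the innermost cause is `invalid_grant`, raised from `UserCredentials.refreshAccessToken`, which suggests the GCS client picked up stored user credentials whose refresh token was rejected. Reading the input from a local file should sidestep GCS. If you do want to read from `gs://`, a first step is to find out which credentials file is in play. Below is a minimal sketch in plain Java (no Beam or Google libraries) that reports the two places Application Default Credentials are normally read from: the `GOOGLE_APPLICATION_CREDENTIALS` environment variable and the gcloud well-known file. The class name `AdcLocator` and its methods are made up for illustration, and whether your Beam version consults exactly these locations is an assumption.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of Beam or any Google library.
class AdcLocator {

    private static final String ADC_FILE = "application_default_credentials.json";

    // Well-known file written by the gcloud CLI:
    //   Windows:      %APPDATA%\gcloud\application_default_credentials.json
    //   Linux/macOS:  ~/.config/gcloud/application_default_credentials.json
    static Path wellKnownAdcPath(String osName, Map<String, String> env, String userHome) {
        String appData = env.get("APPDATA");
        if (osName.toLowerCase(Locale.ROOT).contains("windows") && appData != null) {
            return Paths.get(appData, "gcloud", ADC_FILE);
        }
        return Paths.get(userHome, ".config", "gcloud", ADC_FILE);
    }

    // The environment variable, when set, takes precedence over the well-known file.
    static String describe(String osName, Map<String, String> env, String userHome) {
        String explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (explicit != null && !explicit.isEmpty()) {
            return "GOOGLE_APPLICATION_CREDENTIALS -> " + explicit + status(Paths.get(explicit));
        }
        Path wellKnown = wellKnownAdcPath(osName, env, userHome);
        return "well-known file -> " + wellKnown + status(wellKnown);
    }

    private static String status(Path p) {
        return Files.exists(p) ? " (exists)" : " (missing)";
    }

    public static void main(String[] args) {
        // Report the credential source for the current machine and user.
        System.out.println(describe(
                System.getProperty("os.name"), System.getenv(), System.getProperty("user.home")));
    }
}
```

If the reported file exists but is stale, refreshing it with `gcloud auth application-default login` is a reasonable thing to try before rerunning the pipeline against `gs://`.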
