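I am trying to run an Apache Beam pipeline locally (macOS Sierra) using the I/O API that Beam provides for Google BigQuery.
As suggested in the Beam Python quickstart, I set up my environment with Virtualenv, and I can run the wordcount.py example. I can also run a custom pipeline correctly using beam.Create and beam.ParDo.
But I cannot run a pipeline that uses BigQuery I/O. Any idea what I am doing wrong?
The Python script is below.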
import apache_beam as beam
from apache_beam.utils.pipeline_options import PipelineOptions
from apache_beam.io import WriteToText


class MyDoFn(beam.DoFn):
    def process(self, element):
        # Emit each row unchanged; process() must produce an iterable of outputs.
        yield element


def run():
    opts = {
        'project': 'gc-project-name'
    }
    p = beam.Pipeline(options=PipelineOptions(**opts))
    input_query = "SELECT name FROM `gc-project-name.dataset_name.table_name`"
    (p
     | beam.io.Read(beam.io.BigQuerySource(query=input_query))
     | beam.ParDo(MyDoFn())
     | beam.io.WriteToText('output.txt')
    )
    result = p.run()
    result.wait_until_finish()


if __name__ == '__main__':
    run()
When I run it, I get the following error.
WARNING:root:Task failed: Traceback (most recent call last):
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/runners/direct/executor.py", line 300, in __call__
    result = evaluator.finish_bundle()
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/runners/direct/transform_evaluator.py", line 208, in finish_bundle
    with self._source.reader() as reader:
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 590, in __enter__
    self.client = BigQueryWrapper(client=self.test_bigquery_client)
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 682, in __init__
    self.client = client or bigquery.BigqueryV2(
AttributeError: 'module' object has no attribute 'BigqueryV2'
Traceback (most recent call last):
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/runners/direct/executor.py", line 300, in __call__
    result = evaluator.finish_bundle()
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/runners/direct/transform_evaluator.py", line 208, in finish_bundle
    with self._source.reader() as reader:
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 590, in __enter__
    self.client = BigQueryWrapper(client=self.test_bigquery_client)
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 682, in __init__
    self.client = client or bigquery.BigqueryV2(
AttributeError: 'module' object has no attribute 'BigqueryV2'
WARNING:root:A task failed with exception.
 'module' object has no attribute 'BigqueryV2'
Traceback (most recent call last):
  File "frombigquery.py", line 54, in <module>
    run()
  File "frombigquery.py", line 51, in run
    result.wait_until_finish()
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/runners/direct/direct_runner.py", line 157, in wait_until_finish
    self._executor.await_completion()
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/runners/direct/executor.py", line 335, in await_completion
    self._executor.await_completion()
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/runners/direct/executor.py", line 300, in __call__
    result = evaluator.finish_bundle()
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/runners/direct/transform_evaluator.py", line 208, in finish_bundle
    with self._source.reader() as reader:
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 590, in __enter__
    self.client = BigQueryWrapper(client=self.test_bigquery_client)
  File "/Users/localuser/Virtualenvs/abeam/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 682, in __init__
    self.client = client or bigquery.BigqueryV2(
AttributeError: 'module' object has no attribute 'BigqueryV2'
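When you install the Apache Beam Python SDK, you must add an extra option so that the dependencies related to Google Cloud Platform are installed as well.
pip install dist/apache-beam-*.tar.gz[gcp]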
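After reinstalling, one way to confirm that the GCP dependencies were actually picked up is to check whether the generated BigQuery client module exposes `BigqueryV2`, the attribute the traceback reports as missing. The following is only a rough sketch: the helper `module_has_attr` is a hypothetical name, and the module path `apache_beam.io.gcp.internal.clients.bigquery` is an assumption inferred from the traceback, so it may differ in other SDK versions.

```python
import importlib


def module_has_attr(module_name, attr_name):
    """Return True if the module can be imported and exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its parent packages) is not installed.
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # Assumed module path, inferred from the traceback; adjust for your SDK version.
    client_module = "apache_beam.io.gcp.internal.clients.bigquery"
    if module_has_attr(client_module, "BigqueryV2"):
        print("BigQuery client found: the GCP extras appear to be installed.")
    else:
        print("BigqueryV2 not found: try reinstalling the SDK with the [gcp] extra.")
```

If the check still reports the attribute as missing, it may be worth confirming that pip installed into the same virtualenv that runs the pipeline.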