Question:

Dataflow job graph is empty

宫晟
2023-03-14

I've been trying to launch some Dataflow jobs and have managed to get them running, but I'm unable to view the job graph. This is a problem because I'm trying to debug some pipeline bottlenecks, and not having the graph makes that rather difficult.

I've seen "Complex Cloud Dataflow pipeline not displaying execution graph in Developers Console", but I don't think it applies in this case.

The basic pipeline code I use to launch the job is as follows:

# Imports assumed for context (not shown in the original snippet);
# ExtractExamplesDoFn and combine_matching_seqs are defined elsewhere.
import apache_beam as beam
import note_seq


def run_pipeline(
    input_tfrecord, output_tfrecord, output_shards, config, filters, pipeline_options
):
    with beam.Pipeline(options=pipeline_options) as p:
        _ = (
            p
            | "read_input"
            >> beam.io.ReadFromTFRecord(
                input_tfrecord, coder=beam.coders.ProtoCoder(note_seq.NoteSequence)
            )
            # | "reshuffle_1" >> beam.Reshuffle()  # Might help ParDo parallelization
            | "extract_sequences" >> beam.ParDo(ExtractExamplesDoFn(config, filters))
            | "group" >> beam.GroupByKey()  # expects (key, value) pairs from the ParDo
            | "uniquify" >> beam.Map(combine_matching_seqs)
            | "shuffle" >> beam.Reshuffle()
            | "write"
            >> beam.io.WriteToTFRecord(
                output_tfrecord,
                num_shards=output_shards,
                coder=beam.coders.ProtoCoder(note_seq.NoteSequence),
            )
        )

The pipeline options I provide are passed in through the launch script:

    python ./preprocess_tfrecords.py \
        --input_tfrecord=$INPUT_RECORD \
        --output_tfrecord=$OUTPUT_RECORD \
        --output_shards=$SHARDS \
        --config=$CONFIG \
        --alsologtostderr \
        --pipeline_options=--runner=DataFlowRunner,--project=ml-musicgeneration,--region=northamerica-northeast1,--temp_location=$TEMP_STORAGE,--sdk_container_image=$IMAGE_SRC
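
For context, this is roughly how such a script can turn the comma-separated --pipeline_options string into a PipelineOptions object. A minimal sketch, assuming the script simply splits the flag on commas and forwards the pieces to Beam as command-line-style flags (the parse_pipeline_options helper is hypothetical, not from preprocess_tfrecords.py; note the split also assumes no option value itself contains a comma):

from apache_beam.options.pipeline_options import PipelineOptions

def parse_pipeline_options(flag_value):
    # "--runner=DataFlowRunner,--project=..." becomes
    # ["--runner=DataFlowRunner", "--project=...", ...]
    return PipelineOptions(flag_value.split(","))

# Hypothetical usage inside the script:
# pipeline_options = parse_pipeline_options(FLAGS.pipeline_options)
# run_pipeline(..., pipeline_options=pipeline_options)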

The logs during launch include the following:

I0726 11:49:58.150444 140696227049664 auth.py:106] Setting socket default timeout to 60 seconds.
I0726 11:49:58.150666 140696227049664 auth.py:108] socket default timeout is 60.0 seconds.
I0726 11:49:58.155002 140696227049664 transport.py:157] Attempting refresh to obtain initial access_token
I0726 11:49:58.203473 140696227049664 client.py:777] Refreshing access_token
I0726 11:49:59.237261 140696227049664 stager.py:817] Downloading source distribution of the SDK from PyPi
I0726 11:49:59.237400 140696227049664 stager.py:824] Executing command: ['/home/matt/miniconda3/envs/magenta/bin/python', '-m', 'pip', 'download', '--dest', '/tmp/tmp8_6vsp9g', 'apache-beam==2.31.0', '--no-deps', '--no-binary', ':all:']
I0726 11:50:01.001361 140696227049664 stager.py:715] Staging SDK sources from PyPI: dataflow_python_sdk.tar
I0726 11:50:01.002918 140696227049664 stager.py:790] Downloading binary distribution of the SDK from PyPi
I0726 11:50:01.003030 140696227049664 stager.py:824] Executing command: ['/home/matt/miniconda3/envs/magenta/bin/python', '-m', 'pip', 'download', '--dest', '/tmp/tmp8_6vsp9g', 'apache-beam==2.31.0', '--no-deps', '--only-binary', ':all:', '--python-version', '38', '--implementation', 'cp', '--abi', 'cp38', '--platform', 'manylinux1_x86_64']
I0726 11:50:01.878797 140696227049664 stager.py:732] Staging binary distribution of the SDK from PyPI: apache_beam-2.31.0-cp38-cp38-manylinux1_x86_64.whl
W0726 11:50:01.881481 140696227049664 environments.py:371] Make sure that locally built Python SDK docker image has Python 3.8 interpreter.
I0726 11:50:01.881594 140696227049664 environments.py:380] Default Python SDK image for environment is apache/beam_python3.8_sdk:2.31.0
I0726 11:50:01.881756 140696227049664 environments.py:295] Using provided Python SDK container image: mattdeakos/musicai-dataflow-preproc:latest
I0726 11:50:01.881816 140696227049664 environments.py:302] Python SDK container image set to "mattdeakos/musicai-dataflow-preproc:latest" for Docker environment
I0726 11:50:01.991118 140696227049664 translations.py:636] ==================== <function pack_combiners at 0x7ff5cc9068b0> ====================
I0726 11:50:01.991888 140696227049664 translations.py:636] ==================== <function sort_stages at 0x7ff5cc9070d0> ====================
I0726 11:50:02.021321 140696227049664 apiclient.py:450] Defaulting to the temp_location as staging_location: gs://ml-musicgen/tmp
I0726 11:50:02.547729 140696227049664 apiclient.py:632] Starting GCS upload to gs://ml-musicgen/tmp/beamapp-matt-0726145002-021202.1627311002.021580/dataflow_python_sdk.tar...
I0726 11:50:05.002158 140696227049664 apiclient.py:648] Completed GCS upload to gs://ml-musicgen/tmp/beamapp-matt-0726145002-021202.1627311002.021580/dataflow_python_sdk.tar in 2 seconds.
I0726 11:50:05.002632 140696227049664 apiclient.py:632] Starting GCS upload to gs://ml-musicgen/tmp/beamapp-matt-0726145002-021202.1627311002.021580/apache_beam-2.31.0-cp38-cp38-manylinux1_x86_64.whl...
I0726 11:50:16.265783 140696227049664 apiclient.py:648] Completed GCS upload to gs://ml-musicgen/tmp/beamapp-matt-0726145002-021202.1627311002.021580/apache_beam-2.31.0-cp38-cp38-manylinux1_x86_64.whl in 11 seconds.
I0726 11:50:16.267052 140696227049664 apiclient.py:632] Starting GCS upload to gs://ml-musicgen/tmp/beamapp-matt-0726145002-021202.1627311002.021580/pipeline.pb...
I0726 11:50:16.578798 140696227049664 apiclient.py:648] Completed GCS upload to gs://ml-musicgen/tmp/beamapp-matt-0726145002-021202.1627311002.021580/pipeline.pb in 0 seconds.
I0726 11:50:17.390994 140696227049664 apiclient.py:794] Create job: <Job
 createTime: '2021-07-26T14:50:17.270892Z'
 currentStateTime: '1970-01-01T00:00:00Z'
 id: '2021-07-26_07_50_16-10262815125631987742'
 location: 'northamerica-northeast1'
 name: 'beamapp-matt-0726145002-021202'
 projectId: 'ml-musicgeneration'
 stageStates: []
 startTime: '2021-07-26T14:50:17.270892Z'
 steps: []
 tempFiles: []
 type: TypeValueValuesEnum(JOB_TYPE_BATCH, 1)>
I0726 11:50:17.391512 140696227049664 apiclient.py:796] Created job with id: [2021-07-26_07_50_16-10262815125631987742]
I0726 11:50:17.391600 140696227049664 apiclient.py:797] Submitted job: 2021-07-26_07_50_16-10262815125631987742
I0726 11:50:17.391734 140696227049664 apiclient.py:798] To access the Dataflow monitoring console, please navigate to https://console.cloud.google.com/dataflow/jobs/northamerica-northeast1/2021-07-26_07_50_16-10262815125631987742?project=ml-musicgeneration
I0726 11:50:17.597995 140696110667520 dataflow_runner.py:191] Job 2021-07-26_07_50_16-10262815125631987742 is in state JOB_STATE_PENDING
I0726 11:50:22.672355 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:17.010Z: JOB_MESSAGE_BASIC: Dataflow Runner V2 auto-enabled. Use --experiments=disable_runner_v2 to opt out.
I0726 11:50:22.672489 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:19.680Z: JOB_MESSAGE_DETAILED: Autoscaling is enabled for job 2021-07-26_07_50_16-10262815125631987742. The number of workers will be between 1 and 1000.
I0726 11:50:22.672572 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:19.701Z: JOB_MESSAGE_DETAILED: Autoscaling was automatically enabled for job 2021-07-26_07_50_16-10262815125631987742.
I0726 11:50:27.820690 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:22.927Z: JOB_MESSAGE_BASIC: Worker configuration: n1-standard-1 in northamerica-northeast1-a.
I0726 11:50:27.820837 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.841Z: JOB_MESSAGE_DETAILED: Expanding SplittableParDo operations into optimizable parts.
I0726 11:50:27.820913 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.851Z: JOB_MESSAGE_DETAILED: Expanding CollectionToSingleton operations into optimizable parts.
I0726 11:50:27.820959 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.880Z: JOB_MESSAGE_DETAILED: Expanding CoGroupByKey operations into optimizable parts.
I0726 11:50:27.821004 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.890Z: JOB_MESSAGE_DEBUG: Combiner lifting skipped for step write/Write/WriteImpl/GroupByKey: GroupByKey not followed by a combiner.
I0726 11:50:27.821047 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.899Z: JOB_MESSAGE_DEBUG: Combiner lifting skipped for step shuffle/ReshufflePerKey/GroupByKey: GroupByKey not followed by a combiner.
I0726 11:50:27.821088 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.907Z: JOB_MESSAGE_DEBUG: Combiner lifting skipped for step group: GroupByKey not followed by a combiner.
I0726 11:50:27.821128 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.919Z: JOB_MESSAGE_DETAILED: Expanding GroupByKey operations into optimizable parts.
I0726 11:50:27.821197 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.929Z: JOB_MESSAGE_DEBUG: Annotating graph with Autotuner information.
I0726 11:50:27.821238 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.980Z: JOB_MESSAGE_DETAILED: Fusing adjacent ParDo, Read, Write, and Flatten operations
I0726 11:50:27.821277 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.991Z: JOB_MESSAGE_DETAILED: Fusing consumer write/Write/WriteImpl/InitializeWrite into write/Write/WriteImpl/DoOnce/Map(decode)
I0726 11:50:27.821317 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:23.999Z: JOB_MESSAGE_DETAILED: Fusing consumer read_input/Read/Map(<lambda at iobase.py:894>) into read_input/Read/Impulse
I0726 11:50:27.821355 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.008Z: JOB_MESSAGE_DETAILED: Fusing consumer ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/PairWithRestriction into read_input/Read/Map(<lambda at iobase.py:894>)
I0726 11:50:27.821401 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.017Z: JOB_MESSAGE_DETAILED: Fusing consumer ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/SplitWithSizing into ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/PairWithRestriction
I0726 11:50:27.821443 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.025Z: JOB_MESSAGE_DETAILED: Fusing consumer extract_sequences into ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/ProcessElementAndRestrictionWithSizing
I0726 11:50:27.821482 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.034Z: JOB_MESSAGE_DETAILED: Fusing consumer group/Write into extract_sequences
I0726 11:50:27.821522 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.042Z: JOB_MESSAGE_DETAILED: Fusing consumer uniquify into group/Read
I0726 11:50:27.821561 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.051Z: JOB_MESSAGE_DETAILED: Fusing consumer shuffle/AddRandomKeys into uniquify
I0726 11:50:27.821600 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.071Z: JOB_MESSAGE_DETAILED: Fusing consumer shuffle/ReshufflePerKey/Map(reify_timestamps) into shuffle/AddRandomKeys
I0726 11:50:27.821670 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.079Z: JOB_MESSAGE_DETAILED: Fusing consumer shuffle/ReshufflePerKey/GroupByKey/Reify into shuffle/ReshufflePerKey/Map(reify_timestamps)
I0726 11:50:27.821709 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.088Z: JOB_MESSAGE_DETAILED: Fusing consumer shuffle/ReshufflePerKey/GroupByKey/Write into shuffle/ReshufflePerKey/GroupByKey/Reify
I0726 11:50:27.821749 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.097Z: JOB_MESSAGE_DETAILED: Fusing consumer shuffle/ReshufflePerKey/GroupByKey/GroupByWindow into shuffle/ReshufflePerKey/GroupByKey/Read
I0726 11:50:27.821788 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.106Z: JOB_MESSAGE_DETAILED: Fusing consumer shuffle/ReshufflePerKey/FlatMap(restore_timestamps) into shuffle/ReshufflePerKey/GroupByKey/GroupByWindow
I0726 11:50:27.821827 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.114Z: JOB_MESSAGE_DETAILED: Fusing consumer shuffle/RemoveRandomKeys into shuffle/ReshufflePerKey/FlatMap(restore_timestamps)
I0726 11:50:27.821866 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.123Z: JOB_MESSAGE_DETAILED: Fusing consumer write/Write/WriteImpl/ParDo(_RoundRobinKeyFn) into shuffle/RemoveRandomKeys
I0726 11:50:27.821905 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.131Z: JOB_MESSAGE_DETAILED: Fusing consumer write/Write/WriteImpl/DoOnce/FlatMap(<lambda at core.py:2979>) into write/Write/WriteImpl/DoOnce/Impulse
I0726 11:50:27.821973 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.139Z: JOB_MESSAGE_DETAILED: Fusing consumer write/Write/WriteImpl/DoOnce/Map(decode) into write/Write/WriteImpl/DoOnce/FlatMap(<lambda at core.py:2979>)
I0726 11:50:27.822011 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.148Z: JOB_MESSAGE_DETAILED: Fusing consumer write/Write/WriteImpl/WindowInto(WindowIntoFn) into write/Write/WriteImpl/ParDo(_RoundRobinKeyFn)
I0726 11:50:27.822051 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.167Z: JOB_MESSAGE_DETAILED: Fusing consumer write/Write/WriteImpl/GroupByKey/Write into write/Write/WriteImpl/WindowInto(WindowIntoFn)
I0726 11:50:27.822091 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.189Z: JOB_MESSAGE_DETAILED: Fusing consumer write/Write/WriteImpl/WriteBundles into write/Write/WriteImpl/GroupByKey/Read
I0726 11:50:27.822132 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.203Z: JOB_MESSAGE_DEBUG: Workflow config is missing a default resource spec.
I0726 11:50:27.822171 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.212Z: JOB_MESSAGE_DEBUG: Adding StepResource setup and teardown to workflow graph.
I0726 11:50:27.822209 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.221Z: JOB_MESSAGE_DEBUG: Adding workflow start and stop steps.
I0726 11:50:27.822247 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.230Z: JOB_MESSAGE_DEBUG: Assigning stage ids.
I0726 11:50:27.822287 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.290Z: JOB_MESSAGE_DEBUG: Executing wait step start35
I0726 11:50:27.822355 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.309Z: JOB_MESSAGE_BASIC: Executing operation write/Write/WriteImpl/DoOnce/Impulse+write/Write/WriteImpl/DoOnce/FlatMap(<lambda at core.py:2979>)+write/Write/WriteImpl/DoOnce/Map(decode)+write/Write/WriteImpl/InitializeWrite
I0726 11:50:27.822394 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.317Z: JOB_MESSAGE_BASIC: Executing operation read_input/Read/Impulse+read_input/Read/Map(<lambda at iobase.py:894>)+ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/PairWithRestriction+ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/SplitWithSizing
I0726 11:50:27.822435 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.331Z: JOB_MESSAGE_DEBUG: Starting worker pool setup.
I0726 11:50:27.822474 140696110667520 dataflow_runner.py:236] 2021-07-26T14:50:24.339Z: JOB_MESSAGE_BASIC: Starting 1 workers in northamerica-northeast1-a...
I0726 11:50:27.874108 140696110667520 dataflow_runner.py:191] Job 2021-07-26_07_50_16-10262815125631987742 is in state JOB_STATE_RUNNING
I0726 11:51:08.838329 140696110667520 dataflow_runner.py:236] 2021-07-26T14:51:05.862Z: JOB_MESSAGE_DETAILED: Autoscaling: Raised the number of workers to 1 based on the rate of progress in the currently running stage(s).
I0726 11:51:29.299031 140696110667520 dataflow_runner.py:236] 2021-07-26T14:51:26.103Z: JOB_MESSAGE_DETAILED: Workers have started successfully.
I0726 11:51:29.299267 140696110667520 dataflow_runner.py:236] 2021-07-26T14:51:26.113Z: JOB_MESSAGE_DETAILED: Workers have started successfully.
I0726 11:53:47.578707 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.292Z: JOB_MESSAGE_BASIC: Finished operation write/Write/WriteImpl/DoOnce/Impulse+write/Write/WriteImpl/DoOnce/FlatMap(<lambda at core.py:2979>)+write/Write/WriteImpl/DoOnce/Map(decode)+write/Write/WriteImpl/InitializeWrite
I0726 11:53:47.578846 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.312Z: JOB_MESSAGE_DEBUG: Value "write/Write/WriteImpl/DoOnce/Map(decode).None" materialized.
I0726 11:53:47.578928 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.320Z: JOB_MESSAGE_DEBUG: Value "write/Write/WriteImpl/InitializeWrite.None" materialized.
I0726 11:53:47.579004 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.338Z: JOB_MESSAGE_BASIC: Executing operation write/Write/WriteImpl/WriteBundles/View-python_side_input0-write/Write/WriteImpl/WriteBundles
I0726 11:53:47.579044 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.346Z: JOB_MESSAGE_BASIC: Executing operation write/Write/WriteImpl/FinalizeWrite/View-python_side_input0-write/Write/WriteImpl/FinalizeWrite
I0726 11:53:47.579083 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.354Z: JOB_MESSAGE_BASIC: Executing operation write/Write/WriteImpl/PreFinalize/View-python_side_input0-write/Write/WriteImpl/PreFinalize
I0726 11:53:47.579121 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.388Z: JOB_MESSAGE_BASIC: Finished operation write/Write/WriteImpl/WriteBundles/View-python_side_input0-write/Write/WriteImpl/WriteBundles
I0726 11:53:47.579159 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.398Z: JOB_MESSAGE_BASIC: Finished operation write/Write/WriteImpl/FinalizeWrite/View-python_side_input0-write/Write/WriteImpl/FinalizeWrite
I0726 11:53:47.579217 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.407Z: JOB_MESSAGE_DEBUG: Value "write/Write/WriteImpl/WriteBundles/View-python_side_input0-write/Write/WriteImpl/WriteBundles.out" materialized.
I0726 11:53:47.579256 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.407Z: JOB_MESSAGE_BASIC: Finished operation write/Write/WriteImpl/PreFinalize/View-python_side_input0-write/Write/WriteImpl/PreFinalize
I0726 11:53:47.579295 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.416Z: JOB_MESSAGE_DEBUG: Value "write/Write/WriteImpl/FinalizeWrite/View-python_side_input0-write/Write/WriteImpl/FinalizeWrite.out" materialized.
I0726 11:53:47.579345 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:43.426Z: JOB_MESSAGE_DEBUG: Value "write/Write/WriteImpl/PreFinalize/View-python_side_input0-write/Write/WriteImpl/PreFinalize.out" materialized.
I0726 11:53:52.691529 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:47.844Z: JOB_MESSAGE_BASIC: Finished operation read_input/Read/Impulse+read_input/Read/Map(<lambda at iobase.py:894>)+ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/PairWithRestriction+ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/SplitWithSizing
I0726 11:53:52.691735 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:47.863Z: JOB_MESSAGE_DEBUG: Value "ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7-split-with-sizing-out3" materialized.
I0726 11:53:52.691854 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:47.881Z: JOB_MESSAGE_BASIC: Executing operation group/Create
I0726 11:53:52.691955 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:49.336Z: JOB_MESSAGE_BASIC: Finished operation group/Create
I0726 11:53:52.692048 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:49.355Z: JOB_MESSAGE_DEBUG: Value "group/Session" materialized.
I0726 11:53:52.692140 140696110667520 dataflow_runner.py:236] 2021-07-26T14:53:49.374Z: JOB_MESSAGE_BASIC: Executing operation ref_AppliedPTransform_read_input-Read-SDFBoundedSourceReader-ParDo-SDFBoundedSourceDoFn-_7/ProcessElementAndRestrictionWithSizing+extract_sequences+group/Write
I0726 11:54:08.043265 140696110667520 dataflow_runner.py:236] 2021-07-26T14:54:05.492Z: JOB_MESSAGE_DETAILED: Autoscaling: Resizing worker pool from 1 to 2.
I0726 11:54:23.392255 140696110667520 dataflow_runner.py:236] 2021-07-26T14:54:21.184Z: JOB_MESSAGE_DETAILED: Autoscaling: Raised the number of workers to 2 based on the rate of progress in the currently running stage(s).
I0726 11:58:34.425437 140696110667520 dataflow_runner.py:236] 2021-07-26T14:58:29.381Z: JOB_MESSAGE_DETAILED: Autoscaling: Reduced the number of workers to 1 based on the rate of progress in the currently running stage(s).
I0726 11:58:39.562591 140696110667520 dataflow_runner.py:236] 2021-07-26T14:58:34.394Z: JOB_MESSAGE_DETAILED: Autoscaling: Resizing worker pool from 2 to 1.

1 Answer

沈德寿
2023-03-14

This issue turned out to be unrelated to the job itself: I downloaded and installed Google Chrome to log into the console, whereas previously I had been using Firefox. That seems to have fixed it, and I can now see the job graph.
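
To double-check that the graph existed server-side all along (i.e. that this was purely a browser problem), the job's steps can be fetched straight from the Dataflow REST API. A rough sketch using google-api-python-client, with the project/region/job ID taken from the logs above (assumes application default credentials are configured):

from googleapiclient.discovery import build

dataflow = build("dataflow", "v1b3")
job = (
    dataflow.projects()
    .locations()
    .jobs()
    .get(
        projectId="ml-musicgeneration",
        location="northamerica-northeast1",
        jobId="2021-07-26_07_50_16-10262815125631987742",
        view="JOB_VIEW_ALL",  # JOB_VIEW_ALL includes the steps that make up the graph
    )
    .execute()
)
# A non-empty list here means the job graph exists regardless of the browser.
print([step["name"] for step in job.get("steps", [])])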

 Similar questions:
  • This works fine when I run the job in GCP, but it fails if there is nothing to update. If I remove the update flag, it works fine as long as there is no job already running. Is there a way to specify that the job should be updated if it exists, rather than always starting a new one?

  • In my current architecture, multiple Dataflow jobs are triggered at various stages as part of an ABC framework. I need to capture the job IDs of these jobs as audit metrics within the Dataflow pipeline and update them in BigQuery. How can I get the run ID of a Dataflow job from within the pipeline using Java? Is there an existing method I can use, or do I need to use the Google Cloud client libraries inside the pipeline?

  • I'm currently trying to use Dataflow with Pub/Sub, but I get the following error: Workflow failed. Causes: (6E74E8516C0638CA): There was a problem refreshing your credentials. Please check: 1. The Dataflow API is enabled for your project. 2. Your project has a robot service account: service-[project number]@dataflow-service-producer-prod.iam.gserviceaccount.com should be able to

  • I've started developing my first Dataflow job using the Scala SDK Scio. The Dataflow job will run in streaming mode. Can anyone suggest the best way to deploy it? I've read in the Scio documentation about the approach they use, deploying it into a Docker container. I've also read about using Dataflow templates (but not in much detail). Which is best?

  • I get an error when writing to BigQuery from an Apache Beam Dataflow job using the "file_loads" technique. Streaming inserts (the else block) work fine, as expected. The file_load (if block) fails, with the error given after the code. The temporary files on GCS in the bucket are valid JSON objects. Sample raw event from Pub/Sub: Error from the Dataflow job:

  • I've been running Dataflow jobs based on a template created in December that passes some parameters at runtime, without any issues. I've now had to make some changes to the template, and I seem to be having trouble producing a working template, even with the same Beam code/version as before. My jobs just hang indefinitely; I tried leaving one, and it timed out after about an hour. There's certainly a problem, because even my first step, creating an empty PCollection, doesn't succeed; it just says Running. I've abstracted away from the function to work out what the problem might be, since

  • I'm trying to run two separate pipelines from one Dataflow job, similar to the question below: Parallel pipelines in one Dataflow job. If we run two separate pipelines in a single Dataflow job with a single p.run(), as follows: I assume it will launch two independent pipelines within one Dataflow job, but will it create two packages? Will it run on two different workers?