Free Professional Data Engineer Exam Braindumps (page: 23)


Which Java SDK class can you use to run your Dataflow programs locally?

  A. LocalRunner
  B. DirectPipelineRunner
  C. MachineRunner
  D. LocalPipelineRunner

Answer(s): B

Explanation:

DirectPipelineRunner executes the operations in your pipeline directly on the local machine, without the optimizations applied by the Dataflow service. It is useful for small-scale local execution and for tests.


Reference:

https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/runners/DirectPipelineRunner
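
For context, a minimal sketch of selecting the local runner, assuming the Dataflow Java SDK 1.x package layout; the pipeline contents are omitted:

  import com.google.cloud.dataflow.sdk.Pipeline;
  import com.google.cloud.dataflow.sdk.options.PipelineOptions;
  import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
  import com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner;

  public class LocalRunnerExample {
    public static void main(String[] args) {
      // Create default options and point them at the local, non-optimizing runner.
      PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
      options.setRunner(DirectPipelineRunner.class);

      // The pipeline now runs entirely on the local machine when run() is called.
      Pipeline p = Pipeline.create(options);
      // ... apply transforms here ...
      p.run();
    }
  }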



You are developing a software application using Google's Dataflow SDK, and want to use conditionals, for loops, and other complex programming structures to create a branching pipeline.
Which component will be used for the data processing operation?

  A. PCollection
  B. Transform
  C. Pipeline
  D. Sink API

Answer(s): B

Explanation:

In the Dataflow SDK, transforms are the components responsible for data processing operations. Because transforms are applied in ordinary Java code, you can use conditionals, for loops, and other complex programming structures to build a branching pipeline.


Reference:

https://cloud.google.com/dataflow/model/programming-model
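
A minimal sketch of a branching pipeline, assuming the Dataflow Java SDK 1.x and a hypothetical input path; the same PCollection is fed to two ParDo transforms, and ordinary Java conditionals inside each DoFn decide what is emitted:

  import com.google.cloud.dataflow.sdk.Pipeline;
  import com.google.cloud.dataflow.sdk.io.TextIO;
  import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
  import com.google.cloud.dataflow.sdk.transforms.DoFn;
  import com.google.cloud.dataflow.sdk.transforms.ParDo;
  import com.google.cloud.dataflow.sdk.values.PCollection;

  public class BranchingPipelineExample {
    public static void main(String[] args) {
      Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

      // Read the input once; "gs://my-bucket/logs.txt" is a hypothetical path.
      PCollection<String> lines = p.apply(TextIO.Read.from("gs://my-bucket/logs.txt"));

      // Branch 1: a transform that keeps only error lines.
      PCollection<String> errors = lines.apply(ParDo.of(new DoFn<String, String>() {
        @Override
        public void processElement(ProcessContext c) {
          if (c.element().contains("ERROR")) {
            c.output(c.element());
          }
        }
      }));

      // Branch 2: a second transform applied to the same PCollection.
      PCollection<String> others = lines.apply(ParDo.of(new DoFn<String, String>() {
        @Override
        public void processElement(ProcessContext c) {
          if (!c.element().contains("ERROR")) {
            c.output(c.element());
          }
        }
      }));

      p.run();
    }
  }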



Which is the preferred method to avoid hotspotting in time-series data in Bigtable?

  A. Field promotion
  B. Randomization
  C. Salting
  D. Hashing

Answer(s): A

Explanation:

In general, prefer field promotion. Field promotion avoids hotspotting in almost all cases, and it tends to make it easier to design a row key that facilitates queries.


Reference:

https://cloud.google.com/bigtable/docs/schema-design-time-series#ensure_that_your_row_key_avoids_hotspotting
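
A small illustration of the idea, assuming hypothetical device and metric names: promoting identifying fields into the front of the row key spreads writes across the keyspace instead of concentrating them on the latest timestamp.

  public class FieldPromotionExample {
    public static void main(String[] args) {
      long now = System.currentTimeMillis();

      // Hotspotting key: the timestamp alone. Every new write sorts to the end
      // of the table, so a single node receives all of the traffic.
      String hotKey = String.valueOf(now);

      // Field promotion: move fields that would otherwise live in column values
      // (device id, metric) into the front of the row key. Writes now spread
      // across as many key prefixes as there are devices, and per-device range
      // scans stay efficient.
      String device = "thermostat-4012";   // hypothetical device id
      String metric = "temperature";       // hypothetical metric name
      String promotedKey = device + "#" + metric + "#" + now;

      System.out.println(hotKey);
      System.out.println(promotedKey);
    }
  }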



You have a job that you want to cancel. It is a streaming pipeline, and you want to ensure that any data that is in-flight is processed and written to the output.
Which of the following commands can you use on the Dataflow monitoring console to stop the pipeline job?

  A. Cancel
  B. Drain
  C. Stop
  D. Finish

Answer(s): B

Explanation:

Using the Drain option to stop your job tells the Dataflow service to finish your job in its current state. Your job will immediately stop ingesting new data from input sources, but the Dataflow service will preserve any existing resources (such as worker instances) to finish processing and writing any buffered data in your pipeline.


Reference:

https://cloud.google.com/dataflow/pipelines/stopping-a-pipeline
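
The Drain operation is not limited to the monitoring console; as a sketch, the gcloud CLI exposes the same action (the job ID and region placeholders below are hypothetical):

  gcloud dataflow jobs drain JOB_ID --region=REGION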


