Free Professional Data Engineer Exam Braindumps (page: 15)

Page 15 of 68

Which Cloud Dataflow / Beam feature should you use to aggregate data in an unbounded data source every hour based on the time when the data entered the pipeline?

  1. An hourly watermark
  2. An event time trigger
  3. The withAllowedLateness method
  4. A processing time trigger

Answer(s): D

Explanation:

When collecting and grouping data into windows, Beam uses triggers to determine when to emit the aggregated results of each window.
Processing time triggers operate on the processing time: the time when the data element is processed at any given stage in the pipeline.
Event time triggers operate on the event time, as indicated by the timestamp on each data element. Beam's default trigger is event time-based.
Because the question asks for hourly aggregation based on when the data entered the pipeline, not on when each event occurred, a processing time trigger is the matching choice.
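The difference between the two time domains can be sketched in plain Python. This is only an illustration of hourly grouping by arrival time versus event time, not the Beam trigger API; the record fields (`event_time`, `arrival_time`, `value`) and the helper names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def hour_bucket(ts):
    """Truncate a datetime to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def aggregate_hourly(records, time_field):
    """Sum record values per hourly bucket, keyed on the chosen timestamp field."""
    totals = defaultdict(int)
    for record in records:
        totals[hour_bucket(record[time_field])] += record["value"]
    return dict(totals)


# Hypothetical records: event_time is when the event happened at the source,
# arrival_time is when the element entered the pipeline (processing time).
records = [
    {"value": 1,
     "event_time": datetime(2023, 6, 16, 9, 58),
     "arrival_time": datetime(2023, 6, 16, 10, 3)},
    {"value": 2,
     "event_time": datetime(2023, 6, 16, 10, 10),
     "arrival_time": datetime(2023, 6, 16, 10, 12)},
    {"value": 4,
     "event_time": datetime(2023, 6, 16, 10, 55),
     "arrival_time": datetime(2023, 6, 16, 11, 1)},
]

by_event = aggregate_hourly(records, "event_time")
by_arrival = aggregate_hourly(records, "arrival_time")

for label, totals in (("event time", by_event), ("processing time", by_arrival)):
    for bucket in sorted(totals):
        print(label, bucket.isoformat(), totals[bucket])
```

The same three records land in different hourly buckets depending on which timestamp drives the grouping, which is why the question's wording ("the time when the data entered the pipeline") points to processing time.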


Reference:

https://beam.apache.org/documentation/programming-guide/#triggers



The CUSTOM tier for Cloud Machine Learning Engine allows you to specify the number of which types of cluster nodes?

  1. Workers
  2. Masters, workers, and parameter servers
  3. Workers and parameter servers
  4. Parameter servers

Answer(s): C

Explanation:

The CUSTOM tier is not a set tier, but rather enables you to use your own cluster specification.
When you use this tier, set values to configure your processing cluster according to these guidelines:
You must set TrainingInput.masterType to specify the type of machine to use for your master node.
You may set TrainingInput.workerCount to specify the number of workers to use.
You may set TrainingInput.parameterServerCount to specify the number of parameter servers to use.
You can specify the type of machine for the master node, but you can't specify more than one master node.
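As a rough sketch of these guidelines, the helper below assembles a CUSTOM-tier `trainingInput` dictionary. The function name and the machine type values are illustrative assumptions, and the `workerType` and `parameterServerType` fields are not mentioned in the explanation above; they are included on the assumption that the TrainingInput resource expects a machine type whenever the matching count is greater than zero.

```python
def build_custom_training_input(master_type,
                                worker_type=None, worker_count=0,
                                parameter_server_type=None,
                                parameter_server_count=0):
    """Build a trainingInput dict for the CUSTOM scale tier (hypothetical helper)."""
    if not master_type:
        # masterType is mandatory for the CUSTOM tier.
        raise ValueError("masterType must be set for the CUSTOM tier")
    if worker_count < 0 or parameter_server_count < 0:
        raise ValueError("node counts cannot be negative")

    # There is deliberately no master count: exactly one master node is used.
    training_input = {"scaleTier": "CUSTOM", "masterType": master_type}

    if worker_count > 0:
        if not worker_type:
            raise ValueError("worker_type is required when worker_count > 0")
        training_input["workerType"] = worker_type
        training_input["workerCount"] = worker_count

    if parameter_server_count > 0:
        if not parameter_server_type:
            raise ValueError(
                "parameter_server_type is required when parameter_server_count > 0")
        training_input["parameterServerType"] = parameter_server_type
        training_input["parameterServerCount"] = parameter_server_count

    return training_input


# Illustrative machine type names; check the current documentation for valid values.
config = build_custom_training_input(
    master_type="complex_model_m",
    worker_type="complex_model_m", worker_count=4,
    parameter_server_type="large_model", parameter_server_count=2,
)
print(config)
```

Only the worker and parameter server counts are configurable here, which mirrors the answer to the question: the master's machine type can be chosen, but not the number of masters.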


Reference:

https://cloud.google.com/ml-engine/docs/training-overview#job_configuration_parameters



Cloud Bigtable is Google's ______ Big Data database service.

  1. Relational
  2. mySQL
  3. NoSQL
  4. SQL Server

Answer(s): C

Explanation:

Cloud Bigtable is Google's NoSQL Big Data database service. It is the same database that powers Google services such as Search, Analytics, Maps, and Gmail. It suits workloads that need low latency and high throughput, including Internet of Things (IoT), user analytics, and financial data analysis.


Reference:

https://cloud.google.com/bigtable/



When you store data in Cloud Bigtable, what is the recommended minimum amount of stored data?

  1. 500 TB
  2. 1 GB
  3. 1 TB
  4. 500 GB

Answer(s): C

Explanation:

Cloud Bigtable is not a relational database. It does not support SQL queries, joins, or multi-row transactions. It is not a good solution for less than 1 TB of data.


Reference:

https://cloud.google.com/bigtable/docs/overview#title_short_and_other_storage_options


