Free Professional Data Engineer Exam Braindumps (page: 18)


When a Cloud Bigtable node fails, ____ is lost.

  1. all data
  2. no data
  3. the last transaction
  4. the time dimension

Answer(s): B

Explanation:

A Cloud Bigtable table is sharded into blocks of contiguous rows, called tablets, to help balance the workload of queries. Tablets are stored on Colossus, Google's file system, in SSTable format. Each tablet is associated with a specific Cloud Bigtable node. Data is never stored in Cloud Bigtable nodes themselves; each node has pointers to a set of tablets that are stored on Colossus. As a result:
- Rebalancing tablets from one node to another is very fast, because the actual data is not copied; Cloud Bigtable simply updates the pointers for each node.
- Recovery from the failure of a Cloud Bigtable node is very fast, because only metadata needs to be migrated to the replacement node.

When a Cloud Bigtable node fails, no data is lost.
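For illustration, the following is a minimal sketch, assuming the google-cloud-bigtable Python client and placeholder project, instance, and cluster IDs. Resizing the cluster only reassigns tablet pointers to nodes; no tablet data on Colossus is copied.

# Sketch: resizing a Cloud Bigtable cluster (google-cloud-bigtable client).
# Node changes reassign tablet pointers only; the data stays on Colossus.
# All IDs below are hypothetical placeholders.
from google.cloud import bigtable

client = bigtable.Client(project="my-project", admin=True)
instance = client.instance("my-instance")
cluster = instance.cluster("my-cluster")

cluster.reload()                               # fetch the current node count
cluster.serve_nodes = cluster.serve_nodes + 2  # scale out by two nodes
operation = cluster.update()                   # long-running admin operation
operation.result(timeout=300)                  # wait for the resize to finish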


Reference:

https://cloud.google.com/bigtable/docs/overview



Which role must be assigned to a service account used by the virtual machines in a Dataproc cluster so they can execute jobs?

  1. Dataproc Worker
  2. Dataproc Viewer
  3. Dataproc Runner
  4. Dataproc Editor

Answer(s): A

Explanation:

Service accounts used with Cloud Dataproc must have the Dataproc Worker role (roles/dataproc.worker), or have all of the permissions granted by that role.
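As a minimal sketch, assuming the google-cloud-dataproc Python client and placeholder project, region, and service account names, a cluster can be created so its VMs run as a custom service account; that account must already hold roles/dataproc.worker (or equivalent permissions).

# Sketch: creating a Dataproc cluster whose VMs use a custom service account.
# The account in gce_cluster_config.service_account must already have the
# Dataproc Worker role (roles/dataproc.worker) or equivalent permissions.
# All names below are hypothetical placeholders.
from google.cloud import dataproc_v1

project_id = "my-project"
region = "us-central1"

cluster_client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = {
    "project_id": project_id,
    "cluster_name": "example-cluster",
    "config": {
        "gce_cluster_config": {
            "service_account": "dataproc-vm@my-project.iam.gserviceaccount.com",
        },
    },
}

operation = cluster_client.create_cluster(
    request={"project_id": project_id, "region": region, "cluster": cluster}
)
operation.result()  # block until the cluster is ready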


Reference:

https://cloud.google.com/dataproc/docs/concepts/service-accounts#important_notes



Which is not a valid reason for poor Cloud Bigtable performance?

  1. The workload isn't appropriate for Cloud Bigtable.
  2. The table's schema is not designed correctly.
  3. The Cloud Bigtable cluster has too many nodes.
  4. There are issues with the network connection.

Answer(s): C

Explanation:

Having too many nodes does not cause poor performance. The related valid reason is the opposite: the Cloud Bigtable cluster doesn't have enough nodes. If your Cloud Bigtable cluster is overloaded, adding more nodes can improve performance; use the monitoring tools to check whether the cluster is overloaded.
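One way to check for overload is to read the cluster CPU load metric from Cloud Monitoring before deciding to add nodes. The following is a rough sketch, assuming the google-cloud-monitoring Python client and a placeholder project ID.

# Sketch: checking Cloud Bigtable cluster CPU load via Cloud Monitoring.
# Sustained high cpu_load suggests the cluster needs more nodes.
# The project ID is a hypothetical placeholder.
import time
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 3600}}
)

series = client.list_time_series(
    request={
        "name": "projects/my-project",
        "filter": 'metric.type = "bigtable.googleapis.com/cluster/cpu_load"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

for ts in series:
    latest = ts.points[0].value.double_value  # most recent point first
    print(ts.resource.labels["cluster"], f"cpu_load={latest:.2f}")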


Reference:

https://cloud.google.com/bigtable/docs/performance



Which of the following job types are supported by Cloud Dataproc (select 3 answers)?

  1. Hive
  2. Pig
  3. YARN
  4. Spark

Answer(s): A,B,D

Explanation:

Cloud Dataproc provides out-of-the-box and end-to-end support for many of the most popular job types, including Spark, Spark SQL, PySpark, MapReduce, Hive, and Pig jobs. YARN is the cluster's resource manager, not a job type you submit to Dataproc.
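As a minimal sketch, assuming the google-cloud-dataproc Python client and placeholder project, region, and cluster names, the same job API submits any of these types; only the job field changes (hive_job, pig_job, spark_job, pyspark_job, and so on).

# Sketch: submitting a Hive job to Dataproc; swap hive_job for pig_job,
# spark_job, pyspark_job, etc. to run the other supported job types.
# All names below are hypothetical placeholders.
from google.cloud import dataproc_v1

project_id = "my-project"
region = "us-central1"

job_client = dataproc_v1.JobControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

job = {
    "placement": {"cluster_name": "example-cluster"},
    "hive_job": {"query_list": {"queries": ["SHOW DATABASES;"]}},
}

submitted = job_client.submit_job(
    request={"project_id": project_id, "region": region, "job": job}
)
print("Submitted job:", submitted.reference.job_id)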


Reference:

https://cloud.google.com/dataproc/docs/resources/faq#what_type_of_jobs_can_i_run





