Free Professional Data Engineer Exam Braindumps (page: 27)


What are the minimum permissions needed for a service account used with Google Dataproc?

  A. Execute to Google Cloud Storage; write to Google Cloud Logging
  B. Write to Google Cloud Storage; read to Google Cloud Logging
  C. Execute to Google Cloud Storage; execute to Google Cloud Logging
  D. Read and write to Google Cloud Storage; write to Google Cloud Logging

Answer(s): D

Explanation:

Service accounts authenticate applications running on your virtual machine instances to other Google Cloud Platform services. For example, if you write an application that reads and writes files on Google Cloud Storage, it must first authenticate to the Google Cloud Storage API. At a minimum, service accounts used with Cloud Dataproc need permissions to read and write to Google Cloud Storage, and to write to Google Cloud Logging.


Reference:

https://cloud.google.com/dataproc/docs/concepts/service-accounts#important_notes
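As a hedged illustration, the minimum access described above can be expressed as IAM role bindings on the service account. The role names `roles/storage.objectAdmin` (read and write objects in Cloud Storage) and `roles/logging.logWriter` (write log entries) are real predefined roles; the project ID and service-account email below are placeholders, and no API call is made.

```python
# Sketch: IAM-style policy bindings granting a Dataproc service account
# the minimum access described above -- read/write on Cloud Storage and
# write on Cloud Logging. Names are placeholders; nothing is sent to GCP.

SERVICE_ACCOUNT = "serviceAccount:dataproc-sa@my-project.iam.gserviceaccount.com"

# roles/storage.objectAdmin -> read and write objects in Cloud Storage
# roles/logging.logWriter   -> write log entries to Cloud Logging
minimal_bindings = [
    {"role": "roles/storage.objectAdmin", "members": [SERVICE_ACCOUNT]},
    {"role": "roles/logging.logWriter", "members": [SERVICE_ACCOUNT]},
]

def grants(bindings, member, role):
    """Return True if the policy bindings grant `role` to `member`."""
    return any(b["role"] == role and member in b["members"] for b in bindings)

print(grants(minimal_bindings, SERVICE_ACCOUNT, "roles/storage.objectAdmin"))  # True
print(grants(minimal_bindings, SERVICE_ACCOUNT, "roles/logging.logWriter"))    # True
```

In a real project these bindings would be applied with `gcloud projects add-iam-policy-binding` or the IAM API; the dict form above only mirrors the policy shape for clarity.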



Which role must be assigned to a service account used by the virtual machines in a Dataproc cluster so they can execute jobs?

  A. Dataproc Worker
  B. Dataproc Viewer
  C. Dataproc Runner
  D. Dataproc Editor

Answer(s): A

Explanation:

Service accounts used with Cloud Dataproc must have the Dataproc Worker role (or be granted all of the permissions that the Dataproc Worker role includes).


Reference:

https://cloud.google.com/dataproc/docs/concepts/service-accounts#important_notes
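The Worker role requirement can be sketched as an idempotent policy update. `roles/dataproc.worker` is the real role ID; the policy dict and service-account email are placeholders, and no API call is made.

```python
# Sketch: ensure a cluster's service account holds the Dataproc Worker
# role (roles/dataproc.worker). The policy shape mirrors the IAM API;
# names are placeholders and nothing is sent to GCP.

WORKER_ROLE = "roles/dataproc.worker"

def ensure_worker_role(policy, member):
    """Add `member` to the Dataproc Worker binding if not already present."""
    for binding in policy["bindings"]:
        if binding["role"] == WORKER_ROLE:
            if member not in binding["members"]:
                binding["members"].append(member)
            return policy
    # No existing binding for the Worker role: create one.
    policy["bindings"].append({"role": WORKER_ROLE, "members": [member]})
    return policy

policy = {"bindings": []}
sa = "serviceAccount:cluster-sa@my-project.iam.gserviceaccount.com"
ensure_worker_role(policy, sa)
```

Running the function twice leaves the policy unchanged, which is the behavior you want when reconciling IAM state.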



When creating a new Cloud Dataproc cluster with the projects.regions.clusters.create operation, these four values are required: project, region, name, and ____.

  A. zone
  B. node
  C. label
  D. type

Answer(s): A

Explanation:

At a minimum, you must specify four values when creating a new cluster with the projects.regions.clusters.create operation:

The project in which the cluster will be created

The region to use

The name of the cluster

The zone in which the cluster will be created

You can specify many more details beyond these minimum requirements. For example, you can also specify the number of workers, whether preemptible compute should be used, and the network settings.
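The four required values can be sketched as a minimal request body. The field names follow the Dataproc REST API (`projectId`, `clusterName`, `config.gceClusterConfig.zoneUri`); the project, region, and zone values are placeholders, and nothing is sent to the API here.

```python
# Sketch: the minimal request body for projects.regions.clusters.create.
# Only the four required values are set; values are placeholders.

def minimal_cluster_request(project, region, name, zone):
    """Build the four required values for a clusters.create call."""
    return {
        "projectId": project,      # project in which the cluster will be created
        "region": region,          # region to use for the request
        "cluster": {
            "clusterName": name,   # name of the cluster
            "config": {
                "gceClusterConfig": {
                    # zone in which the cluster's VMs will be created
                    "zoneUri": (
                        "https://www.googleapis.com/compute/v1/projects/"
                        f"{project}/zones/{zone}"
                    ),
                }
            },
        },
    }

req = minimal_cluster_request("my-project", "us-central1", "demo-cluster", "us-central1-a")
```

Optional settings such as worker count, preemptible workers, or network configuration would be added under `cluster.config` in the same body.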


Reference:

https://cloud.google.com/dataproc/docs/tutorials/python-library-example#create_a_new_cloud_dataproc_cluster



Which Google Cloud Platform service is an alternative to Hadoop with Hive?

  A. Cloud Dataflow
  B. Cloud Bigtable
  C. BigQuery
  D. Cloud Datastore

Answer(s): C

Explanation:

Apache Hive is data warehouse software built on top of Apache Hadoop that provides data summarization, query, and analysis over large datasets.

Google BigQuery is an enterprise data warehouse that fills the same role on Google Cloud Platform, so it serves as the alternative to Hadoop with Hive.


Reference:

https://en.wikipedia.org/wiki/Apache_Hive





