Free Google Cloud Data Engineer Professional Exam Braindumps (page: 6)


Your company is migrating their 30-node Apache Hadoop cluster to the cloud. They want to re-use Hadoop jobs they have already created and minimize the management of the cluster as much as possible. They also want to be able to persist data beyond the life of the cluster.
What should you do?

  A. Create a Google Cloud Dataflow job to process the data.
  B. Create a Google Cloud Dataproc cluster that uses persistent disks for HDFS.
  C. Create a Hadoop cluster on Google Compute Engine that uses persistent disks.
  D. Create a Cloud Dataproc cluster that uses the Google Cloud Storage connector.
  E. Create a Hadoop cluster on Google Compute Engine that uses Local SSD disks.

Answer(s): D
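
Answer D fits because the Cloud Storage connector lets existing Hadoop jobs read and write gs:// paths in place of hdfs:// paths, so the data persists in Cloud Storage after the Dataproc cluster is deleted, while Dataproc itself minimizes cluster management. As a rough sketch of re-using an existing job (my own illustration, not part of the exam material; the project, region, cluster, bucket, and jar names below are hypothetical), the google-cloud-dataproc Python client can submit an existing Hadoop jar against Cloud Storage paths:

    from google.cloud import dataproc_v1

    # Hypothetical identifiers for illustration only.
    project_id = "my-project"
    region = "us-central1"
    cluster_name = "migrated-hadoop-cluster"

    # The Dataproc job API is regional, so point the client at the region's endpoint.
    client = dataproc_v1.JobControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )

    # The existing Hadoop jar runs unchanged; only the input/output paths move
    # from hdfs:// to gs://, which the Cloud Storage connector resolves.
    job = {
        "placement": {"cluster_name": cluster_name},
        "hadoop_job": {
            "main_jar_file_uri": "gs://my-bucket/jars/wordcount.jar",
            "args": ["gs://my-bucket/input/", "gs://my-bucket/output/"],
        },
    }

    operation = client.submit_job_as_operation(
        request={"project_id": project_id, "region": region, "job": job}
    )
    print("Finished job:", operation.result().reference.job_id)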



Business owners at your company have given you a database of bank transactions. Each row contains the user ID, transaction type, transaction location, and transaction amount. They ask you to investigate what type of machine learning can be applied to the data.
Which three machine learning applications can you use? (Choose three.)

  A. Supervised learning to determine which transactions are most likely to be fraudulent.
  B. Unsupervised learning to determine which transactions are most likely to be fraudulent.
  C. Clustering to divide the transactions into N categories based on feature similarity.
  D. Supervised learning to predict the location of a transaction.
  E. Reinforcement learning to predict the location of a transaction.
  F. Unsupervised learning to predict the location of a transaction.

Answer(s): B,C,D
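
To make the clustering option (C) concrete, here is a small sketch of grouping transactions into N categories by feature similarity with scikit-learn's KMeans; the feature values are made up for illustration, and a supervised model would instead require labeled training examples:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Made-up features per transaction: [amount, encoded location, encoded type].
    transactions = np.array([
        [25.0, 3, 0],
        [19.5, 3, 0],
        [980.0, 7, 2],
        [1020.0, 7, 2],
        [15.0, 1, 1],
        [22.0, 1, 1],
    ])

    # Scale features so the transaction amount does not dominate the distance metric.
    features = StandardScaler().fit_transform(transactions)

    # Unsupervised learning: divide the transactions into N clusters by similarity.
    N = 3
    labels = KMeans(n_clusters=N, n_init=10, random_state=0).fit_predict(features)
    print(labels)  # cluster assignment for each transaction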



Your company's on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the cluster would require 50 TB of Google Persistent Disk per node. The CIO is concerned about the cost of using that much block storage. You want to minimize the storage cost of the migration.
What should you do?

  A. Put the data into Google Cloud Storage.
  B. Use preemptible virtual machines (VMs) for the Cloud Dataproc cluster.
  C. Tune the Cloud Dataproc cluster so that there is just enough disk for all data.
  D. Migrate some of the cold data into Google Cloud Storage, and keep only the hot data in Persistent Disk.

Answer(s): A
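
Storage cost is driven by the 50 TB of Persistent Disk per node, so the lever is keeping the datasets in Cloud Storage rather than in HDFS on node-attached disks; the Cloud Storage connector lets Dataproc jobs read gs:// paths directly, and the nodes then only need modest boot disks. In practice a bulk copy tool such as Hadoop DistCp writing to a gs:// destination would move terabyte-scale data; the snippet below is only a minimal sketch of landing files in the target bucket with the google-cloud-storage Python client, and the project, bucket, and file paths are hypothetical:

    from google.cloud import storage

    # Hypothetical project, bucket, and paths for illustration only.
    client = storage.Client(project="my-project")
    bucket = client.bucket("my-migrated-datalake")

    # Copy an exported HDFS file into Cloud Storage; once the data lives in gs://,
    # Dataproc workers no longer need 50 TB of Persistent Disk each for HDFS.
    blob = bucket.blob("warehouse/events/part-00000.parquet")
    blob.upload_from_filename("/data/hdfs-export/part-00000.parquet")
    print("Uploaded:", blob.name)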





You work for a car manufacturer and have set up a data pipeline using Google Cloud Pub/Sub to capture anomalous sensor events. You are using a push subscription in Cloud Pub/Sub that calls a custom HTTPS endpoint that you have created to take action on these anomalous events as they occur. Your custom HTTPS endpoint keeps receiving an inordinate number of duplicate messages.
What is the most likely cause of these duplicate messages?

  A. The message body for the sensor event is too large.
  B. Your custom endpoint has an out-of-date SSL certificate.
  C. The Cloud Pub/Sub topic has too many messages published to it.
  D. Your custom endpoint is not acknowledging messages within the acknowledgement deadline.

Answer(s): D
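
For background on the answer: a push endpoint acknowledges a Cloud Pub/Sub message by returning a success (2xx) HTTP response before the subscription's acknowledgement deadline; a slow or failing handler causes Pub/Sub to redeliver the message, which shows up as duplicates. Below is a minimal Flask sketch of such an endpoint (my own illustration; the route, port, and processing step are hypothetical) that returns quickly and defers heavy work. Because Pub/Sub delivery is at-least-once, the handler should also be idempotent.

    import base64
    import json

    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/pubsub/push", methods=["POST"])
    def pubsub_push():
        # Pub/Sub push delivery wraps the message in a JSON envelope.
        envelope = json.loads(request.data)
        message = envelope.get("message", {})
        payload = base64.b64decode(message.get("data", "")).decode("utf-8")

        # Keep synchronous work minimal (e.g., enqueue for later processing);
        # returning a 2xx before the ack deadline is what acknowledges the message,
        # and slow or non-2xx responses trigger redelivery (seen as duplicates).
        print(f"Received anomalous sensor event: {payload}")
        return "", 204

    if __name__ == "__main__":
        app.run(port=8080)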





