Free Professional Data Engineer Exam Braindumps (page: 2)

Page 2 of 68

You work for a car manufacturer and have set up a data pipeline using Google Cloud Pub/Sub to capture anomalous sensor events. You are using a push subscription in Cloud Pub/Sub that calls a custom HTTPS endpoint that you have created to take action on these anomalous events as they occur. Your custom HTTPS endpoint keeps receiving an inordinate number of duplicate messages.
What is the most likely cause of these duplicate messages?

  1. The message body for the sensor event is too large.
  2. Your custom endpoint has an out-of-date SSL certificate.
  3. The Cloud Pub/Sub topic has too many messages published to it.
  4. Your custom endpoint is not acknowledging messages within the acknowledgement deadline.

Answer(s): D
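Pub/Sub redelivers any push message whose endpoint does not return a success response within the acknowledgement deadline, and the endpoint sees each redelivery as a duplicate. The loop below is a toy Python simulation of that behavior (illustrative only, not the Pub/Sub API; the function name and the 10-second default are assumptions):

```python
# Toy simulation (not the Pub/Sub API): a push subscription redelivers
# any message whose endpoint fails to respond within the acknowledgement
# deadline, so a slow endpoint receives the same message repeatedly.

def delivery_attempts(handler_latency_s, ack_deadline_s=10, max_attempts=5):
    """Count deliveries until the endpoint responds within the deadline."""
    attempts = 0
    acked = False
    while not acked and attempts < max_attempts:
        attempts += 1
        # The delivery counts as acknowledged only if the endpoint
        # answers before the deadline expires.
        acked = handler_latency_s <= ack_deadline_s
    return attempts

print(delivery_attempts(handler_latency_s=2))   # fast endpoint: 1 delivery
print(delivery_attempts(handler_latency_s=30))  # slow endpoint: repeated redeliveries
```

The fix in practice is to acknowledge quickly (return a 2xx immediately and do the heavy processing asynchronously) or to raise the subscription's acknowledgement deadline.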

You designed a database for patient records as a pilot project to cover a few hundred patients in three clinics. Your design used a single database table to represent all patients and their visits, and you used self-joins to generate reports. The server resource utilization was at 50%. Since then, the scope of the project has expanded. The database must now store 100 times more patient records. You can no longer run the reports, because they either take too long or fail with errors about insufficient compute resources. How should you adjust the database design?

  1. Add capacity (memory and disk space) to the database server by the order of 200.
  2. Shard the tables into smaller ones based on date ranges, and only generate reports with prespecified date ranges.
  3. Normalize the master patient-record table into the patient table and the visits table, and create other necessary tables to avoid self-join.
  4. Partition the table into smaller tables, with one for each clinic. Run queries against the smaller table pairs, and use unions for consolidated reports.

Answer(s): C
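The normalized design in option 3 can be sketched with a hypothetical two-table schema (table and column names here are illustrative, using SQLite as a stand-in database):

```python
import sqlite3

# Hypothetical normalized schema: one row per patient, one row per
# visit, linked by patient_id instead of one wide self-joined table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (
    patient_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    clinic     TEXT NOT NULL
);
CREATE TABLE visits (
    visit_id   INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patients(patient_id),
    visit_date TEXT NOT NULL
);
""")
conn.execute("INSERT INTO patients VALUES (1, 'Ada', 'Clinic A')")
conn.executemany("INSERT INTO visits VALUES (?, 1, ?)",
                 [(1, '2024-01-05'), (2, '2024-03-12')])

# Reports now use an ordinary join instead of a self-join on one wide table.
rows = conn.execute("""
    SELECT p.name, COUNT(v.visit_id) AS visit_count
    FROM patients p JOIN visits v USING (patient_id)
    GROUP BY p.patient_id
""").fetchall()
print(rows)  # [('Ada', 2)]
```

Because each visit is a separate row keyed by patient, per-patient reports scale with the number of rows scanned rather than with the quadratic cost of self-joining one large table.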

Your company's on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the cluster would require 50 TB of Google Persistent Disk per node. The CIO is concerned about the cost of using that much block storage. You want to minimize the storage cost of the migration.
What should you do?

  1. Put the data into Google Cloud Storage.
  2. Use preemptible virtual machines (VMs) for the Cloud Dataproc cluster.
  3. Tune the Cloud Dataproc cluster so that there is just enough disk for all data.
  4. Migrate some of the cold data into Google Cloud Storage, and keep only the hot data in Persistent Disk.

Answer(s): A

Your company is using wildcard tables to query data across multiple tables with similar names. The SQL statement is currently failing with the following error:

# Syntax error : Expected end of statement but got "-" at [4:11]

SELECT age
FROM
bigquery-public-data.noaa_gsod.gsod
WHERE
age != 99
ORDER BY
age DESC

Which table name will make the SQL statement work correctly?

  1. `bigquery-public-data.noaa_gsod.gsod`
  2. bigquery-public-data.noaa_gsod.gsod*
  3. 'bigquery-public-data.noaa_gsod.gsod'*
  4. `bigquery-public-data.noaa_gsod.gsod*`

Answer(s): D
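In BigQuery standard SQL, backticks must wrap the entire table reference, wildcard included; otherwise the parser reads the hyphen in the project name as a minus sign, which is exactly the reported error. A minimal sketch of the corrected query as a Python string (the clauses mirror the question):

```python
# Backticks must enclose the whole reference, wildcard included; without
# them the parser stops at the "-" in "bigquery-public-data".
query = """
SELECT age
FROM `bigquery-public-data.noaa_gsod.gsod*`
WHERE age != 99
ORDER BY age DESC
"""

print("`bigquery-public-data.noaa_gsod.gsod*`" in query)  # True
```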

