Free Google PROFESSIONAL-DATA-ENGINEER Exam Questions (page: 8)

You have a Google Cloud Dataflow streaming pipeline running with a Google Cloud Pub/Sub subscription as the source. You need to make an update to the code that will make the new Cloud Dataflow pipeline incompatible with the current version. You do not want to lose any data when making this update.
What should you do?

  1. Update the current pipeline and use the drain flag.
  2. Update the current pipeline and provide the transform mapping JSON object.
  3. Create a new pipeline that has the same Cloud Pub/Sub subscription and cancel the old pipeline.
  4. Create a new pipeline that has a new Cloud Pub/Sub subscription and cancel the old pipeline.

Answer(s): D



Your company is running their first dynamic campaign, serving different offers by analyzing real-time data during the holiday season. The data scientists are collecting terabytes of data that rapidly grows every hour during their 30-day campaign. They are using Google Cloud Dataflow to preprocess the data and collect the feature (signals) data that is needed for the machine learning model in Google Cloud Bigtable. The team is observing suboptimal performance with reads and writes of their initial load of 10 TB of data. They want to improve this performance while minimizing cost.
What should they do?

  1. Redefine the schema by evenly distributing reads and writes across the row space of the table.
  2. The performance issue should be resolved over time as the size of the Bigtable cluster is increased.
  3. Redesign the schema to use a single row key to identify values that need to be updated frequently in the cluster.
  4. Redesign the schema to use row keys based on numeric IDs that increase sequentially per user viewing the offers.

Answer(s): A
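
To illustrate why the distribution of row keys matters, here is a minimal sketch that compares sequential numeric row keys with reversed ones. It is a toy model, not Bigtable code: the four fixed key ranges, the 10-digit key format, and the helper names are all assumptions made for this example, and a real cluster splits and rebalances ranges dynamically.

```python
from collections import Counter

NUM_RANGES = 4  # hypothetical number of contiguous key ranges (think: one per node)


def range_for(row_key: str) -> int:
    """Toy stand-in for range sharding: pick a range from the key's leading digit."""
    # Leading digits 0-9 are split into NUM_RANGES contiguous bands.
    return int(row_key[0]) * NUM_RANGES // 10


def sequential_key(user_id: int) -> str:
    """Zero-padded sequential ID, so neighbouring IDs sort next to each other."""
    return f"{user_id:010d}"


def reversed_key(user_id: int) -> str:
    """Same ID with its digits reversed, so the fastest-changing digit leads."""
    return f"{user_id:010d}"[::-1]


def load_per_range(keys):
    """Count how many keys land in each hypothetical range."""
    counts = Counter(range_for(k) for k in keys)
    return [counts.get(i, 0) for i in range(NUM_RANGES)]


user_ids = range(1000, 2000)  # 1,000 sequentially assigned IDs
sequential_load = load_per_range(sequential_key(u) for u in user_ids)
reversed_load = load_per_range(reversed_key(u) for u in user_ids)

print("sequential:", sequential_load)
print("reversed:  ", reversed_load)
```

In this model every sequential key falls into the same range, while the reversed keys reach all four. Reversing the ID is only one possible way to spread traffic; the right key design depends on the queries the table has to serve.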



Your software uses a simple JSON format for all messages. These messages are published to Google Cloud Pub/Sub, then processed with Google Cloud Dataflow to create a real-time dashboard for the CFO. During testing, you notice that some messages are missing in the dashboard. You check the logs, and all messages are being published to Cloud Pub/Sub successfully.
What should you do next?

  1. Check the dashboard application to see if it is not displaying correctly.
  2. Run a fixed dataset through the Cloud Dataflow pipeline and analyze the output.
  3. Use Google Stackdriver Monitoring on Cloud Pub/Sub to find the missing messages.
  4. Switch Cloud Dataflow to pull messages from Cloud Pub/Sub instead of Cloud Pub/Sub pushing messages to Cloud Dataflow.

Answer(s): B
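
The idea of running a fixed dataset through the pipeline can be sketched in plain Python. This is not Dataflow or Apache Beam code: `parse_message`, the `amount` rule, and the sample messages are hypothetical stand-ins for whatever the real pipeline steps do. The point is only that a known input makes silently dropped messages visible.

```python
import json


def parse_message(raw: str):
    """Hypothetical pipeline step: return a dict, or None when the message is dropped."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Assumed rule for this sketch: a record needs a numeric "amount" field.
    if not isinstance(record, dict) or not isinstance(record.get("amount"), (int, float)):
        return None
    return record


def run_fixed_dataset(messages):
    """Run every message through the step and separate kept records from dropped inputs."""
    kept, dropped = [], []
    for raw in messages:
        record = parse_message(raw)
        if record is None:
            dropped.append(raw)
        else:
            kept.append(record)
    return kept, dropped


# A small, known input: the number of records that should come out is known up front.
fixed_dataset = [
    '{"id": 1, "amount": 10.5}',
    '{"id": 2, "amount": "12"}',  # amount is a string, not a number
    '{"id": 3, "amount": 7}',
    '{"id": 4, "amount": 3',      # truncated JSON
]

kept, dropped = run_fixed_dataset(fixed_dataset)
print(f"kept {len(kept)} of {len(fixed_dataset)}")
for raw in dropped:
    print("dropped:", raw)
```

Comparing the kept and dropped lists against the known input shows which messages the processing step loses and why, which is harder to see on a live stream.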


Flowlogistic wants to use Google BigQuery as their primary analysis system, but they still have Apache Hadoop and Spark workloads that they cannot move to BigQuery. Flowlogistic does not know how to store the data that is common to both workloads.
What should they do?

  1. Store the common data in BigQuery as partitioned tables.
  2. Store the common data in BigQuery and expose authorized views.
  3. Store the common data encoded as Avro in Google Cloud Storage.
  4. Store the common data in the HDFS storage for a Google Cloud Dataproc cluster.

Answer(s): B



Flowlogistic's management has determined that the current Apache Kafka servers cannot handle the data volume for their real-time inventory tracking system. You need to build a new system on Google Cloud Platform (GCP) that will feed the proprietary tracking software. The system must be able to ingest data from a variety of global sources, process and query in real-time, and store the data reliably.
Which combination of GCP products should you choose?

  1. Cloud Pub/Sub, Cloud Dataflow, and Cloud Storage
  2. Cloud Pub/Sub, Cloud Dataflow, and Local SSD
  3. Cloud Pub/Sub, Cloud SQL, and Cloud Storage
  4. Cloud Load Balancing, Cloud Dataflow, and Cloud Storage

Answer(s): C


