Free Professional Data Engineer Exam Braindumps (page: 38)


A data scientist has created a BigQuery ML model and asks you to create an ML pipeline to serve predictions. You have a REST API application with the requirement to serve predictions for an individual user ID with latency under 100 milliseconds. You use the following query to generate predictions: SELECT predicted_label, user_id FROM ML.PREDICT(MODEL `dataset.model`, TABLE user_features). How should you create the ML pipeline?

  1. Add a WHERE clause to the query, and grant the BigQuery Data Viewer role to the application service account.
  2. Create an Authorized View with the provided query. Share the dataset that contains the view with the application service account.
  3. Create a Cloud Dataflow pipeline using BigQueryIO to read results from the query. Grant the Dataflow Worker role to the application service account.
  4. Create a Cloud Dataflow pipeline using BigQueryIO to read predictions for all users from the query. Write the results to Cloud Bigtable using BigtableIO. Grant the Bigtable Reader role to the application service account so that the application can read predictions for individual users from Cloud Bigtable.

Answer(s): D
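
For reference, answer D can be sketched as a batch Apache Beam (Cloud Dataflow) job that runs the ML.PREDICT query once for all users and writes the results to Cloud Bigtable keyed by user_id. The project, instance, table, and column-family names below are placeholders rather than anything given in the question.

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.io.gcp.bigtableio import WriteToBigTable
    from google.cloud.bigtable import row as bt_row

    # The prediction query from the question, executed once in batch.
    QUERY = """
    SELECT predicted_label, user_id
    FROM ML.PREDICT(MODEL `dataset.model`, TABLE user_features)
    """

    def to_bigtable_row(record):
        # Key each row by user_id so the REST API can fetch a single user's
        # prediction with one Bigtable point lookup.
        direct_row = bt_row.DirectRow(row_key=str(record['user_id']).encode())
        direct_row.set_cell('predictions', b'predicted_label',
                            str(record['predicted_label']).encode())
        return direct_row

    options = PipelineOptions(project='my-project',              # placeholder
                              temp_location='gs://my-bucket/tmp')  # placeholder

    with beam.Pipeline(options=options) as p:
        (p
         | 'RunMLPredict' >> beam.io.ReadFromBigQuery(query=QUERY,
                                                      use_standard_sql=True)
         | 'ToBigtableRow' >> beam.Map(to_bigtable_row)
         | 'WriteToBigtable' >> WriteToBigTable(project_id='my-project',
                                                instance_id='my-instance',
                                                table_id='predictions'))

Single-row reads from Bigtable typically complete in a few milliseconds, which is why precomputing the predictions meets the 100 ms latency requirement, whereas running the BigQuery query per request would not.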



You work for a bank. You have a labelled dataset that contains information on already granted loan applications and whether these applications have defaulted. You have been asked to train a model to predict default rates for credit applicants.

What should you do?

  1. Increase the size of the dataset by collecting additional data.
  2. Train a linear regression to predict a credit default risk score.
  3. Remove the bias from the data and collect applications that have been declined loans.
  4. Match loan applicants with their social profiles to enable feature engineering.

Answer(s): B
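
As a rough illustration of answer B, the risk-score model can be trained directly in BigQuery ML from the labelled dataset; the table name `dataset.loan_applications`, the label column `defaulted`, and the column `application_id` below are hypothetical, not part of the question.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Hypothetical schema: each row is a granted loan application, with
    # defaulted = 1 if it defaulted and 0 otherwise, so the regression
    # output can be read as a credit default risk score.
    train_model_sql = """
    CREATE OR REPLACE MODEL `dataset.default_risk_model`
    OPTIONS (model_type = 'LINEAR_REG', input_label_cols = ['defaulted']) AS
    SELECT * EXCEPT (application_id)
    FROM `dataset.loan_applications`
    """

    client.query(train_model_sql).result()  # blocks until training completes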



Government regulations in your industry mandate that you have to maintain an auditable record of access to certain types of data. Assuming that all expiring logs will be archived correctly, where should you store data that is subject to that mandate?

  1. Encrypted on Cloud Storage with user-supplied encryption keys. A separate decryption key will be given to each authorized user.
  2. In a BigQuery dataset that is viewable only by authorized personnel, with the Data Access log used to provide the auditability.
  3. In Cloud SQL, with separate database user names to each user. The Cloud SQL Admin activity logs will be used to provide the auditability.
  4. In a bucket on Cloud Storage that is accessible only by an App Engine service that collects user information and logs the access before providing a link to the bucket.

Answer(s): B
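
To illustrate answer B, every read of the BigQuery dataset is recorded in the Cloud Audit Logs Data Access log; below is a sketch of pulling those entries with the Python logging client, with `my-project` as a placeholder project ID.

    from google.cloud import logging as cloud_logging

    client = cloud_logging.Client()

    # BigQuery Data Access audit log entries for this project.
    audit_filter = (
        'logName="projects/my-project/logs/'
        'cloudaudit.googleapis.com%2Fdata_access" '
        'AND protoPayload.serviceName="bigquery.googleapis.com"'
    )

    for entry in client.list_entries(filter_=audit_filter):
        # Each entry records who accessed which BigQuery resource and when.
        print(entry.timestamp, entry.payload)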



You are selecting services to write and transform JSON messages from Cloud Pub/Sub to BigQuery for a data pipeline on Google Cloud. You want to minimize service costs. You also want to monitor and accommodate input data volume that will vary in size with minimal manual intervention.
What should you do?

  1. Use Cloud Dataproc to run your transformations. Monitor CPU utilization for the cluster. Resize the number of worker nodes in your cluster via the command line.
  2. Use Cloud Dataproc to run your transformations. Use the diagnose command to generate an operational output archive. Locate the bottleneck and adjust cluster resources.
  3. Use Cloud Dataflow to run your transformations. Monitor the job system lag with Stackdriver. Use the default autoscaling setting for worker instances.
  4. Use Cloud Dataflow to run your transformations. Monitor the total execution time for a sampling of jobs. Configure the job to use non-default Compute Engine machine types when needed.

Answer(s): C
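
Answer C corresponds roughly to the streaming Apache Beam pipeline sketched below, which reads JSON messages from Pub/Sub, parses them, and writes rows to BigQuery while Dataflow's default autoscaling sizes the worker pool; the subscription, table, schema, and bucket names are placeholders.

    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Streaming job on Dataflow; default autoscaling adds or removes workers
    # as the Pub/Sub backlog (visible as system lag) grows or shrinks.
    options = PipelineOptions(streaming=True, runner='DataflowRunner',
                              project='my-project', region='us-central1',
                              temp_location='gs://my-bucket/tmp')

    with beam.Pipeline(options=options) as p:
        (p
         | 'ReadFromPubSub' >> beam.io.ReadFromPubSub(
               subscription='projects/my-project/subscriptions/my-sub')
         | 'ParseJson' >> beam.Map(json.loads)
         | 'WriteToBigQuery' >> beam.io.WriteToBigQuery(
               'my-project:my_dataset.my_table',
               schema='user_id:STRING,event_ts:TIMESTAMP,payload:STRING',
               write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))

Because autoscaling reacts to backlog automatically, the only routine monitoring needed is watching the job's system lag in Stackdriver, which matches the requirement for minimal manual intervention.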





