Free Google Google Cloud Certified Professional Data Engineer Exam Questions (page: 52)

Why do you need to split a machine learning dataset into training data and test data?

  A. So you can try two different sets of features
  B. To make sure your model is generalized for more than just the training data
  C. To allow you to create unit tests in your code
  D. So you can use one dataset for a wide model and one for a deep model

Answer(s): B

Explanation:

Evaluating a predictive model on its own training data does not tell you how well it generalizes to new, unseen data. A model selected for its accuracy on the training dataset, rather than on a held-out test dataset, is very likely to score lower on unseen data. The reason is that the model has specialized to the structure of the training dataset instead of learning patterns that generalize. This is called overfitting.
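As a rough illustration of the point above, here is a minimal Python sketch of a holdout split. The helper names (`train_test_split`, `LookupModel`, `accuracy`) and the toy data are assumptions made for this example, not part of the exam material. The "model" is a deliberately extreme case of overfitting: it only memorizes its training rows.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


class LookupModel:
    """Hypothetical, deliberately overfit model: memorizes every training row."""

    def fit(self, rows):
        self.table = {x: label for x, label in rows}
        return self

    def predict(self, x):
        # Inputs never seen during training get no useful prediction.
        return self.table.get(x, "unknown")


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model.predict(x) == label for x, label in rows) / len(rows)


# Toy data: 40 distinct integer inputs, labeled by whether they are >= 20.
rows = [(x, "high" if x >= 20 else "low") for x in range(40)]

train_rows, test_rows = train_test_split(rows)
model = LookupModel().fit(train_rows)

train_accuracy = accuracy(model, train_rows)
test_accuracy = accuracy(model, test_rows)
print(f"train accuracy: {train_accuracy:.2f}")
print(f"test accuracy:  {test_accuracy:.2f}")
```

The memorizing model scores perfectly on the rows it was fitted on and fails on every held-out row, because no test input appears in the training set. Only the test score reveals that nothing general was learned.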


Reference:

https://machinelearningmastery.com/a-simple-intuition-for-overfitting/



Which of these numbers are adjusted by a neural network as it learns from a training dataset (select 2 answers)?

  A. Weights
  B. Biases
  C. Continuous features
  D. Input values

Answer(s): A,B

Explanation:

A neural network is a simple mechanism implemented with basic math. The only difference between the traditional programming model and a neural network is that you let the computer determine the parameters (weights and biases) by learning from training datasets. Input values and continuous features are data fed into the network; training does not change them.
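To make the distinction concrete, the following is a minimal sketch of a single linear neuron trained by gradient descent in plain Python. The function name `train_neuron`, the learning rate, the epoch count, and the toy data are assumptions chosen for illustration. Only `weight` and `bias` are updated inside the loop; the inputs are read but never modified.

```python
def train_neuron(inputs, targets, learning_rate=0.05, epochs=2000):
    """Fit prediction = weight * x + bias by gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        # Prediction error for each training example.
        errors = [(weight * x + bias) - y for x, y in zip(inputs, targets)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_weight = sum(2 * e * x for e, x in zip(errors, inputs)) / n
        grad_bias = sum(2 * e for e in errors) / n
        # Only the parameters are adjusted; the inputs stay as they are.
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


inputs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0, 3.0, 5.0, 7.0]  # generated by y = 2x + 1
inputs_before = list(inputs)

weight, bias = train_neuron(inputs, targets)
print(f"weight={weight:.3f}, bias={bias:.3f}")
```

With these settings the learned weight and bias should end up close to 2 and 1, the values used to generate the targets, while `inputs` is identical before and after training.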


Reference:

https://cloud.google.com/blog/big-data/2016/07/understanding-neural-networks-with-tensorflow-playground



The CUSTOM tier for Cloud Machine Learning Engine allows you to specify the number of which types of cluster nodes?

  A. Workers
  B. Masters, workers, and parameter servers
  C. Workers and parameter servers
  D. Parameter servers

Answer(s): C

Explanation:

The CUSTOM tier is not a set tier, but rather enables you to use your own cluster specification.
When you use this tier, set values to configure your processing cluster according to these guidelines:

You must set TrainingInput.masterType to specify the type of machine to use for your master node.

You may set TrainingInput.workerCount to specify the number of workers to use.

You may set TrainingInput.parameterServerCount to specify the number of parameter servers to use.

You can specify the type of machine for the master node, but you can't specify more than one master node.
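The guidelines above can be sketched as a small Python check over a `TrainingInput`-style dictionary. This is an illustration only: `check_custom_tier` is a hypothetical helper, the machine type value is an example, and `masterCount` is not a real field (it appears only to show that a master count cannot be set). A real job submission likely needs further fields, such as machine types for workers and parameter servers, that are not shown here.

```python
def check_custom_tier(training_input):
    """Return a list of problems with a CUSTOM-tier cluster specification."""
    problems = []
    if training_input.get("scaleTier") != "CUSTOM":
        problems.append("scaleTier must be CUSTOM to use a custom cluster")
    # The master machine type is mandatory for the CUSTOM tier.
    if not training_input.get("masterType"):
        problems.append("masterType is required for the CUSTOM tier")
    # Hypothetical key: there is always exactly one master node.
    if "masterCount" in training_input:
        problems.append("the number of master nodes cannot be set")
    # Worker and parameter server counts are optional.
    for field in ("workerCount", "parameterServerCount"):
        count = training_input.get(field, 0)
        if not isinstance(count, int) or count < 0:
            problems.append(f"{field} must be a non-negative integer")
    return problems


training_input = {
    "scaleTier": "CUSTOM",
    "masterType": "complex_model_m",  # illustrative machine type
    "workerCount": 4,                 # number of workers: configurable
    "parameterServerCount": 2,        # number of parameter servers: configurable
}

print(check_custom_tier(training_input))
```

The two counts you can set are for workers and parameter servers, which matches answer C. For the master you choose only the machine type.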


Reference:

https://cloud.google.com/ml-engine/docs/training-overview#job_configuration_parameters



Which software libraries are supported by Cloud Machine Learning Engine?

  A. Theano and TensorFlow
  B. Theano and Torch
  C. TensorFlow
  D. TensorFlow and Torch

Answer(s): C

Explanation:

Cloud ML Engine mainly does two things:

Enables you to train machine learning models at scale by running TensorFlow training applications in the cloud.

Hosts those trained models for you in the cloud so that you can use them to get predictions about new data.


Reference:

https://cloud.google.com/ml-engine/docs/technical-overview#what_it_does


