Free Google Professional Machine Learning Engineer Exam Braindumps (page: 13)

You work for a social media company. You need to detect whether posted images contain cars. Each training example is a member of exactly one class. You have trained an object detection neural network and deployed the model version to AI Platform Prediction for evaluation. Before deployment, you created an evaluation job and attached it to the AI Platform Prediction model version. You notice that the precision is lower than your business requirements allow. How should you adjust the model's final layer softmax threshold to increase precision?

  1. Increase the recall.
  2. Decrease the recall.
  3. Increase the number of false positives.
  4. Decrease the number of false negatives.

Answer(s): B

Explanation:

Precision and recall are two common metrics for evaluating the performance of a classification model. Precision measures the proportion of positive predictions that are correct, while recall measures the proportion of actual positive examples that are correctly predicted. The two metrics trade off against each other: tightening the decision threshold raises precision but lowers recall, and loosening it does the opposite. How to balance them depends on the goal and the cost structure of the classification problem [1].

For the use case of detecting whether posted images contain cars, precision matters more than recall: the social media company wants to minimize false positives, that is, images incorrectly labeled as containing cars. High precision means the model is confident and accurate in its positive predictions, while lower recall means it may miss some images that actually contain cars. The cost of missing some positive examples is lower than the cost of wrong positive predictions, since the latter can affect the user experience and the reputation of the company.
The softmax function transforms a vector of real numbers into a probability distribution over the possible classes. It is often used as the final layer of a neural network for multi-class classification, since it assigns a probability to each class, and the class with the highest probability is chosen as the prediction. The softmax function is defined as:
softmax(x_i) = exp(x_i) / sum_j exp(x_j)
where x_i is the input value (logit) for class i and softmax(x_i) is the output probability for class i. The softmax threshold is the minimum probability that a class must reach to be emitted as the prediction. For example, if the threshold is 0.5, the class with the highest probability must score at least 0.5 to be selected; otherwise no prediction is made. The threshold therefore adjusts the trade-off between precision and recall: a higher threshold increases precision and decreases recall, while a lower threshold decreases precision and increases recall [2].
For the use case of detecting whether posted images contain cars, the best way to adjust the model's final layer softmax threshold to increase precision is to decrease the recall. This means that the softmax threshold should be increased, so that the model will only make positive predictions when it is highly confident, and avoid making false positives. By increasing the softmax threshold, the model will become more selective and accurate in its positive predictions, and improve the precision metric. Therefore, decreasing the recall is the best option for this use case.
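To make the threshold mechanics concrete, below is a minimal NumPy sketch of applying a probability cutoff to softmax scores. It is illustrative only: the function name, the class labels, and the 0.8 threshold are assumptions for the example, not part of AI Platform Prediction.

```python
import numpy as np

def predict_with_threshold(logits, threshold=0.8, labels=("no_car", "car")):
    """Return a class only when the top softmax probability clears the threshold."""
    # Numerically stable softmax over the class dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)

    best = probs.argmax(axis=-1)
    confidence = probs.max(axis=-1)

    # A higher threshold makes the model abstain more often: fewer false
    # positives (higher precision) at the cost of missed cars (lower recall).
    return [labels[i] if c >= threshold else None for i, c in zip(best, confidence)]

# Three example images, logits for (no_car, car).
logits = np.array([[0.2, 2.0], [1.0, 1.2], [3.0, 0.1]])
print(predict_with_threshold(logits))   # ['car', None, 'no_car']
```

Raising the threshold from 0.5 to 0.8 in this sketch turns the borderline second image into an abstention rather than a positive prediction, which is exactly the precision-for-recall trade described above.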


Reference:

Precision and recall - Wikipedia
How to add a threshold in softmax scores - Stack Overflow



You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure.
What should you do?

  1. Significantly increase the max_batch_size TensorFlow Serving parameter
  2. Switch to the tensorflow-model-server-universal version of TensorFlow Serving
  3. Significantly increase the max_enqueued_batches TensorFlow Serving parameter
  4. Recompile TensorFlow Serving using the source to support CPU-specific optimizations. Instruct GKE to choose an appropriate baseline minimum CPU platform for serving nodes

Answer(s): D

Explanation:

TensorFlow Serving is a system for deploying and serving TensorFlow models in a scalable and efficient way. It supports various platforms and hardware, such as CPU, GPU, and TPU. However, the default TensorFlow Serving binaries are built with generic CPU instructions, which may not exploit the full capabilities of the host CPU. To improve serving latency and performance, you can recompile TensorFlow Serving from source with CPU-specific optimizations enabled, such as AVX, AVX2, and FMA [1]. These instruction-set extensions speed up the computation and inference of TensorFlow models, especially deep neural networks.

Google Kubernetes Engine (GKE) runs and manages containerized applications on Google Cloud using Kubernetes. GKE supports different CPU platforms, that is, the generations and models of the CPUs that power the nodes, and it lets you choose a baseline minimum CPU platform for a node pool (a group of nodes with the same configuration). Choosing a baseline minimum CPU platform ensures that your nodes expose the CPU features your optimized binary requires [2].

For the use case of serving a few thousand queries per second while experiencing latency issues, the best option is to recompile TensorFlow Serving from source with CPU-specific optimizations and instruct GKE to choose an appropriate baseline minimum CPU platform for the serving nodes. This improves serving latency without changing the underlying infrastructure: it only involves rebuilding the TensorFlow Serving binary and selecting the CPU platform for the GKE nodes, and it makes better use of the existing CPU-only pods.


Reference:

Building TensorFlow Serving from source
Specifying a minimum CPU platform for a node pool



You built and manage a production system that is responsible for predicting sales numbers. Model accuracy is crucial, because the production model is required to keep up with market changes. Since being deployed to production, the model hasn't changed; however, the accuracy of the model has steadily deteriorated.
What issue is most likely causing the steady decline in model accuracy?

  1. Poor data quality
  2. Lack of model retraining
  3. Too few layers in the model for capturing information
  4. Incorrect data split ratio during model training, evaluation, validation, and test

Answer(s): B

Explanation:

Model retraining is the process of updating an existing machine learning model with new data and parameters to improve its performance and accuracy. Retraining is essential for maintaining the relevance and validity of the model, especially when the data or the environment changes over time. It helps avoid or reduce model degradation, the phenomenon of a model's predictive performance decreasing as it is applied to new data in a rapidly evolving environment [1].
For the use case of predicting sales numbers, model accuracy is crucial because the production model must keep up with market changes. Market changes affect the demand, supply, price, and popularity of products, and thus influence sales. If the model is not retrained on new data that reflects those changes, it becomes outdated, fails to capture current patterns and trends, and its accuracy declines. Therefore, the most likely cause of the steady decline in accuracy is the lack of model retraining.

The other options are less likely because they are not tied to the model's ability to adapt to changing market conditions. Option A, poor data quality, can hurt accuracy, but it would not by itself explain a steady decline after deployment. Option C, too few layers for capturing information, limits the model's capacity, but that limitation would have been visible from the start rather than worsening over time. Option D, an incorrect data split ratio, affects validation and generalization, but it is also not a cause of degradation that grows over time. Therefore, option B, lack of model retraining, is the best answer for this question.
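As a concrete illustration of why retraining matters, here is a minimal sketch of a check that flags when the deployed model's accuracy has drifted far enough below its deployment baseline to warrant retraining. The function name, the weekly accuracy values, and the 0.05 tolerance are assumptions for the example, not part of any Google Cloud API.

```python
def needs_retraining(accuracy_history, baseline, tolerance=0.05):
    """Flag retraining when recent accuracy drops below the deployment baseline."""
    latest = accuracy_history[-1]
    return latest < baseline - tolerance

# Accuracy measured each week against freshly labeled sales data.
weekly_accuracy = [0.91, 0.89, 0.87, 0.84, 0.81]

if needs_retraining(weekly_accuracy, baseline=0.90):
    print("Accuracy has drifted; retrain the model on recent market data.")
```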


Reference:

Beware Steep Decline: Understanding Model Degradation In Machine Learning Models



You are an ML engineer at a large grocery retailer with stores in multiple regions. You have been asked to create an inventory prediction model. Your model's features include region, location, historical demand, and seasonal popularity. You want the algorithm to learn from new inventory data on a daily basis.
Which algorithms should you use to build the model?

  1. Classification
  2. Reinforcement Learning
  3. Recurrent Neural Networks (RNN)
  4. Convolutional Neural Networks (CNN)

Answer(s): B

Explanation:

Reinforcement learning is a machine learning technique in which an agent learns from its own actions and the feedback it receives from an environment. It does not require labeled data or explicit rules; instead it relies on trial and error with rewards and penalties to optimize the agent's behavior toward a goal. Reinforcement learning suits complex, dynamic problems that involve sequential decision making and adaptation to changing situations [1].
For the use case of creating an inventory prediction model for a large grocery retailer with stores in multiple regions, reinforcement learning is a suitable algorithm. The problem involves multiple factors that affect inventory demand, such as region, location, historical demand, and seasonal popularity, and the inventory manager must repeatedly decide how much to order, store, and distribute. A reinforcement learning agent can learn from the new inventory data arriving each day and adjust the ordering policy accordingly, handle the uncertainty and variability of demand, and balance the trade-off between overstocking and understocking [2]. A minimal sketch of this learning loop follows the option analysis below.

The other options are not as suitable because they are not designed for sequential decision making and continual adaptation. Option A, classification, assigns a label to an input based on predefined categories; it can predict demand for a single product or period, but it cannot optimize an ordering policy over many products and periods. Option C, recurrent neural networks (RNNs), process sequential data such as text, speech, or time series; they can model the temporal patterns of demand, but they do not learn from feedback and rewards. Option D, convolutional neural networks (CNNs), process spatially structured data such as images; they can extract features from the data, but they do not optimize a policy over actions and states. Therefore, option B, reinforcement learning, is the best answer for this question.
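The sketch below is a toy tabular Q-learning loop for a single product, intended only to show the shape of the daily learn-from-feedback cycle; the state space, costs, demand distribution, and hyperparameters are all invented for illustration and are not a production inventory model.

```python
import random

# Toy setup: state = units on hand (0-4), action = units to order today (0-2).
STATES, ACTIONS = range(5), range(3)
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate

def simulate_day(stock, order):
    """Invented environment: reward = revenue - holding cost - stock-out penalty."""
    demand = random.randint(0, 3)
    sold = min(stock + order, demand)
    next_stock = min(max(stock + order - demand, 0), max(STATES))
    reward = 5 * sold - 1 * next_stock - 4 * max(demand - (stock + order), 0)
    return next_stock, reward

stock = 2
for day in range(10_000):                        # one Q-learning update per "day" of data
    if random.random() < epsilon:
        order = random.choice(list(ACTIONS))     # explore
    else:
        order = max(ACTIONS, key=lambda a: Q[(stock, a)])   # exploit current policy
    next_stock, reward = simulate_day(stock, order)
    best_next = max(Q[(next_stock, a)] for a in ACTIONS)
    Q[(stock, order)] += alpha * (reward + gamma * best_next - Q[(stock, order)])
    stock = next_stock

# Learned ordering policy: how much to order at each stock level.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES})
```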


Reference:

Reinforcement learning - Wikipedia
Reinforcement Learning for Inventory Optimization



You need to train a computer vision model that predicts the type of government ID present in a given image using a GPU-powered virtual machine on Compute Engine. You use the following parameters:
· Optimizer: SGD
· Image shape: 224x224
· Batch size: 64
· Epochs: 10
· Verbose: 2
During training you encounter the following error: ResourceExhaustedError: out of memory (OOM) when allocating tensor.
What should you do?

  1. Change the optimizer
  2. Reduce the batch size
  3. Change the learning rate
  4. Reduce the image shape

Answer(s): B

Explanation:

A ResourceExhaustedError: out of memory (OOM) when allocating tensor occurs when the GPU runs out of memory while trying to allocate a tensor. A tensor is a multi-dimensional array of numbers that holds the data or the parameters of a machine learning model; its size depends on factors such as the input data, the model architecture, the batch size, and the optimization algorithm [1].

For the use case of training a computer vision model that predicts the type of government ID in a given image on a GPU-powered Compute Engine virtual machine, the best way to resolve the error is to reduce the batch size. The batch size determines how many input examples are processed at a time by the model. A larger batch size can improve the stability of the gradient estimates, but it requires more memory and computation; a smaller batch size reduces the memory footprint of each training step at the cost of noisier gradient updates [2].

By reducing the batch size, each step allocates smaller activation and gradient tensors, so the GPU no longer runs out of memory. Reducing the batch size too far has drawbacks, such as increasing the variance of the gradient updates and slowing convergence, so the batch size should be chosen as a trade-off between memory, computation, and model performance [3].

The other options do not directly address the GPU's memory allocation. Option A, changing the optimizer, affects the speed and quality of optimization but not the memory needed per step, and some optimizers add memory for per-parameter state. Option C, changing the learning rate, affects convergence and stability but not memory usage. Option D, reducing the image shape, would shrink the input tensors and could also relieve memory pressure, but it degrades image resolution and may hurt accuracy, so it is a less direct fix than lowering the batch size. Therefore, option B, reducing the batch size, is the best answer for this question.
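Below is a minimal Keras sketch showing the change: the model and the in-memory dummy data are placeholders (not the question's actual dataset), and the only point is that batch_size is the parameter lowered from 64 to avoid the OOM.

```python
import tensorflow as tf

# Placeholder model roughly matching the question's setup (224x224 images, SGD).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),   # assumed 4 ID types
])
model.compile(optimizer=tf.keras.optimizers.SGD(),
              loss="sparse_categorical_crossentropy")

# Dummy data purely to make the example runnable end to end.
images = tf.random.uniform((256, 224, 224, 3))
labels = tf.random.uniform((256,), maxval=4, dtype=tf.int32)

# batch_size drives per-step GPU memory: 64 caused the OOM, so try 32 or 16.
model.fit(images, labels, batch_size=16, epochs=10, verbose=2)
```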


Reference:

ResourceExhaustedError: OOM when allocating tensor with shape - Stack Overflow
How does batch size affect model performance and training time? - Stack Overflow
How to choose an optimal batch size for training a neural network? - Stack Overflow



You have been asked to develop an input pipeline for an ML training model that processes images from disparate sources at low latency. You discover that your input data does not fit in memory. How should you create a dataset following Google-recommended best practices?

  1. Create a tf.data.Dataset.prefetch transformation
  2. Convert the images to tf.Tensor objects, and then run Dataset.from_tensor_slices().
  3. Convert the images to tf.Tensor objects, and then run tf.data.Dataset.from_tensors().
  4. Convert the images into TFRecords, store the images in Cloud Storage, and then use the tf.data API to read the images for training

Answer(s): D

Explanation:

An input pipeline is a way to prepare and feed data to a machine learning model for training or inference. It typically consists of several steps, such as reading, parsing, transforming, batching, and prefetching the data. A well-designed input pipeline improves the performance and efficiency of training: it can handle large and complex datasets, optimize the data processing, and reduce latency and memory usage [1].
For the use case of developing an input pipeline for an ML training model that processes images from disparate sources at low latency, the best option is to convert the images into TFRecords, store them in Cloud Storage, and then use the tf.data API to read them for training. This option involves the following components and techniques:
TFRecords: TFRecord is a binary file format that stores a sequence of data records, such as images, text, or audio. It helps compress, serialize, and store the data efficiently, reducing loading and parsing time, and it supports data sharding and interleaving, which improve throughput and parallelism [2].
Cloud Storage: Cloud Storage is a service for storing and accessing data on Google Cloud. It can hold large and distributed datasets, such as images from different sources, with high availability, durability, and scalability, and it integrates with other Google Cloud services such as Compute Engine, AI Platform, and Dataflow [3].
tf.data API: the tf.data API is a set of tools for building and manipulating data pipelines in TensorFlow. It reads, transforms, batches, and prefetches data efficiently, optimizes processing for performance and memory, and supports various data sources and formats, such as TFRecords, CSV, JSON, and images.
By combining these components, the input pipeline can stream large image datasets that do not fit in memory while providing low latency and high throughput for the ML training model. Therefore, converting the images into TFRecords, storing them in Cloud Storage, and using the tf.data API to read them for training is the best option for this use case.
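A minimal tf.data sketch of this pattern follows. The bucket path, feature names, and image format (JPEG) are assumptions for the example; the key point is that TFRecords in Cloud Storage are streamed and never need to fit in memory.

```python
import tensorflow as tf

TFRECORD_FILES = ["gs://your-bucket/images/train-00000.tfrecord"]   # hypothetical path

def serialize_example(image_bytes, label):
    """Pack one image/label pair into a tf.train.Example for writing to a TFRecord."""
    features = tf.train.Features(feature={
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    })
    return tf.train.Example(features=features).SerializeToString()

def parse_example(record):
    """Decode one serialized Example back into an (image, label) pair."""
    parsed = tf.io.parse_single_example(record, {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    })
    image = tf.image.resize(tf.io.decode_jpeg(parsed["image"], channels=3), (224, 224))
    return image, parsed["label"]

# Streaming read: records are pulled from Cloud Storage as needed, never all at once.
dataset = (
    tf.data.TFRecordDataset(TFRECORD_FILES)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)   # overlap preprocessing with training for low latency
)
```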


Reference:

Build TensorFlow input pipelines | TensorFlow Core
TFRecord and tf.Example | TensorFlow Core
Cloud Storage documentation | Google Cloud
tf.data: Build TensorFlow input pipelines | TensorFlow Core



You are building an ML model to detect anomalies in real-time sensor data. You will use Pub/Sub to handle incoming requests. You want to store the results for analytics and visualization. How should you configure the pipeline?

  1. 1 = Dataflow, 2 = AI Platform, 3 = BigQuery
  2. 1 = Dataproc, 2 = AutoML, 3 = Cloud Bigtable
  3. 1 = BigQuery, 2 = AutoML, 3 = Cloud Functions
  4. 1 = BigQuery, 2 = AI Platform, 3 = Cloud Storage

Answer(s): A

Explanation:

Dataflow is a fully managed service for executing Apache Beam pipelines that can process streaming or batch data [1].
AI Platform is a unified platform that enables you to build and run machine learning applications across Google Cloud [2].
BigQuery is a serverless, highly scalable, and cost-effective cloud data warehouse designed for business agility [3].
These services are suitable for building an ML model to detect anomalies in real-time sensor data, as they can handle large-scale data ingestion, preprocessing, training, serving, storage, and visualization. The other options are not as suitable because:
Dataproc is a service for running Apache Spark and Apache Hadoop clusters, which are not optimized for low-latency streaming data processing [4].
AutoML is a suite of machine learning products that enables developers with limited machine learning expertise to train high-quality models for their business needs [5]; however, it is not intended for building custom real-time anomaly-detection models in a streaming pipeline.
Cloud Bigtable is a scalable, fully managed NoSQL database service for large analytical and operational workloads; however, it is not designed for ad hoc SQL queries or interactive analysis and visualization.
Cloud Functions is a serverless execution environment for building and connecting cloud services; however, it is not suitable for storing or visualizing data.
Cloud Storage is a service for storing and accessing data on Google Cloud; however, it is not a data warehouse and does not support SQL queries or visualization tools.
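As an illustration of the Dataflow → AI Platform → BigQuery shape, here is a hedged Apache Beam (Python) sketch. The subscription, table, schema, and the score_anomaly() placeholder are hypothetical; in a real pipeline the scoring step would call the model served on AI Platform.

```python
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def score_anomaly(reading):
    # Placeholder: a real pipeline would call the AI Platform Prediction endpoint here.
    reading["anomaly_score"] = 0.0
    return reading

options = PipelineOptions(streaming=True)   # submit with --runner=DataflowRunner to run on Dataflow
with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadSensorData" >> beam.io.ReadFromPubSub(
            subscription="projects/your-project/subscriptions/sensor-readings")
        | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "ScoreWithModel" >> beam.Map(score_anomaly)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "your-project:sensors.anomaly_results",
            schema="sensor_id:STRING, value:FLOAT, anomaly_score:FLOAT",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )
```

Once the results land in BigQuery, they can be queried directly or visualized with tools such as Looker Studio, which is why BigQuery is the natural choice for the storage and analytics stage.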



You have a functioning end-to-end ML pipeline that involves tuning the hyperparameters of your ML model using AI Platform, and then using the best-tuned parameters for training. Hypertuning is taking longer than expected and is delaying the downstream processes. You want to speed up the tuning job without significantly compromising its effectiveness.
Which actions should you take? Choose 2 answers

  1. Decrease the number of parallel trials
  2. Decrease the range of floating-point values
  3. Set the early stopping parameter to TRUE
  4. Change the search algorithm from Bayesian search to random search.
  5. Decrease the maximum number of trials during subsequent training phases.

Answer(s): C,E

Explanation:

Hyperparameter tuning is the process of finding the values of a model's hyperparameters that yield the best performance. AI Platform provides a hyperparameter tuning service that can run multiple trials in parallel and use different search algorithms to find the best combination of hyperparameters. However, hyperparameter tuning can be time-consuming and costly, especially when the search space is large and each training run is expensive, so it is important to optimize the tuning job to reduce the time and resources it requires.

One way to speed up the tuning job is to set the early stopping parameter to TRUE. The tuning service will then automatically stop trials that are unlikely to perform well based on their intermediate results, which saves time and resources by avoiding unnecessary computation for unpromising trials. Early stopping is configured in the trainingInput.hyperparameters field of the training job request [1].

Another way to speed up the tuning job is to decrease the maximum number of trials during subsequent training phases. The tuning service will then use fewer trials to refine the search space after the initial phase, which reduces the time required for the job to converge to a good solution. The maximum number of trials is set in the trainingInput.hyperparameters.maxTrials field of the training job request [1]. A sketch of these configuration fields is shown below.

The other options are not effective ways to speed up the tuning job. Decreasing the number of parallel trials reduces the concurrency of the job and increases the overall wall-clock time. Decreasing the range of floating-point values narrows the search space and may miss good solutions. Changing the search algorithm from Bayesian search to random search reduces the efficiency of the search and may require more trials to find a good solution [1].


Reference:

Hyperparameter tuning overview
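For reference, a hedged sketch of the relevant trainingInput.hyperparameters fields follows; the metric name, parameter ranges, and trial counts are example values only, and the rest of the job request (scaleTier, packageUris, region, and so on) is omitted.

```python
# Fields passed as trainingInput.hyperparameters in an AI Platform training job request.
hyperparameters = {
    "goal": "MAXIMIZE",
    "hyperparameterMetricTag": "accuracy",       # metric reported by the trainer (example)
    "maxTrials": 20,                             # lower this in later phases to finish sooner
    "maxParallelTrials": 5,
    "enableTrialEarlyStopping": True,            # stop unpromising trials early
    "params": [
        {
            "parameterName": "learning_rate",    # example tunable parameter
            "type": "DOUBLE",
            "minValue": 0.0001,
            "maxValue": 0.1,
            "scaleType": "UNIT_LOG_SCALE",
        }
    ],
}
```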





