Amazon AWS Certified Machine Learning - Specialty Exam
AWS Certified Machine Learning - Specialty (MLS-C01) (Page 2)

Updated On: 1-Feb-2026

A large mobile network operating company is building a machine learning model to predict customers who are likely to unsubscribe from the service. The company plans to offer an incentive for these customers as the cost of churn is far greater than the cost of the incentive.

The model produces the following confusion matrix after evaluating on a test dataset of 100 customers:

[Confusion matrix image not reproduced in this text version.]

Based on the model evaluation results, why is this a viable model for production?

  1. The model is 86% accurate, and the cost incurred by the company as a result of false negatives is less than the cost of false positives.
  2. The precision of the model is 86%, which is less than the accuracy of the model.
  3. The model is 86% accurate, and the cost incurred by the company as a result of false positives is less than the cost of false negatives.
  4. The precision of the model is 86%, which is greater than the accuracy of the model.

Answer(s): C
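
The confusion matrix image is not reproduced above, so the sketch below uses hypothetical counts (chosen only to be consistent with 86% accuracy on 100 customers) and hypothetical per-error costs to illustrate the reasoning behind answer C: accuracy alone does not make the model viable; it is viable because the money wasted on unnecessary incentives (false positives) is far less than the revenue lost to missed churners (false negatives).

```python
# Hypothetical counts only -- the exam's confusion matrix is not reproduced here.
# Chosen so that (tp + tn) / total = 0.86 on 100 customers.
tp, fp, fn, tn = 10, 10, 4, 76

total = tp + fp + fn + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Hypothetical business costs: losing a churning customer (false negative)
# costs far more than handing an incentive to a loyal one (false positive).
cost_per_fn = 500  # revenue lost per missed churner
cost_per_fp = 50   # incentive wasted per wrongly targeted customer

false_negative_cost = fn * cost_per_fn
false_positive_cost = fp * cost_per_fp

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
print(f"false-positive cost={false_positive_cost}, false-negative cost={false_negative_cost}")
```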



A bank wants to launch a low-rate credit promotion campaign. The bank must identify which customers to target with the promotion and wants to make sure that each customer's full credit history is considered when an approval or denial decision is made.

The bank's data science team used the XGBoost algorithm to train a classification model based on account transaction features. The data science team deployed the model by using the Amazon SageMaker model hosting service. The accuracy of the model is sufficient, but the data science team wants to be able to explain why the model denies the promotion to some customers.

What should the data science team do to meet this requirement in the MOST operationally efficient manner?

  1. Create a SageMaker notebook instance. Upload the model artifact to the notebook. Use the plot_importance() method in the Python XGBoost interface to create a feature importance chart for the individual predictions.
  2. Retrain the model by using SageMaker Debugger. Configure Debugger to calculate and collect Shapley values. Create a chart that shows features and SHapley Additive exPlanations (SHAP) values to explain how the features affect the model outcomes.
  3. Set up and run an explainability job powered by SageMaker Clarify to analyze the individual customer data, using the training data as a baseline. Create a chart that shows features and SHapley Additive exPlanations (SHAP) values to explain how the features affect the model outcomes.
  4. Use SageMaker Model Monitor to create Shapley values that help explain model behavior. Store the Shapley values in Amazon S3. Create a chart that shows features and SHapley Additive exPlanations (SHAP) values to explain how the features affect the model outcomes.

Answer(s): C
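
A minimal sketch of what answer C could look like with the SageMaker Python SDK (v2). The role ARN, S3 paths, model name, label column, feature names, and SHAP baseline row are placeholders, not values from the scenario.

```python
# Minimal sketch, assuming the SageMaker Python SDK (v2). Role, bucket, paths,
# model name, label, features, and baseline values are placeholders.
from sagemaker import Session, clarify

session = Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

feature_names = ["avg_balance", "num_transactions", "credit_utilization"]  # placeholders
label_name = "promotion_approved"                                          # placeholder

clarify_processor = clarify.SageMakerClarifyProcessor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=session,
)

data_config = clarify.DataConfig(
    s3_data_input_path="s3://my-bucket/credit-promo/train.csv",   # placeholder
    s3_output_path="s3://my-bucket/credit-promo/clarify-output",  # placeholder
    label=label_name,
    headers=feature_names + [label_name],
    dataset_type="text/csv",
)

model_config = clarify.ModelConfig(
    model_name="xgboost-credit-promo",  # the already-deployed SageMaker model
    instance_type="ml.m5.xlarge",
    instance_count=1,
    accept_type="text/csv",
)

# SHAP baseline drawn from the training data, as the answer describes
# (here a single placeholder row, e.g. per-feature training means).
shap_config = clarify.SHAPConfig(
    baseline=[[1200.0, 35.0, 0.42]],
    num_samples=100,
    agg_method="mean_abs",
)

clarify_processor.run_explainability(
    data_config=data_config,
    model_config=model_config,
    explainability_config=shap_config,
)
```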



A data scientist is training a large PyTorch model by using Amazon SageMaker. It takes 10 hours on average to train the model on GPU instances. The data scientist suspects that training is not converging and that resource utilization is not optimal.

What should the data scientist do to identify and address training issues with the LEAST development effort?

  1. Use CPU utilization metrics that are captured in Amazon CloudWatch. Configure a CloudWatch alarm to stop the training job early if low CPU utilization occurs.
  2. Use high-resolution custom metrics that are captured in Amazon CloudWatch. Configure an AWS Lambda function to analyze the metrics and to stop the training job early if issues are detected.
  3. Use the SageMaker Debugger vanishing_gradient and LowGPUUtilization built-in rules to detect issues and to launch the StopTrainingJob action if issues are detected.
  4. Use the SageMaker Debugger confusion and feature_importance_overweight built-in rules to detect issues and to launch the StopTrainingJob action if issues are detected.

Answer(s): C


Reference:

https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-built-in-rules.html
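
A minimal sketch of answer C using the SageMaker Python SDK (v2): the two built-in rules are attached to the estimator, and a built-in StopTraining action is wired to the Debugger rule so the job is stopped automatically when the rule fires. The role ARN, training script, framework version, and S3 paths are placeholders; whether actions can also be attached to profiler rules depends on the SDK version, so the LowGPUUtilization rule is shown without one here.

```python
# Minimal sketch, assuming the SageMaker Python SDK (v2). Role, training script,
# framework version, and S3 paths are placeholders.
from sagemaker.debugger import ProfilerRule, Rule, rule_configs
from sagemaker.pytorch import PyTorch

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Built-in action: stop the training job automatically when the rule triggers.
stop_on_issue = rule_configs.ActionList(rule_configs.StopTraining())

rules = [
    # Debugger rule on saved tensors: gradients shrinking toward zero
    # suggest the training is not converging.
    Rule.sagemaker(rule_configs.vanishing_gradient(), actions=stop_on_issue),
    # Profiler rule on system metrics: flags underutilized GPUs.
    # (Attaching actions to profiler rules may depend on the SDK version.)
    ProfilerRule.sagemaker(rule_configs.LowGPUUtilization()),
]

estimator = PyTorch(
    entry_point="train.py",        # placeholder training script
    role=role,
    framework_version="1.13",
    py_version="py39",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    rules=rules,
)

estimator.fit({"training": "s3://my-bucket/pytorch-training-data"})  # placeholder
```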



A data scientist at a food production company wants to use an Amazon SageMaker built-in model to classify different vegetables. The current dataset has many features. The company wants to save on memory costs when the data scientist trains and deploys the model. The company also wants to be able to find similar data points for each test data point.

Which algorithm will meet these requirements?

  1. K-nearest neighbors (k-NN) with dimension reduction
  2. Linear learner with early stopping
  3. K-means
  4. Principal component analysis (PCA) with the algorithm mode set to random

Answer(s): A
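
A minimal sketch of answer A using the SageMaker built-in k-NN algorithm through the Python SDK (v2). The role ARN, hyperparameter values, and the randomly generated training data are placeholders; the relevant part is the dimension_reduction_type / dimension_reduction_target hyperparameters, which shrink the feature space to save memory while the k-NN index still returns the most similar training points for each query.

```python
# Minimal sketch, assuming the SageMaker Python SDK (v2). Role, hyperparameter
# values, and the randomly generated data are placeholders.
import numpy as np
from sagemaker import KNN

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

knn = KNN(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    k=10,                             # neighbors consulted per prediction
    sample_size=5000,                 # training samples kept in the index
    predictor_type="classifier",      # vegetable classes
    dimension_reduction_type="sign",  # random-projection dimension reduction
    dimension_reduction_target=50,    # project the many features down to 50
)

# Placeholder data: 5,000 samples with 200 features and 5 vegetable classes.
features = np.random.rand(5000, 200).astype("float32")
labels = np.random.randint(0, 5, size=5000).astype("float32")

# record_set() packages in-memory arrays into the RecordIO-protobuf format
# that the built-in algorithm expects and stages them in S3.
train_records = knn.record_set(features, labels=labels, channel="train")
knn.fit(train_records)
```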



A retail company wants to create a system that can predict sales based on the price of an item. A machine learning (ML) engineer built an initial linear model that resulted in the following residual plot:

[Residual plot image not reproduced in this text version.]

Which actions should the ML engineer take to improve the accuracy of the predictions in the next phase of model building? (Choose three.)

  1. Downsample the data uniformly to reduce the amount of data.
  2. Create two different models for different sections of the data.
  3. Downsample the data in sections where Price < 50.
  4. Offset the input data by a constant value where Price > 50.
  5. Examine the input data, and apply non-linear data transformations where appropriate.
  6. Use a non-linear model instead of a linear model.

Answer(s): B, E, F
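
An illustrative sketch of answers E and F on synthetic data (the residual plot itself is not reproduced above): a plain linear fit leaves structured residuals when the true price-sales relationship is non-linear, while a log transformation of the input or a non-linear model removes most of that structure. Answer B (fitting separate models above and below Price = 50) follows the same idea and is not shown. All data and parameter values here are made up for illustration.

```python
# Illustrative sketch only: synthetic data with a non-linear price-sales relationship.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
price = rng.uniform(1, 100, size=1000).reshape(-1, 1)
# Synthetic sales that fall off non-linearly as price rises, plus noise.
sales = 500 / (1 + 0.05 * price.ravel()) + rng.normal(0, 5, size=1000)

# Plain linear model: residuals show a curved pattern rather than random scatter.
linear = LinearRegression().fit(price, sales)
resid_linear = sales - linear.predict(price)

# Option E: non-linear transformation of the input, then a linear fit.
log_price = np.log(price)
linear_log = LinearRegression().fit(log_price, sales)
resid_log = sales - linear_log.predict(log_price)

# Option F: a non-linear model fit directly on the raw input.
gbr = GradientBoostingRegressor(random_state=0).fit(price, sales)
resid_gbr = sales - gbr.predict(price)

for name, resid in [("linear", resid_linear),
                    ("log-transformed input", resid_log),
                    ("non-linear model", resid_gbr)]:
    print(f"{name:>22}: RMSE = {np.sqrt(np.mean(resid ** 2)):.2f}")
```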


