Amazon MLS-C01 Exam
AWS Certified Machine Learning - Specialty (MLS-C01) (Page 9)

Updated On: 1-Feb-2026

An online advertising company is developing a linear model to predict the bid price of advertisements in real time with low-latency predictions. A data scientist has trained the linear model by using many features, but the model is overfitting the training dataset. The data scientist needs to prevent overfitting and must reduce the number of features.

Which solution will meet these requirements?

  A. Retrain the model with L1 regularization applied.
  B. Retrain the model with L2 regularization applied.
  C. Retrain the model with dropout regularization applied.
  D. Retrain the model by using more data.

Answer(s): A
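For study purposes, a minimal sketch of why L1 (and not L2) satisfies the "reduce the number of features" requirement: L1 regularization drives many coefficients exactly to zero, while L2 only shrinks them. The synthetic data, feature counts, and alpha value below are illustrative assumptions, not part of the question.

```python
# Illustrative only: L1 (Lasso) zeroes out irrelevant features; L2 (Ridge) does not.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
# Only the first 3 of 20 features actually influence the target (assumption).
y = 3 * X[:, 0] - 2 * X[:, 1] + X[:, 2] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty
ridge = Ridge(alpha=0.1).fit(X, y)  # L2 penalty

# L1 sets most of the 17 irrelevant coefficients exactly to zero,
# which is an implicit form of feature selection.
print("Lasso zero coefficients:", int(np.sum(lasso.coef_ == 0)))
print("Ridge zero coefficients:", int(np.sum(ridge.coef_ == 0)))
```

The zeroed Lasso coefficients correspond to features that can be dropped entirely, which both regularizes the model and reduces its input dimensionality for low-latency inference.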



A machine learning (ML) specialist is developing a deep learning sentiment analysis model that is based on data from movie reviews. After the ML specialist trains the model and reviews the model results on the validation set, the ML specialist discovers that the model is overfitting.

Which solutions will MOST improve the model generalization and reduce overfitting? (Choose three.)

  A. Shuffle the dataset with a different seed.
  B. Decrease the learning rate.
  C. Increase the number of layers in the network.
  D. Add L1 regularization and L2 regularization.
  E. Add dropout.
  F. Decrease the number of layers in the network.

Answer(s): D,E,F
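As a study aid for option E, a minimal NumPy sketch of inverted dropout, the variant used by modern deep learning frameworks: during training a random fraction of activations is zeroed and the survivors are rescaled so the expected activation is unchanged, and at inference the layer is a pass-through. The array sizes and rate are illustrative assumptions.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: during training, zero a fraction `rate` of
    activations and rescale the rest by 1/(1 - rate) so the expected
    value is unchanged; at inference, pass activations through as-is."""
    if not training:
        return x
    keep_mask = (rng.random(x.shape) >= rate).astype(x.dtype)
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
activations = np.ones((1000, 100))
out = dropout(activations, rate=0.5, training=True, rng=rng)

print("fraction zeroed:", np.mean(out == 0))  # close to 0.5
print("mean activation:", out.mean())         # close to 1.0 in expectation
```

Because each forward pass trains a different random sub-network, dropout discourages co-adaptation of units, which is why it improves generalization on an overfitting model like the one in this question.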



An ecommerce company is collecting structured data and unstructured data from its website, mobile apps, and IoT devices. The data is stored in several databases and Amazon S3 buckets. The company is implementing a scalable repository to store structured data and unstructured data. The company must implement a solution that provides a central data catalog, self-service access to the data, and granular data access policies and encryption to protect the data.

Which combination of actions will meet these requirements with the LEAST amount of setup? (Choose three.)

  A. Identify the existing data in the databases and S3 buckets. Link the data to AWS Lake Formation.
  B. Identify the existing data in the databases and S3 buckets. Link the data to AWS Glue.
  C. Run AWS Glue crawlers on the linked data sources to create a central data catalog.
  D. Apply granular access policies by using AWS Identity and Access Management (IAM). Configure server-side encryption on each data source.
  E. Apply granular access policies and encryption by using AWS Lake Formation.
  F. Apply granular access policies and encryption by using AWS Glue.

Answer(s): A,C,E
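For context on option E, Lake Formation expresses granular access as grants against Data Catalog resources rather than per-bucket IAM policies, which is why it needs less setup. A hypothetical request body for the Lake Formation GrantPermissions operation might look like the following; the account ID, role, database, and table names are made up for illustration.

```
{
  "Principal": {
    "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/AnalystRole"
  },
  "Resource": {
    "Table": {
      "DatabaseName": "sales_catalog",
      "Name": "orders"
    }
  },
  "Permissions": ["SELECT"]
}
```

A single grant like this covers the table wherever its data lives, instead of maintaining separate IAM policies and encryption settings per data source as option D would require.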



A retail company wants to build a recommendation system for the company's website. The system needs to provide recommendations for existing users and needs to base those recommendations on each user's past browsing history. The system also must filter out any items that the user previously purchased.

Which solution will meet these requirements with the LEAST development effort?

  A. Train a model by using a user-based collaborative filtering algorithm on Amazon SageMaker. Host the model on a SageMaker real-time endpoint. Configure an Amazon API Gateway API and an AWS Lambda function to handle real-time inference requests that the web application sends. Exclude the items that the user previously purchased from the results before sending the results back to the web application.
  B. Use an Amazon Personalize PERSONALIZED_RANKING recipe to train a model. Create a real-time filter to exclude items that the user previously purchased. Create and deploy a campaign on Amazon Personalize. Use the GetPersonalizedRanking API operation to get the real-time recommendations.
  C. Use an Amazon Personalize USER_PERSONALIZATION recipe to train a model. Create a real-time filter to exclude items that the user previously purchased. Create and deploy a campaign on Amazon Personalize. Use the GetRecommendations API operation to get the real-time recommendations.
  D. Train a neural collaborative filtering model on Amazon SageMaker by using GPU instances. Host the model on a SageMaker real-time endpoint. Configure an Amazon API Gateway API and an AWS Lambda function to handle real-time inference requests that the web application sends. Exclude the items that the user previously purchased from the results before sending the results back to the web application.

Answer(s): C
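For reference, the "real-time filter" in option C is declared as an Amazon Personalize filter expression. A sketch of such an expression is shown below; it assumes the interactions dataset records an event type named "purchase", which is an illustrative choice, not something stated in the question.

```
EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("purchase")
```

Attaching a filter like this to the GetRecommendations call removes previously purchased items on the Personalize side, which is what makes option C lower-effort than post-filtering results in a Lambda function as options A and D propose.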



A finance company needs to forecast the price of a commodity. The company has compiled a dataset of historical daily prices. A data scientist must train various forecasting models on 80% of the dataset and must validate the efficacy of those models on the remaining 20% of the dataset.

How should the data scientist split the dataset into a training dataset and a validation dataset to compare model performance?

  A. Pick a date so that 80% of the data points precede the date. Assign that group of data points as the training dataset. Assign all the remaining data points to the validation dataset.
  B. Pick a date so that 80% of the data points occur after the date. Assign that group of data points as the training dataset. Assign all the remaining data points to the validation dataset.
  C. Starting from the earliest date in the dataset, pick eight data points for the training dataset and two data points for the validation dataset. Repeat this stratified sampling until no data points remain.
  D. Sample data points randomly without replacement so that 80% of the data points are in the training dataset. Assign all the remaining data points to the validation dataset.

Answer(s): A


Reference:

https://towardsdatascience.com/time-series-from-scratch-train-test-splits-and-evaluation-metrics-4fd654de1b37
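As a study aid, option A can be sketched as a chronological split in pandas: sort by date, pick the cutoff date at the 80% mark, and keep everything before it for training. The synthetic price series below is a made-up stand-in for the company's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical daily price series standing in for the historical data.
dates = pd.date_range("2020-01-01", periods=1000, freq="D")
prices = pd.DataFrame(
    {"date": dates, "price": np.random.default_rng(1).random(1000)}
)

# Chronological split: everything before the cutoff date trains the model,
# everything from the cutoff onward validates it. No shuffling, so the
# model never sees "future" prices during training.
prices = prices.sort_values("date")
cutoff = prices["date"].iloc[int(len(prices) * 0.8)]

train = prices[prices["date"] < cutoff]
validation = prices[prices["date"] >= cutoff]

print(len(train), len(validation))                       # 800 200
print(train["date"].max() < validation["date"].min())    # True
```

Random sampling (option D) would leak future information into training, which is why the chronological split is required for time-series forecasting.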





