Free AWS Certified Machine Learning - Specialty Exam Braindumps (page: 44)

Page 44 of 84

A retail company uses a machine learning (ML) model for daily sales forecasting. The company’s brand manager reports that the model has provided inaccurate results for the past 3 weeks.

At the end of each day, an AWS Glue job consolidates the input data that is used for the forecasting with the actual daily sales data and the predictions of the model. The AWS Glue job stores the data in Amazon S3. The company’s ML team is using an Amazon SageMaker Studio notebook to understand the source of the model's inaccuracies.

What should the ML team do in the SageMaker Studio notebook to visualize the model's degradation MOST accurately?

  A. Create a histogram of the daily sales over the last 3 weeks. In addition, create a histogram of the daily sales from before that period.
  B. Create a histogram of the model errors over the last 3 weeks. In addition, create a histogram of the model errors from before that period.
  C. Create a line chart with the weekly mean absolute error (MAE) of the model.
  D. Create a scatter plot of daily sales versus model error for the last 3 weeks. In addition, create a scatter plot of daily sales versus model error from before that period.

Answer(s): C


Reference:

https://machinelearningmastery.com/time-series-forecasting-performance-measures-with-python/
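
A minimal sketch of how the weekly mean absolute error (MAE) series for this question could be computed before plotting. The record layout `(day, actual, predicted)`, the function name `weekly_mae`, and the synthetic data are assumptions for illustration; the question does not specify the schema that the AWS Glue job writes to Amazon S3.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_mae(records):
    """Group (day, actual, predicted) records by week and return
    [(week_start, mae), ...] sorted by week start (Monday)."""
    abs_errors = defaultdict(list)
    for day, actual, predicted in records:
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        abs_errors[week_start].append(abs(actual - predicted))
    return [(week, sum(errs) / len(errs)) for week, errs in sorted(abs_errors.items())]


# Hypothetical data: 6 weeks of daily sales, with the prediction error
# growing steadily during the last 3 weeks.
start = date(2024, 1, 1)  # a Monday
records = []
for i in range(42):
    actual = 100.0
    error = 2.0 if i < 21 else 2.0 + (i - 20)
    records.append((start + timedelta(days=i), actual, actual + error))

series = weekly_mae(records)
for week, mae in series:
    print(week.isoformat(), round(mae, 2))
```

The resulting `series` can be passed to any plotting library available in the notebook to draw the line chart. One point per week over a long window makes the onset and trend of the degradation visible, which two side-by-side histograms would not show.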



An ecommerce company sends a weekly email newsletter to all of its customers. Management has hired a team of writers to create additional targeted content. A data scientist needs to identify five customer segments based on age, income, and location. The customers’ current segmentation is unknown. The data scientist previously built an XGBoost model to predict the likelihood of a customer responding to an email based on age, income, and location.

Why does the XGBoost model NOT meet the current requirements, and how can this be fixed?

  A. The XGBoost model provides a true/false binary output. Apply principal component analysis (PCA) with five feature dimensions to predict a segment.
  B. The XGBoost model provides a true/false binary output. Increase the number of classes the XGBoost model predicts to five classes to predict a segment.
  C. The XGBoost model is a supervised machine learning algorithm. Train a k-Nearest-Neighbors (kNN) model with K = 5 on the same dataset to predict a segment.
  D. The XGBoost model is a supervised machine learning algorithm. Train a k-means model with K = 5 on the same dataset to predict a segment.

Answer(s): D
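
Because no segment labels exist, the task is unsupervised clustering. The sketch below is a plain-Python version of Lloyd's k-means algorithm with K = 5, not the Amazon SageMaker k-means implementation. The function name `kmeans`, the synthetic points, and the assumption that age, income, and location have already been converted to scaled numeric features are all illustrative.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k random points
    labels = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if the cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Hypothetical, already-scaled features: (age, income, location_code).
data_rng = random.Random(1)
centers = [(0, 0, 0), (5, 0, 0), (0, 5, 0), (0, 0, 5), (5, 5, 5)]
points = [
    tuple(c + data_rng.gauss(0, 0.1) for c in center)
    for center in centers
    for _ in range(20)
]

centroids, labels = kmeans(points, k=5)
print(sorted(set(labels)))
```

Each customer receives a segment index from 0 to 4 without any labelled training data, which is what distinguishes this approach from the supervised XGBoost and kNN options.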



A global financial company is using machine learning to automate its loan approval process. The company has a dataset of customer information. The dataset contains some categorical fields, such as customer location by city and housing status. The dataset also includes financial fields in different units, such as account balances in US dollars and monthly interest in US cents.

The company’s data scientists are using a gradient boosting regression model to infer the credit score for each customer. The model has a training accuracy of 99% and a testing accuracy of 75%. The data scientists want to improve the model’s testing accuracy.

Which process will improve the testing accuracy the MOST?

  A. Use a one-hot encoder for the categorical fields in the dataset. Perform standardization on the financial fields in the dataset. Apply L1 regularization to the data.
  B. Use tokenization of the categorical fields in the dataset. Perform binning on the financial fields in the dataset. Remove the outliers in the data by using the z-score.
  C. Use a label encoder for the categorical fields in the dataset. Perform L1 regularization on the financial fields in the dataset. Apply L2 regularization to the data.
  D. Use a logarithm transformation on the categorical fields in the dataset. Perform binning on the financial fields in the dataset. Use imputation to populate missing values in the dataset.

Answer(s): A
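
A small sketch of the two preprocessing steps named in the answer, one-hot encoding and standardization, using only the Python standard library. The field names and values are hypothetical. L1 regularization is not shown because it is configured on the model rather than computed from the data; in XGBoost, for example, it is the `alpha` hyperparameter.

```python
from statistics import mean, pstdev


def one_hot(values):
    """Return (encoded_rows, categories) with one 0/1 column per category."""
    categories = sorted(set(values))
    encoded = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return encoded, categories


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


# Hypothetical columns: one categorical field and two financial fields
# recorded in different units (US dollars and US cents).
housing = ["own", "rent", "own", "mortgage"]
balance_usd = [1000.0, 2000.0, 3000.0, 4000.0]
interest_cents = [150.0, 300.0, 450.0, 600.0]

housing_encoded, housing_categories = one_hot(housing)
balance_z = standardize(balance_usd)
interest_z = standardize(interest_cents)

print(housing_categories)
print(housing_encoded[0])
print([round(z, 3) for z in balance_z])
```

After standardization the dollar and cent columns share the same scale, so the difference in units no longer affects how the features compare.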



A machine learning (ML) specialist needs to extract embedding vectors from a text series. The goal is to provide a ready-to-ingest feature space for a data scientist to develop downstream ML predictive models. The text consists of curated sentences in English. Many sentences use similar words but in different contexts.

There are questions and answers among the sentences, and the embedding space must differentiate between them.

Which options can produce the required embedding vectors that capture word context and sequential QA information? (Choose two.)

  A. Amazon SageMaker seq2seq algorithm
  B. Amazon SageMaker BlazingText algorithm in Skip-gram mode
  C. Amazon SageMaker Object2Vec algorithm
  D. Amazon SageMaker BlazingText algorithm in continuous bag-of-words (CBOW) mode
  E. Combination of the Amazon SageMaker BlazingText algorithm in Batch Skip-gram mode with a custom recurrent neural network (RNN)

Answer(s): A,C


Reference:

https://aws.amazon.com/blogs/machine-learning/create-a-word-pronunciation-sequence-to-sequence-model-using-amazon-sagemaker/
https://docs.aws.amazon.com/sagemaker/latest/dg/object2vec.html





