Free AWS Certified Machine Learning - Specialty Exam Braindumps (page: 31)

Page 31 of 84

A data scientist uses an Amazon SageMaker notebook instance to conduct data exploration and analysis. This requires certain Python packages that are not natively available on Amazon SageMaker to be installed on the notebook instance.

How can a machine learning specialist ensure that required packages are automatically available on the notebook instance for the data scientist to use?

  A. Install AWS Systems Manager Agent on the underlying Amazon EC2 instance and use Systems Manager Automation to execute the package installation commands.
  B. Create a Jupyter notebook file (.ipynb) with cells containing the package installation commands to execute and place the file under the /etc/init directory of each Amazon SageMaker notebook instance.
  C. Use the conda package manager from within the Jupyter notebook console to apply the necessary conda packages to the default kernel of the notebook.
  D. Create an Amazon SageMaker lifecycle configuration with package installation commands and assign the lifecycle configuration to the notebook instance.

Answer(s): D


Reference:

https://docs.aws.amazon.com/sagemaker/latest/dg/nbi-add-external.html
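A lifecycle configuration, as named in the correct answer, is a shell script that SageMaker runs when the notebook instance is created or started. The following is a minimal sketch, not a definitive implementation. It builds such a script and shows how it might be registered with boto3. The package names, the `python3` conda environment, and the anaconda3 path are assumptions made for illustration. The boto3 call sits in a function that is not invoked here, because it needs AWS credentials.

```python
import base64

# Hypothetical package list; replace with the packages actually required.
PACKAGES = ["lightgbm", "shap"]


def build_on_start_script(packages):
    """Return a shell script that pip-installs packages into a conda env.

    Assumes the notebook instance exposes a conda environment named
    'python3' under /home/ec2-user/anaconda3.
    """
    return "\n".join([
        "#!/bin/bash",
        "set -e",
        "sudo -u ec2-user -i <<'EOF'",
        "source /home/ec2-user/anaconda3/bin/activate python3",
        "pip install " + " ".join(packages),
        "source /home/ec2-user/anaconda3/bin/deactivate",
        "EOF",
        "",
    ])


def encode_script(script):
    """Lifecycle configuration content is passed as base64-encoded text."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def register_lifecycle_config(config_name, script):
    """Create the lifecycle configuration (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above run without AWS access

    client = boto3.client("sagemaker")
    return client.create_notebook_instance_lifecycle_config(
        NotebookInstanceLifecycleConfigName=config_name,
        OnStart=[{"Content": encode_script(script)}],
    )


script = build_on_start_script(PACKAGES)
encoded = encode_script(script)
```

Once created, the configuration is assigned to the notebook instance, for example through the `LifecycleConfigName` parameter of `update_notebook_instance` while the instance is stopped. The script then runs on each start, so the packages are present without manual steps.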



A data scientist needs to identify fraudulent user accounts for a company's ecommerce platform. The company wants the ability to determine if a newly created account is associated with a previously known fraudulent user. The data scientist is using AWS Glue to cleanse the company's application logs during ingestion.

Which strategy will allow the data scientist to identify fraudulent accounts?

  A. Execute the built-in FindDuplicates Amazon Athena query.
  B. Create a FindMatches machine learning transform in AWS Glue.
  C. Create an AWS Glue crawler to infer duplicate accounts in the source data.
  D. Search for duplicate accounts in the AWS Glue Data Catalog.

Answer(s): B


Reference:

https://docs.aws.amazon.com/glue/latest/dg/machine-learning.html
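As a hedged illustration of the FindMatches option, the sketch below assembles the arguments for the Glue `create_ml_transform` API with the `FIND_MATCHES` transform type. The transform name, database, table, primary key column, and role ARN are all hypothetical placeholders. The boto3 call is wrapped in a function that is not invoked here, because it needs AWS credentials.

```python
def find_matches_request(name, database, table, primary_key, role_arn):
    """Build keyword arguments for Glue's create_ml_transform call."""
    return {
        "Name": name,
        "Role": role_arn,
        "InputRecordTables": [
            {"DatabaseName": database, "TableName": table},
        ],
        "Parameters": {
            "TransformType": "FIND_MATCHES",
            "FindMatchesParameters": {
                "PrimaryKeyColumnName": primary_key,
                # 0.5 expresses no preference between precision and recall;
                # values toward 0.0 favor recall, toward 1.0 favor precision.
                "PrecisionRecallTradeoff": 0.5,
            },
        },
    }


def create_find_matches_transform(request):
    """Submit the request (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder runs without AWS access

    return boto3.client("glue").create_ml_transform(**request)


# All identifiers below are placeholders, not values from the question.
request = find_matches_request(
    name="account-matching",
    database="ecommerce_logs",
    table="accounts",
    primary_key="account_id",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
)
```

A FindMatches transform generally has to be taught with labeled examples of matching and non-matching records before it gives useful results, so creating the transform is only the first step.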



A Data Scientist is developing a machine learning model to classify whether a financial transaction is fraudulent. The labeled data available for training consists of 100,000 non-fraudulent observations and 1,000 fraudulent observations.

The Data Scientist applies the XGBoost algorithm to the data, resulting in the following confusion matrix when the trained model is applied to a previously unseen validation dataset. The accuracy of the model is 99.1%, but the Data Scientist needs to reduce the number of false negatives.


Which combination of steps should the Data Scientist take to reduce the number of false negative predictions by the model? (Choose two.)

  A. Change the XGBoost eval_metric parameter to optimize based on Root Mean Square Error (RMSE).
  B. Increase the XGBoost scale_pos_weight parameter to adjust the balance of positive and negative weights.
  C. Increase the XGBoost max_depth parameter because the model is currently underfitting the data.
  D. Change the XGBoost eval_metric parameter to optimize based on Area Under the ROC Curve (AUC).
  E. Decrease the XGBoost max_depth parameter because the model is currently overfitting the data.

Answer(s): B,D
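The arithmetic behind this answer can be sketched from the class counts given in the question. With 100,000 negatives and 1,000 positives, a model that always predicts "not fraudulent" would already score about 99.0% accuracy on data with the same class ratio, which is why 99.1% says little about false negatives. A common starting value for scale_pos_weight is the ratio of negative to positive examples. The parameter dictionary below is illustrative only; the `binary:logistic` objective is an assumption, and no model is trained here.

```python
# Class counts taken from the question.
n_negative = 100_000
n_positive = 1_000

# Accuracy of a trivial model that never predicts fraud,
# assuming the evaluation data has the same class ratio.
majority_baseline_accuracy = n_negative / (n_negative + n_positive)

# Common heuristic: weight positives by the negative-to-positive ratio.
scale_pos_weight = n_negative / n_positive

# Illustrative XGBoost parameters (objective is an assumption).
xgb_params = {
    "objective": "binary:logistic",
    "eval_metric": "auc",                  # less misleading than accuracy here
    "scale_pos_weight": scale_pos_weight,  # 100.0 for these counts
}

print(f"baseline accuracy: {majority_baseline_accuracy:.4f}")
print(f"scale_pos_weight:  {scale_pos_weight}")
```

Raising scale_pos_weight makes errors on the fraudulent class cost more during training, which tends to trade some false positives for fewer false negatives.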



A data scientist has developed a machine learning translation model for English to Japanese by using Amazon SageMaker's built-in seq2seq algorithm with 500,000 aligned sentence pairs. While testing with sample sentences, the data scientist finds that the translation quality is reasonable for an example as short as five words. However, the quality becomes unacceptable if the sentence is 100 words long.

Which action will resolve the problem?

  A. Change preprocessing to use n-grams.
  B. Add more nodes to the recurrent neural network (RNN) than the largest sentence's word count.
  C. Adjust hyperparameters related to the attention mechanism.
  D. Choose a different weight initialization type.

Answer(s): C

Explanation:

Attention mechanism. The disadvantage of a plain encoder-decoder framework is that model performance degrades as the source sequence gets longer, because the fixed-length encoded feature vector can hold only a limited amount of information. To tackle this problem, in 2015, Bahdanau et al. proposed the attention mechanism. With attention, the decoder looks at each step for the positions in the encoder sequence that carry the most relevant information, and uses that information together with the previously decoded words to predict the next token in the sequence.


Reference:

https://docs.aws.amazon.com/sagemaker/latest/dg/seq-2-seq-howitworks.html
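The idea in the explanation above can be sketched in a few lines of plain Python. This is a simplified illustration, not SageMaker's implementation: it uses dot-product scoring in place of the additive scoring proposed by Bahdanau et al., and the vectors are made-up toy values. Instead of compressing the whole source sentence into one fixed vector, the decoder scores every encoder state and takes a weighted average.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Each encoder state is scored by its dot product with the decoder
    state; the context is the weighted average of the encoder states.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three source positions, two-dimensional states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
decoder_state = [0.0, 3.0]  # most similar to the second encoder state

weights, context = attend(decoder_state, encoder_states)
```

Because the context vector is recomputed at every decoding step, its capacity does not have to cover the whole sentence at once, which is why attention settings matter most for long inputs such as the 100-word example.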





