Free AWS Certified Machine Learning - Specialty Exam Braindumps (page: 47)

Page 47 of 84

A retail company collects customer comments about its products from social media, the company website, and customer call logs. A team of data scientists and engineers wants to find common topics and determine which products the customers are referring to in their comments. The team is using natural language processing (NLP) to build a model to help with this classification.
Each product can be classified into multiple categories that the company defines. These categories are related but are not mutually exclusive. For example, if there is mention of "Sample Yogurt" in the document of customer comments, then "Sample Yogurt" should be classified as "yogurt," "snack," and "dairy product."
The team is using Amazon Comprehend to train the model and must complete the project as soon as possible.
Which functionality of Amazon Comprehend should the team use to meet these requirements?

  A. Custom classification with multi-class mode
  B. Custom classification with multi-label mode
  C. Custom entity recognition
  D. Built-in models

Answer(s): B

Explanation:

In multi-label classification, individual classes represent different categories, but these categories are somehow related and are not mutually exclusive. As a result, each document has at least one class assigned to it, but can have more. For example, a movie can simply be an action movie, or it can be an action movie, a science fiction movie, and a comedy, all at the same time.

In multi-class classification, each document can have one and only one class assigned to it. The individual classes are mutually exclusive. For example, a movie can be classed as a documentary or as science fiction, but not both at the same time.


Reference:

https://docs.aws.amazon.com/comprehend/latest/dg/prep-classifier-data-multi-label.html
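As a rough illustration of the multi-label idea, the sketch below prepares training rows in which one document carries several labels at once. It assumes a two-column CSV layout (labels first, document text second) with `|` as the label delimiter. The helper names and the sample comments are hypothetical, and the exact file format should be confirmed against the referenced documentation.

```python
import csv
import io


def to_multilabel_rows(examples, delimiter="|"):
    """Join each document's labels into one delimited field.

    `examples` is a list of (labels, text) pairs. The delimiter is an
    assumption for this sketch, not a value taken from the question.
    """
    rows = []
    for labels, text in examples:
        if not labels:
            # Multi-label mode still expects at least one class per document.
            raise ValueError("each document needs at least one label")
        rows.append((delimiter.join(labels), text))
    return rows


def write_csv(rows):
    """Serialize the rows to CSV text (labels column, then document text)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical customer comments, using the categories from the question.
examples = [
    (["yogurt", "snack", "dairy product"], "I buy Sample Yogurt every week."),
    (["snack"], "The granola bars were stale."),
]

rows = to_multilabel_rows(examples)
training_csv = write_csv(rows)
print(training_csv)
```

In multi-class mode, by contrast, each row would carry exactly one label, so the first example could not be tagged as "yogurt," "snack," and "dairy product" together.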



A data engineer is using AWS Glue to create optimized, secure datasets in Amazon S3. The data science team wants the ability to access the ETL scripts directly from Amazon SageMaker notebooks within a VPC. After this setup is complete, the data science team wants the ability to run the AWS Glue job and invoke the SageMaker training job.
Which combination of steps should the data engineer take to meet these requirements? (Choose three.)

  A. Create a SageMaker development endpoint in the data science team's VPC.
  B. Create an AWS Glue development endpoint in the data science team's VPC.
  C. Create SageMaker notebooks by using the AWS Glue development endpoint.
  D. Create SageMaker notebooks by using the SageMaker console.
  E. Attach a decryption policy to the SageMaker notebooks.
  F. Create an IAM policy and an IAM role for the SageMaker notebooks.

Answer(s): B,C,F


Reference:

https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-sage.html
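The IAM policy and IAM role mentioned in the answer can be pictured as two JSON documents: a trust policy that lets SageMaker assume the role, and a permissions policy for the actions the notebook needs. The sketch below only builds those documents as Python dictionaries. The helper names and the specific action list are assumptions for illustration, and the wildcard resource would normally be narrowed to specific job ARNs.

```python
import json


def trust_policy(service_principal):
    """Build a trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def permissions_policy(actions, resource="*"):
    """Build a single-statement permissions policy.

    "*" is used only to keep the sketch short; it is broader than
    a real deployment should allow.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": sorted(actions), "Resource": resource}
        ],
    }


# Role assumed by the SageMaker notebook instance.
notebook_trust = trust_policy("sagemaker.amazonaws.com")

# Illustrative actions: run the Glue job and start a training job.
notebook_permissions = permissions_policy(
    ["glue:StartJobRun", "glue:GetJobRun", "sagemaker:CreateTrainingJob"]
)

print(json.dumps(notebook_trust, indent=2))
print(json.dumps(notebook_permissions, indent=2))
```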



A data engineer needs to provide a team of data scientists with the appropriate dataset to run machine learning training jobs. The data will be stored in Amazon S3. The data engineer is obtaining the data from an Amazon Redshift database and is using join queries to extract a single tabular dataset. A portion of the schema is as follows:

-TransactionTimestamp (Timestamp)
-CardName (Varchar)
-CardNo (Varchar)

The data engineer must provide the data so that any row with a CardNo value of NULL is removed. Also, the TransactionTimestamp column must be separated into a TransactionDate column and a TransactionTime column. Finally, the CardName column must be renamed to NameOnCard.

The data will be extracted on a monthly basis and will be loaded into an S3 bucket. The solution must minimize the effort that is needed to set up infrastructure for the ingestion and transformation. The solution also must be automated and must minimize the load on the Amazon Redshift cluster.

Which solution meets these requirements?

  A. Set up an Amazon EMR cluster. Create an Apache Spark job to read the data from the Amazon Redshift cluster and transform the data. Load the data into the S3 bucket. Schedule the job to run monthly.
  B. Set up an Amazon EC2 instance with a SQL client tool, such as SQL Workbench/J, to query the data from the Amazon Redshift cluster directly. Export the resulting dataset into a file. Upload the file into the S3 bucket. Perform these tasks monthly.
  C. Set up an AWS Glue job that has the Amazon Redshift cluster as the source and the S3 bucket as the destination. Use the built-in transforms Filter, Map, and RenameField to perform the required transformations. Schedule the job to run monthly.
  D. Use Amazon Redshift Spectrum to run a query that writes the data directly to the S3 bucket. Create an AWS Lambda function to run the query monthly.

Answer(s): C
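The three required transformations map onto the Filter, Map, and RenameField transforms named in the answer. The sketch below reproduces the same row-level logic in plain Python so it can run without an AWS Glue environment. It is not Glue code: the function names and sample rows are hypothetical, and a real job would apply the equivalent Glue transforms to a DynamicFrame.

```python
from datetime import datetime


def has_card_number(row):
    """Filter step: keep only rows whose CardNo is not NULL."""
    return row.get("CardNo") is not None


def split_timestamp(row):
    """Map step: replace TransactionTimestamp with separate date and time."""
    out = dict(row)
    ts = out.pop("TransactionTimestamp")
    out["TransactionDate"] = ts.date().isoformat()
    out["TransactionTime"] = ts.time().isoformat()
    return out


def rename_field(row, old, new):
    """RenameField step: move the value stored under `old` to `new`."""
    out = dict(row)
    out[new] = out.pop(old)
    return out


def transform(rows):
    """Apply the three steps in order: filter, split, rename."""
    kept = [r for r in rows if has_card_number(r)]
    split = [split_timestamp(r) for r in kept]
    return [rename_field(r, "CardName", "NameOnCard") for r in split]


# Hypothetical sample rows; the second has a NULL CardNo and is dropped.
rows = [
    {
        "TransactionTimestamp": datetime(2024, 1, 15, 9, 30, 0),
        "CardName": "A. Shopper",
        "CardNo": "4111",
    },
    {
        "TransactionTimestamp": datetime(2024, 1, 16, 18, 5, 42),
        "CardName": "B. Buyer",
        "CardNo": None,
    },
]

result = transform(rows)
print(result)
```

Running the transformations in the Glue job rather than in SQL keeps the work off the Amazon Redshift cluster, which is one of the stated requirements.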



A machine learning (ML) specialist wants to bring a custom training algorithm to Amazon SageMaker. The ML specialist implements the algorithm in a Docker container that is supported by SageMaker.

How should the ML specialist package the Docker container so that SageMaker can launch the training correctly?

  A. Specify the server argument in the ENTRYPOINT instruction in the Dockerfile.
  B. Specify the training program in the ENTRYPOINT instruction in the Dockerfile.
  C. Include the path to the training data in the docker build command when packaging the container.
  D. Use a COPY instruction in the Dockerfile to copy the training program to the /opt/ml/train directory.

Answer(s): B
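SageMaker starts a training container by running the image with the argument `train`, so the program named in ENTRYPOINT is what actually executes. A minimal sketch of such a training program follows. It assumes the conventional `/opt/ml` layout (hyperparameters under `input/config`, model artifacts under `model`). The script path in the comment, the `train` function, and the placeholder "model" are all hypothetical, and the demo runs against a temporary directory rather than `/opt/ml`.

```python
import json
import tempfile
from pathlib import Path

# In the Dockerfile, the training program would be named in ENTRYPOINT,
# for example (hypothetical path):
#   ENTRYPOINT ["python", "/opt/program/train.py"]


def train(base_dir):
    """Read hyperparameters, 'train', and write an artifact under model/.

    In the container, base_dir would be "/opt/ml".
    """
    base = Path(base_dir)
    hp_path = base / "input" / "config" / "hyperparameters.json"
    # Hyperparameter values arrive as strings in this file.
    hyperparameters = json.loads(hp_path.read_text()) if hp_path.exists() else {}

    model_dir = base / "model"
    model_dir.mkdir(parents=True, exist_ok=True)

    # Placeholder for real training: record the settings that were used.
    artifact = model_dir / "model.json"
    artifact.write_text(json.dumps({"hyperparameters": hyperparameters}))
    return artifact


# Demo against a temporary directory that mimics the /opt/ml layout.
with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp) / "input" / "config"
    config_dir.mkdir(parents=True)
    (config_dir / "hyperparameters.json").write_text(json.dumps({"epochs": "3"}))

    artifact = train(tmp)
    saved = json.loads(artifact.read_text())
    print(artifact.name, saved)
```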





