MLS-C01 Practice Test & Exam Questions (Page 13)

QUESTION: 172

A data scientist has a dataset of machine part images stored in Amazon Elastic File System (Amazon EFS). The data scientist needs to use Amazon SageMaker to create and train an image classification machine learning model based on this dataset. Because of budget and time constraints, management wants the data scientist to create and train a model with the least number of steps and integration work required.
How should the data scientist meet these requirements?

A. Mount the EFS file system to a SageMaker notebook and run a script that copies the data to an Amazon FSx for Lustre file system. Run the SageMaker training job with the FSx for Lustre file system as the data source.
B. Launch a transient Amazon EMR cluster. Configure steps to mount the EFS file system and copy the data to an Amazon S3 bucket by using S3DistCp. Run the SageMaker training job with Amazon S3 as the data source.
C. Mount the EFS file system to an Amazon EC2 instance and use the AWS CLI to copy the data to an Amazon S3 bucket. Run the SageMaker training job with Amazon S3 as the data source.
D. Run a SageMaker training job with an EFS file system as the data source.

Answer(s): D

Reference:

https://aws.amazon.com/blogs/machine-learning/speed-up-training-on-amazon-sagemaker-using-amazon-efs-or-amazon-fsx-for-lustre-file-systems/
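As a rough illustration of option D, a training job's input channel can point at the EFS file system directly, with no copy step. The sketch below only builds the `InputDataConfig` channel and the `VpcConfig` portion of a `CreateTrainingJob` request as plain dictionaries. The helper names, file system ID, directory path, subnet, and security group are hypothetical placeholders, not values from the question.

```python
def efs_training_channel(file_system_id, directory_path, channel_name="training"):
    """Build one InputDataConfig channel that reads from EFS in read-only mode."""
    return {
        "ChannelName": channel_name,
        "DataSource": {
            "FileSystemDataSource": {
                "FileSystemId": file_system_id,
                "FileSystemType": "EFS",       # "FSxLustre" is the other accepted value
                "FileSystemAccessMode": "ro",  # training only needs to read the images
                "DirectoryPath": directory_path,
            }
        },
    }


def vpc_config(subnet_ids, security_group_ids):
    """Build the VpcConfig block; the job must be able to reach the EFS mount targets."""
    return {
        "Subnets": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
    }


# Placeholder identifiers (assumptions for illustration only).
channel = efs_training_channel("fs-0123456789abcdef0", "/machine-parts/train")
network = vpc_config(["subnet-0example"], ["sg-0example"])
```

These two values would be passed as `InputDataConfig=[channel]` and `VpcConfig=network` to the SageMaker `create_training_job` call, alongside the other required fields (algorithm, role, output location, resources, and stopping condition).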

QUESTION: 171

A machine learning (ML) specialist wants to create a data preparation job that uses a PySpark script with complex window aggregation operations to create data for training and testing. The ML specialist needs to evaluate the impact of the number of features and the sample count on model performance.
Which approach should the ML specialist use to determine the ideal data transformations for the model?

A. Add an Amazon SageMaker Debugger hook to the script to capture key metrics. Run the script as an AWS Glue job.
B. Add an Amazon SageMaker Experiments tracker to the script to capture key metrics. Run the script as an AWS Glue job.
C. Add an Amazon SageMaker Debugger hook to the script to capture key parameters. Run the script as a SageMaker processing job.
D. Add an Amazon SageMaker Experiments tracker to the script to capture key parameters. Run the script as a SageMaker processing job.

Answer(s): D
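A minimal sketch of what option D could look like inside the preparation script, assuming the `sagemaker-experiments` package (module `smexperiments`) is installed in the processing container. The helper names and the feature names are hypothetical. In a real PySpark script the feature list and sample count would come from the DataFrame, for example `df.columns` and `df.count()`.

```python
def describe_dataset(feature_names, sample_count):
    """Collect the parameters whose effect on model performance is being evaluated."""
    return {
        "num_features": len(feature_names),
        "sample_count": int(sample_count),
    }


def log_to_experiments(parameters):
    """Record the parameters on the trial component of the running job.

    Assumption: smexperiments is available in the container. The import is
    done lazily so the rest of this sketch runs without the package.
    """
    from smexperiments.tracker import Tracker

    # Tracker.load() with no arguments resolves the trial component
    # from the environment of the running SageMaker job.
    with Tracker.load() as tracker:
        tracker.log_parameters(parameters)


# Hypothetical feature names and row count, for illustration only.
params = describe_dataset(["price", "qty", "rolling_mean_7d"], 120000)
```

Calling `log_to_experiments(params)` once per run of the processing job gives each data variant its own record, so runs with different feature counts and sample counts can be compared side by side.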

QUESTION: 170

A retail company is selling products through a global online marketplace. The company wants to use machine learning (ML) to analyze customer feedback and identify specific areas for improvement. A developer has built a tool that collects customer reviews from the online marketplace and stores them in an Amazon S3 bucket.

This process yields a dataset of 40 reviews. A data scientist building the ML models must identify additional sources of data to increase the size of the dataset.

Which data sources should the data scientist use to augment the dataset of reviews? (Choose three.)

A. Emails exchanged by customers and the company’s customer service agents
B. Social media posts containing the name of the company or its products
C. A publicly available collection of news articles
D. A publicly available collection of customer reviews
E. Product sales revenue figures for the company
F. Instruction manuals for the company’s products

Answer(s): A,B,D

Explanation:

A: Emails exchanged by customers and the company's customer service agents contain direct customer feedback and opinions about the products or services, so they can be used to improve the ML model.

B: Social media posts containing the name of the company or its products also carry customer feedback and opinions about the products or services.

D: A publicly available collection of customer reviews is the same kind of text as the existing dataset and increases its size, which can help improve the accuracy of the ML model.

QUESTION: 169

A company is building a new version of a recommendation engine. Machine learning (ML) specialists need to keep adding new data from users to improve personalized recommendations. The ML specialists gather data from the users’ interactions on the platform and from sources such as external websites and social media.

The pipeline cleans, transforms, enriches, and compresses terabytes of data daily, and this data is stored in Amazon S3. A set of Python scripts was coded to do the job and is stored in a large Amazon EC2 instance. The whole process takes more than 20 hours to finish, with each script taking at least an hour. The company wants to move the scripts out of Amazon EC2 into a more managed solution that will eliminate the need to maintain servers.

Which approach will address all of these requirements with the LEAST development effort?

A. Load the data into an Amazon Redshift cluster. Execute the pipeline by using SQL. Store the results in Amazon S3.
B. Load the data into Amazon DynamoDB. Convert the scripts to an AWS Lambda function. Execute the pipeline by triggering Lambda executions. Store the results in Amazon S3.
C. Create an AWS Glue job. Convert the scripts to PySpark. Execute the pipeline. Store the results in Amazon S3.
D. Create a set of individual AWS Lambda functions to execute each of the scripts. Build a step function by using the AWS Step Functions Data Science SDK. Store the results in Amazon S3.

Answer(s): C

Reference:

https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html
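One way to see why the Lambda-based options fall short is that each script runs for at least an hour, while a Lambda invocation can run for at most 15 minutes. The sketch below checks that arithmetic and then builds an illustrative request body for a Glue `create_job` call. The helper names, job name, role ARN, script location, worker type, and worker count are all hypothetical choices, not details from the question.

```python
LAMBDA_MAX_TIMEOUT_SECONDS = 15 * 60  # maximum configurable Lambda timeout


def fits_in_lambda(step_seconds):
    """Return True if a single step could finish within one Lambda invocation."""
    return step_seconds <= LAMBDA_MAX_TIMEOUT_SECONDS


def glue_job_request(name, role_arn, script_s3_uri, workers=10):
    """Build keyword arguments for a Glue Spark ETL job definition."""
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_s3_uri,
            "PythonVersion": "3",
        },
        "WorkerType": "G.1X",
        "NumberOfWorkers": workers,
    }


# Each script takes at least one hour, per the question.
shortest_script_seconds = 60 * 60
lambda_ok = fits_in_lambda(shortest_script_seconds)

# Placeholder names and ARNs (assumptions for illustration only).
job = glue_job_request(
    "example-recommendation-etl",
    "arn:aws:iam::111122223333:role/ExampleGlueRole",
    "s3://example-bucket/scripts/pipeline.py",
)
```

The dictionary would be unpacked into the Glue client call as `create_job(**job)`. Because Glue is serverless, no EC2 instance has to be maintained to run the converted scripts.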

QUESTION: 186

A data engineer is using AWS Glue to create optimized, secure datasets in Amazon S3. The data science team wants the ability to access the ETL scripts directly from Amazon SageMaker notebooks within a VPC. After this setup is complete, the data science team wants the ability to run the AWS Glue job and invoke the SageMaker training job.
Which combination of steps should the data engineer take to meet these requirements? (Choose three.)

A. Create a SageMaker development endpoint in the data science team's VPC.
B. Create an AWS Glue development endpoint in the data science team's VPC.
C. Create SageMaker notebooks by using the AWS Glue development endpoint.
D. Create SageMaker notebooks by using the SageMaker console.
E. Attach a decryption policy to the SageMaker notebooks.
F. Create an IAM policy and an IAM role for the SageMaker notebooks.

Answer(s): B,C,F

Reference:

https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-sage.html
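For step F, the role attached to the notebooks needs a trust policy that lets SageMaker assume it, plus permissions to start the Glue job and the training job. The sketch below builds both documents as Python dictionaries. The helper names and the job ARN are hypothetical, and the action list is a small illustrative subset rather than the full policy the linked tutorial describes.

```python
import json


def sagemaker_trust_policy():
    """Trust policy that allows the SageMaker service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def notebook_permissions(glue_job_arn):
    """Illustrative permissions: run one Glue job and launch training jobs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["glue:StartJobRun", "glue:GetJobRun"],
                "Resource": glue_job_arn,
            },
            {
                "Effect": "Allow",
                "Action": [
                    "sagemaker:CreateTrainingJob",
                    "sagemaker:DescribeTrainingJob",
                ],
                "Resource": "*",
            },
        ],
    }


# Placeholder ARN (an assumption for illustration only).
trust = sagemaker_trust_policy()
permissions = notebook_permissions(
    "arn:aws:glue:us-east-1:111122223333:job/example-etl-job"
)
trust_json = json.dumps(trust, indent=2)
```

The JSON strings produced this way can be supplied when creating the IAM role and policy. In practice the `Resource` entries should be narrowed to the specific jobs the team is allowed to run.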

Amazon MLS-C01 Exam
AWS Certified Machine Learning - Specialty (MLS-C01) (Page 13)
