Free DAS-C01 Exam Braindumps (page: 7)


Three teams of data analysts use Apache Hive on an Amazon EMR cluster with the EMR File System (EMRFS) to query data stored within each team's Amazon S3 bucket. The EMR cluster has Kerberos enabled and is configured to authenticate users from the corporate Active Directory. The data is highly sensitive, so access must be limited to the members of each team.
Which steps will satisfy the security requirements?

  A. For the EMR cluster Amazon EC2 instances, create a service role that grants no access to Amazon S3. Create three additional IAM roles, each granting access to one team's specific bucket. Add the additional IAM roles to the cluster's EMR role for the EC2 trust policy. Create a security configuration that maps the additional IAM roles to the Active Directory user groups for each team.
  B. For the EMR cluster Amazon EC2 instances, create a service role that grants no access to Amazon S3. Create three additional IAM roles, each granting access to one team's specific bucket. Add the service role for the EMR cluster EC2 instances to the trust policies for the additional IAM roles. Create a security configuration that maps the additional IAM roles to the Active Directory user groups for each team.
  C. For the EMR cluster Amazon EC2 instances, create a service role that grants full access to Amazon S3. Create three additional IAM roles, each granting access to one team's specific bucket. Add the service role for the EMR cluster EC2 instances to the trust policies for the additional IAM roles. Create a security configuration that maps the additional IAM roles to the Active Directory user groups for each team.
  D. For the EMR cluster Amazon EC2 instances, create a service role that grants full access to Amazon S3. Create three additional IAM roles, each granting access to one team's specific bucket. Add the service role for the EMR cluster EC2 instances to the trust policies for the base IAM roles. Create a security configuration that maps the additional IAM roles to the Active Directory user groups for each team.

Answer(s): B
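As a rough illustration of the keyed option (B), the following boto3 sketch creates an EMR security configuration that maps EMRFS requests from each Active Directory group to a team-specific IAM role; the account ID, role names, group names, and configuration name are placeholders.

    import json
    import boto3

    emr = boto3.client("emr")

    # Map each Active Directory group to the IAM role that grants access to
    # that team's S3 bucket (ARNs and group names are placeholders).
    security_configuration = {
        "AuthorizationConfiguration": {
            "EmrFsConfiguration": {
                "RoleMappings": [
                    {"Role": "arn:aws:iam::111122223333:role/TeamA-S3-Access",
                     "IdentifierType": "Group",
                     "Identifiers": ["TeamA"]},
                    {"Role": "arn:aws:iam::111122223333:role/TeamB-S3-Access",
                     "IdentifierType": "Group",
                     "Identifiers": ["TeamB"]},
                    {"Role": "arn:aws:iam::111122223333:role/TeamC-S3-Access",
                     "IdentifierType": "Group",
                     "Identifiers": ["TeamC"]},
                ]
            }
        }
    }

    emr.create_security_configuration(
        Name="team-bucket-isolation",
        SecurityConfiguration=json.dumps(security_configuration),
    )

As option B states, each team role must also list the service role for the cluster's EC2 instances in its trust policy so that EMRFS can assume the role on behalf of the mapped group.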



A company is planning to create a data lake in Amazon S3. The company wants to create tiered storage based on access patterns and cost objectives. The solution must include support for JDBC connections from legacy clients, metadata management that allows federation for access control, and batch-based ETL using PySpark and Scala. Operational management should be limited. Which combination of components can meet these requirements? (Choose three.)

  A. AWS Glue Data Catalog for metadata management
  B. Amazon EMR with Apache Spark for ETL
  C. AWS Glue for Scala-based ETL
  D. Amazon EMR with Apache Hive for JDBC clients
  E. Amazon Athena for querying data in Amazon S3 using JDBC drivers
  F. Amazon EMR with Apache Hive, using a metastore backed by Amazon RDS for MySQL

Answer(s): A, C, E


Reference:

https://d1.awsstatic.com/whitepapers/Storage/data-lake-on-aws.pdf
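For the PySpark- or Scala-based ETL component (option C), a minimal AWS Glue job sketch along these lines reads a table registered in the Data Catalog and writes Parquet back to S3; the database, table, and output path are assumed names, and the same job could equally be authored in Scala.

    import sys
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the source table registered in the Glue Data Catalog
    # (database and table names are placeholders).
    source = glue_context.create_dynamic_frame.from_catalog(
        database="datalake_raw",
        table_name="events",
    )

    # Write the curated output as Parquet into the processed tier of the lake.
    glue_context.write_dynamic_frame.from_options(
        frame=source,
        connection_type="s3",
        connection_options={"path": "s3://example-datalake/processed/events/"},
        format="parquet",
    )

    job.commit()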



A company wants to optimize the cost of its data and analytics platform. The company is ingesting a number of .csv and JSON files into Amazon S3 from various data sources. Incoming data is expected to be 50 GB each day. The company is using Amazon Athena to query the raw data in Amazon S3 directly. Most queries aggregate data from the past 12 months, and data that is older than 5 years is infrequently queried. The typical query scans about 500 MB of data and is expected to return results in less than 1 minute. The raw data must be retained indefinitely for compliance requirements.
Which solution meets the company's requirements?

  A. Use an AWS Glue ETL job to compress, partition, and convert the data into a columnar data format. Use Athena to query the processed dataset. Configure a lifecycle policy to move the processed data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after object creation. Configure a second lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after object creation.
  B. Use an AWS Glue ETL job to partition and convert the data into a row-based data format. Use Athena to query the processed dataset. Configure a lifecycle policy to move the data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after object creation. Configure a second lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after object creation.
  C. Use an AWS Glue ETL job to compress, partition, and convert the data into a columnar data format. Use Athena to query the processed dataset. Configure a lifecycle policy to move the processed data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after the object was last accessed. Configure a second lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after the last date the object was accessed.
  D. Use an AWS Glue ETL job to partition and convert the data into a row-based data format. Use Athena to query the processed dataset. Configure a lifecycle policy to move the data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after the object was last accessed. Configure a second lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after the last date the object was accessed.

Answer(s): A
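The lifecycle half of option A could look roughly like the boto3 call below, assuming the raw and processed datasets live under separate prefixes of the same bucket (bucket and prefix names are placeholders); 1,825 days approximates the 5-year mark.

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_lifecycle_configuration(
        Bucket="example-datalake",
        LifecycleConfiguration={
            "Rules": [
                {
                    # Processed (columnar) data: move to S3 Standard-IA
                    # roughly 5 years (1825 days) after object creation.
                    "ID": "processed-to-standard-ia",
                    "Filter": {"Prefix": "processed/"},
                    "Status": "Enabled",
                    "Transitions": [{"Days": 1825, "StorageClass": "STANDARD_IA"}],
                },
                {
                    # Raw data: archive to S3 Glacier 7 days after creation
                    # and keep it indefinitely for compliance.
                    "ID": "raw-to-glacier",
                    "Filter": {"Prefix": "raw/"},
                    "Status": "Enabled",
                    "Transitions": [{"Days": 7, "StorageClass": "GLACIER"}],
                },
            ]
        },
    )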



An energy company collects voltage data in real time from sensors that are attached to buildings. The company wants to receive notifications when a sequence of two voltage drops is detected within 10 minutes of a sudden voltage increase at the same building. All notifications must be delivered as quickly as possible. The system must be highly available. The company needs a solution that will automatically scale when this monitoring feature is implemented in other cities. The notification system is subscribed to an Amazon Simple Notification Service (Amazon SNS) topic for remediation.
Which solution will meet these requirements?

  A. Create an Amazon Managed Streaming for Apache Kafka cluster to ingest the data. Use Spark Streaming with the Apache Kafka consumer API in an automatically scaled Amazon EMR cluster to process the incoming data. Use the Spark Streaming application to detect the known event sequence and send the SNS message.
  B. Create a REST-based web service by using Amazon API Gateway in front of an AWS Lambda function. Create an Amazon RDS for PostgreSQL database with sufficient Provisioned IOPS to meet current demand. Configure the Lambda function to store incoming events in the RDS for PostgreSQL database, query the latest data to detect the known event sequence, and send the SNS message.
  C. Create an Amazon Kinesis Data Firehose delivery stream to capture the incoming sensor data. Use an AWS Lambda transformation function to detect the known event sequence and send the SNS message.
  D. Create an Amazon Kinesis data stream to capture the incoming sensor data. Create another stream for notifications. Set up AWS Application Auto Scaling on both streams. Create an Amazon Kinesis Data Analytics for Java application to detect the known event sequence, and add a message to the message stream. Configure an AWS Lambda function to poll the message stream and publish to the SNS topic.

Answer(s): C


Reference:

https://aws.amazon.com/kinesis/data-streams/faqs/
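To show the mechanics behind the keyed option (C), a Kinesis Data Firehose transformation Lambda has roughly the shape below. The topic ARN, environment variable, and record fields are assumptions, and real detection of a spike followed by two drops within 10 minutes would need per-building state that outlives a single invocation (for example, in DynamoDB), which this sketch only hints at.

    import base64
    import json
    import os

    import boto3

    sns = boto3.client("sns")
    TOPIC_ARN = os.environ["SNS_TOPIC_ARN"]  # assumed environment variable


    def lambda_handler(event, context):
        """Firehose data-transformation handler: inspect each record and
        publish an SNS alert when the voltage-drop pattern is suspected."""
        output = []
        for record in event["records"]:
            payload = json.loads(base64.b64decode(record["data"]))

            # Placeholder check: a real detector would track per-building
            # state (a voltage spike followed by two drops within 10 minutes)
            # in an external store such as DynamoDB.
            if payload.get("event_type") == "voltage_drop_sequence":
                sns.publish(
                    TopicArn=TOPIC_ARN,
                    Message=json.dumps({"building_id": payload.get("building_id"),
                                        "detail": payload}),
                )

            # Return every record unchanged so Firehose can still deliver it.
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": record["data"],
            })

        return {"records": output}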





