Free AWS Certified Data Engineer - Associate DEA-C01 Exam Braindumps (page: 15)


A company stores details about transactions in an Amazon S3 bucket. The company wants to log all writes to the S3 bucket into another S3 bucket that is in the same AWS Region.
Which solution will meet this requirement with the LEAST operational effort?

  A. Configure an S3 Event Notifications rule for all activities on the transactions S3 bucket to invoke an AWS Lambda function. Program the Lambda function to write the event to Amazon Kinesis Data Firehose. Configure Kinesis Data Firehose to write the event to the logs S3 bucket.
  B. Create a trail of management events in AWS CloudTrail. Configure the trail to receive data from the transactions S3 bucket. Specify an empty prefix and write-only events. Specify the logs S3 bucket as the destination bucket.
  C. Configure an S3 Event Notifications rule for all activities on the transactions S3 bucket to invoke an AWS Lambda function. Program the Lambda function to write the events to the logs S3 bucket.
  D. Create a trail of data events in AWS CloudTrail. Configure the trail to receive data from the transactions S3 bucket. Specify an empty prefix and write-only events. Specify the logs S3 bucket as the destination bucket.

Answer(s): D
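
Explanation: Option D is correct because CloudTrail data events record object-level S3 API calls (for example, PutObject), whereas management events do not. A trail is fully managed, so it involves less operational effort than an event-driven Lambda or Firehose pipeline. A minimal sketch of option D with boto3, assuming placeholder trail and bucket names (not from the question):

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Create a trail that delivers its log files to the logs bucket.
cloudtrail.create_trail(
    Name="transactions-write-trail",      # hypothetical trail name
    S3BucketName="example-logs-bucket",   # hypothetical logs bucket
)

# Record only write data events for every object in the transactions
# bucket (an empty prefix matches all objects); skip management events.
cloudtrail.put_event_selectors(
    TrailName="transactions-write-trail",
    EventSelectors=[
        {
            "ReadWriteType": "WriteOnly",
            "IncludeManagementEvents": False,
            "DataResources": [
                {
                    "Type": "AWS::S3::Object",
                    "Values": ["arn:aws:s3:::example-transactions-bucket/"],
                }
            ],
        }
    ],
)

cloudtrail.start_logging(Name="transactions-write-trail")
```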



A data engineer needs to maintain a central metadata repository that users access through Amazon EMR and Amazon Athena queries. The repository needs to provide the schema and properties of many tables. Some of the metadata is stored in Apache Hive. The data engineer needs to import the metadata from Hive into the central metadata repository.
Which solution will meet these requirements with the LEAST development effort?

  A. Use Amazon EMR and Apache Ranger.
  B. Use a Hive metastore on an EMR cluster.
  C. Use the AWS Glue Data Catalog.
  D. Use a metastore on an Amazon RDS for MySQL DB instance.

Answer(s): C
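
Explanation: The AWS Glue Data Catalog is a managed, Hive-metastore-compatible catalog that both Amazon EMR and Athena can use natively, and existing Hive metadata can be imported into it (for example, with the AWS-provided migration scripts or a Glue crawler) with minimal development. Once imported, table schemas and properties can be read programmatically; a minimal sketch with boto3, assuming hypothetical database and table names:

```python
import boto3

glue = boto3.client("glue")

# Look up the schema and properties of a table in the central catalog.
response = glue.get_table(DatabaseName="flights_db", Name="transactions")

table = response["Table"]
print(table["Name"], table.get("Parameters", {}))
for column in table["StorageDescriptor"]["Columns"]:
    print(column["Name"], column["Type"])
```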



A company needs to build a data lake in AWS. The company must provide row-level data access and column-level data access to specific teams. The teams will access the data by using Amazon Athena, Amazon Redshift Spectrum, and Apache Hive from Amazon EMR.
Which solution will meet these requirements with the LEAST operational overhead?

  A. Use Amazon S3 for data lake storage. Use S3 access policies to restrict data access by rows and columns. Provide data access through Amazon S3.
  B. Use Amazon S3 for data lake storage. Use Apache Ranger through Amazon EMR to restrict data access by rows and columns. Provide data access by using Apache Pig.
  C. Use Amazon Redshift for data lake storage. Use Redshift security policies to restrict data access by rows and columns. Provide data access by using Apache Spark and Amazon Athena federated queries.
  D. Use Amazon S3 for data lake storage. Use AWS Lake Formation to restrict data access by rows and columns. Provide data access through AWS Lake Formation.

Answer(s): D
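
Explanation: AWS Lake Formation centralizes row- and column-level permissions and enforces them across Athena, Redshift Spectrum, and Apache Hive on EMR, so no per-engine policy work is needed (S3 access policies cannot filter by rows or columns). A hedged sketch of granting filtered access with boto3; all names, the filter expression, and the account ID are assumptions:

```python
import boto3

lakeformation = boto3.client("lakeformation")

# Define a data cells filter: only rows where team = 'ops',
# restricted to two columns (all values are hypothetical).
lakeformation.create_data_cells_filter(
    TableData={
        "TableCatalogId": "111122223333",   # placeholder account ID
        "DatabaseName": "flights_db",
        "TableName": "transactions",
        "Name": "ops_team_filter",
        "RowFilter": {"FilterExpression": "team = 'ops'"},
        "ColumnNames": ["flight_id", "departure_time"],
    }
)

# Grant the team's IAM role SELECT access through the filter.
lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/ops-team"
    },
    Resource={
        "DataCellsFilter": {
            "TableCatalogId": "111122223333",
            "DatabaseName": "flights_db",
            "TableName": "transactions",
            "Name": "ops_team_filter",
        }
    },
    Permissions=["SELECT"],
)
```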



An airline company is collecting metrics about flight activities for analytics. The company is conducting a proof of concept (POC) test to show how analytics can provide insights that the company can use to increase on-time departures.
The POC test uses objects in Amazon S3 that contain the metrics in .csv format. The POC test uses Amazon Athena to query the data. The data is partitioned in the S3 bucket by date.
As the amount of data increases, the company wants to optimize the storage solution to improve query performance.
Which combination of solutions will meet these requirements? (Choose two.)

  A. Add a randomized string to the beginning of the keys in Amazon S3 to get more throughput across partitions.
  B. Use an S3 bucket that is in the same account that uses Athena to query the data.
  C. Use an S3 bucket that is in the same AWS Region where the company runs Athena queries.
  D. Preprocess the .csv data to JSON format by fetching only the document keys that the query requires.
  E. Preprocess the .csv data to Apache Parquet format by fetching only the data blocks that are needed for predicates.

Answer(s): C,E
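
Explanation: Keeping the bucket in the same Region as the Athena queries avoids cross-Region latency and data transfer costs, and columnar Parquet lets Athena read only the column chunks and row groups a predicate needs, unlike row-oriented CSV or JSON. One low-effort conversion is an Athena CTAS statement; a sketch submitted through boto3, where the table names, S3 locations, and partition column are assumptions:

```python
import boto3

athena = boto3.client("athena")

# CTAS: rewrite the CSV table as partitioned, Snappy-compressed Parquet.
# Assumes flight_date is the table's last column, since Athena requires
# partition columns to come last in the SELECT list.
ctas = """
CREATE TABLE flights_db.metrics_parquet
WITH (
    format = 'PARQUET',
    parquet_compression = 'SNAPPY',
    external_location = 's3://example-bucket/metrics-parquet/',
    partitioned_by = ARRAY['flight_date']
) AS
SELECT * FROM flights_db.metrics_csv
"""

athena.start_query_execution(
    QueryString=ctas,
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)
```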






Post your comments and discuss the Amazon AWS Certified Data Engineer - Associate DEA-C01 exam with other community members:

Abhishek commented on December 21, 2024
It was nice.
Anonymous

saif Ali commented on October 24, 2024
For Question no. 50, the answer would be to use a Lambda UDF, as this provides automation.
INDIA

Josh commented on October 09, 2024
Team, thanks for the wonderful support. This guide helped me a lot.
UNITED STATES

Ming commented on September 19, 2024
Very cool, very precise. I highly recommend this study package.
UNITED STATES

Geovani commented on September 18, 2024
Very useful content with point-by-point explanations. The payment and download process was also straightforward. Good job, guys.
Italy