Free AWS Certified Data Engineer - Associate DEA-C01 Exam Braindumps (page: 12)

Page 12 of 39

A company uses Amazon Athena to run SQL queries for extract, transform, and load (ETL) tasks by using Create Table As Select (CTAS). The company must use Apache Spark instead of SQL to generate analytics.
Which solution will give the company the ability to use Spark to access Athena?

  A. Athena query settings
  B. Athena workgroup
  C. Athena data source
  D. Athena query editor

Answer(s): B
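
Spark in Athena runs as sessions inside a workgroup whose engine is Apache Spark, which is why the workgroup is the relevant setting here. The following is a minimal sketch rather than a definitive implementation. It assumes a Spark-enabled workgroup already exists; the helper name `run_spark_code`, the workgroup name `spark-etl`, and the DPU and polling values are hypothetical. The client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
import time


def run_spark_code(athena_client, workgroup, code_block,
                   max_dpus=4, poll_seconds=2, max_polls=60):
    """Start a Spark session in an Athena workgroup and submit one code block.

    `athena_client` is expected to behave like boto3.client("athena").
    """
    # A session is always created inside a workgroup.
    session_id = athena_client.start_session(
        WorkGroup=workgroup,
        EngineConfiguration={"MaxConcurrentDpus": max_dpus},
    )["SessionId"]

    # Wait until the session reports IDLE before submitting work.
    for _ in range(max_polls):
        status = athena_client.get_session_status(SessionId=session_id)
        state = status["Status"]["State"]
        if state == "IDLE":
            break
        if state in ("FAILED", "TERMINATED", "DEGRADED"):
            raise RuntimeError(f"Session {session_id} ended in state {state}")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Session {session_id} did not become IDLE")

    # Submit PySpark code as a calculation in the session.
    calc = athena_client.start_calculation_execution(
        SessionId=session_id,
        CodeBlock=code_block,
    )
    return session_id, calc["CalculationExecutionId"]
```

With real credentials, a call might look like `run_spark_code(boto3.client("athena"), "spark-etl", "spark.sql('SHOW DATABASES').show()")`.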



A company needs to partition the Amazon S3 storage that the company uses for a data lake. The partitioning will use a path of the S3 object keys in the following format: s3://bucket/prefix/year=2023/month=01/day=01.
A data engineer must ensure that the AWS Glue Data Catalog synchronizes with the S3 storage when the company adds new partitions to the bucket.
Which solution will meet these requirements with the LEAST latency?

  A. Schedule an AWS Glue crawler to run every morning.
  B. Manually run the AWS Glue CreatePartition API twice each day.
  C. Use code that writes data to Amazon S3 to invoke the Boto3 AWS Glue create_partition API call.
  D. Run the MSCK REPAIR TABLE command from the AWS Glue console.

Answer(s): C
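
Registering the partition from the same code that writes the data could look roughly like the sketch below. It is an illustration under stated assumptions, not the only way to do it: the helper names `partition_values` and `register_partition` are hypothetical, the table is assumed to be partitioned by `year`, `month`, and `day` in that order, and the table's own storage descriptor is copied so that the partition inherits its format settings. The Glue client is passed in so the logic can be tested without AWS access.

```python
from datetime import date


def partition_values(day):
    """Return zero-padded partition values in year, month, day order."""
    return [f"{day.year:04d}", f"{day.month:02d}", f"{day.day:02d}"]


def register_partition(glue_client, database, table, day):
    """Add one daily partition to the Data Catalog right after a write.

    `glue_client` is expected to behave like boto3.client("glue").
    """
    year, month, dom = partition_values(day)

    # Copy the table's storage descriptor so formats and SerDe settings match.
    table_sd = glue_client.get_table(
        DatabaseName=database, Name=table
    )["Table"]["StorageDescriptor"]
    partition_sd = dict(table_sd)
    base = table_sd["Location"].rstrip("/")
    partition_sd["Location"] = f"{base}/year={year}/month={month}/day={dom}/"

    glue_client.create_partition(
        DatabaseName=database,
        TableName=table,
        PartitionInput={
            "Values": [year, month, dom],
            "StorageDescriptor": partition_sd,
        },
    )
    return partition_sd["Location"]
```

Calling this immediately after the S3 write makes the partition queryable without waiting for a crawler schedule, which is the latency argument behind the answer.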



A media company uses software as a service (SaaS) applications to gather data by using third-party tools. The company needs to store the data in an Amazon S3 bucket. The company will use Amazon Redshift to perform analytics based on the data.
Which AWS service or feature will meet these requirements with the LEAST operational overhead?

  A. Amazon Managed Streaming for Apache Kafka (Amazon MSK)
  B. Amazon AppFlow
  C. AWS Glue Data Catalog
  D. Amazon Kinesis

Answer(s): B



A data engineer is using Amazon Athena to analyze sales data that is in Amazon S3. The data engineer writes a query to retrieve sales amounts for 2023 for several products from a table named sales_data. However, the query does not return results for all of the products that are in the sales_data table. The data engineer needs to troubleshoot the query to resolve the issue.
The data engineer's original query is as follows:

SELECT product_name, sum(sales_amount)
FROM sales_data
WHERE year = 2023
GROUP BY product_name

How should the data engineer modify the Athena query to meet these requirements?

  A. Replace sum(sales_amount) with count(*) for the aggregation.
  B. Change WHERE year = 2023 to WHERE extract(year FROM sales_data) = 2023.
  C. Add HAVING sum(sales_amount) > 0 after the GROUP BY clause.
  D. Remove the GROUP BY clause.

Answer(s): B
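
The difference between filtering on a stored `year` column and deriving the year from a date can be shown with a small local experiment. This is only a sketch with made-up rows: it uses Python's built-in `sqlite3` module as a stand-in for Athena, SQLite's `strftime` in place of Athena's `extract`, and a hypothetical date column named `sale_date` (the option as printed applies `extract` to `sales_data`, so the column name is an assumption). One row deliberately has no value in `year`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data ("
    "product_name TEXT, sales_amount REAL, sale_date TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?, ?)",
    [
        ("widget", 10.0, "2023-01-05", 2023),
        ("gadget", 5.0, "2023-03-01", None),  # year column was never filled in
        ("widget", 7.0, "2023-06-01", 2023),
        ("gizmo", 3.0, "2022-12-31", 2022),
    ],
)

# Original filter: relies on the stored year column.
by_column = conn.execute(
    "SELECT product_name, sum(sales_amount) FROM sales_data "
    "WHERE year = 2023 GROUP BY product_name ORDER BY product_name"
).fetchall()

# Modified filter: derives the year from the date itself.
by_date = conn.execute(
    "SELECT product_name, sum(sales_amount) FROM sales_data "
    "WHERE CAST(strftime('%Y', sale_date) AS INTEGER) = 2023 "
    "GROUP BY product_name ORDER BY product_name"
).fetchall()

print(by_column)  # only products whose year column equals 2023
print(by_date)    # every product with a 2023 sale date
```

In this toy data the first query omits `gadget` because its `year` value is missing, while the second query includes it.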




