Free DEA-C01 Exam Braindumps (page: 13)


A company uses Amazon Athena to run SQL queries for extract, transform, and load (ETL) tasks by using Create Table As Select (CTAS). The company must use Apache Spark instead of SQL to generate analytics.
Which solution will give the company the ability to use Spark to access Athena?

  A. Athena query settings
  B. Athena workgroup
  C. Athena data source
  D. Athena query editor

Answer(s): B



A company needs to partition the Amazon S3 storage that the company uses for a data lake. The partitioning will use a path of the S3 object keys in the following format: s3://bucket/prefix/year=2023/month=01/day=01.
A data engineer must ensure that the AWS Glue Data Catalog synchronizes with the S3 storage when the company adds new partitions to the bucket.
Which solution will meet these requirements with the LEAST latency?

  A. Schedule an AWS Glue crawler to run every morning.
  B. Manually run the AWS Glue CreatePartition API twice each day.
  C. Use code that writes data to Amazon S3 to invoke the Boto3 AWS Glue create_partition API call.
  D. Run the MSCK REPAIR TABLE command from the AWS Glue console.

Answer(s): C
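
The approach in the create_partition option can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names (partition_values_from_key, build_partition_input, register_partition) and the bucket, database, and table arguments are hypothetical. It assumes the writer passes in a Boto3 Glue client and that the table's partition keys are year, month, and day, in that order. The only AWS calls used are the Glue client's get_table and create_partition.

```python
import re

# Matches Hive-style partition segments such as "year=2023" inside an S3 key.
_PARTITION_SEGMENT = re.compile(r"([^/=]+)=([^/]+)")


def partition_values_from_key(key, partition_keys=("year", "month", "day")):
    """Return the partition values found in an S3 object key, in table order."""
    found = dict(_PARTITION_SEGMENT.findall(key))
    missing = [name for name in partition_keys if name not in found]
    if missing:
        raise ValueError(f"key {key!r} is missing partition segments: {missing}")
    return [found[name] for name in partition_keys]


def build_partition_input(values, location, table_storage_descriptor):
    """Build the PartitionInput structure that Glue create_partition expects."""
    # Reuse the table's format settings, but point Location at the partition.
    descriptor = dict(table_storage_descriptor)
    descriptor["Location"] = location
    return {"Values": list(values), "StorageDescriptor": descriptor}


def register_partition(glue_client, database, table, bucket, key):
    """Register the partition that contains `key`, right after the object is written.

    `glue_client` is assumed to be a Boto3 Glue client, e.g. boto3.client("glue").
    """
    values = partition_values_from_key(key)
    # The partition directory is everything in the key before the file name.
    location = f"s3://{bucket}/{key.rsplit('/', 1)[0]}/"
    table_info = glue_client.get_table(DatabaseName=database, Name=table)
    partition_input = build_partition_input(
        values, location, table_info["Table"]["StorageDescriptor"]
    )
    try:
        glue_client.create_partition(
            DatabaseName=database, TableName=table, PartitionInput=partition_input
        )
    except glue_client.exceptions.AlreadyExistsException:
        # An earlier write to the same partition already registered it.
        pass
    return partition_input
```

Calling register_partition from the same code path that writes the object means the Data Catalog learns about a new partition as soon as the data lands, rather than waiting for a scheduled crawler or a manual run.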



A media company uses software as a service (SaaS) applications to gather data by using third-party tools. The company needs to store the data in an Amazon S3 bucket. The company will use Amazon Redshift to perform analytics based on the data.
Which AWS service or feature will meet these requirements with the LEAST operational overhead?

  A. Amazon Managed Streaming for Apache Kafka (Amazon MSK)
  B. Amazon AppFlow
  C. AWS Glue Data Catalog
  D. Amazon Kinesis

Answer(s): B



A data engineer is using Amazon Athena to analyze sales data that is in Amazon S3. The data engineer writes a query to retrieve sales amounts for 2023 for several products from a table named sales_data. However, the query does not return results for all of the products that are in the sales_data table. The data engineer needs to troubleshoot the query to resolve the issue.
The data engineer's original query is as follows:
SELECT product_name, sum(sales_amount)
FROM sales_data
WHERE year = 2023
GROUP BY product_name
How should the data engineer modify the Athena query to meet these requirements?

  A. Replace sum(sales_amount) with count(*) for the aggregation.
  B. Change WHERE year = 2023 to WHERE extract(year FROM sales_data) = 2023.
  C. Add HAVING sum(sales_amount) > 0 after the GROUP BY clause.
  D. Remove the GROUP BY clause.

Answer(s): B





