Free AWS Certified Data Engineer - Associate DEA-C01 Exam Braindumps (page: 19)


A company receives a daily file that contains customer data in .xls format. The company stores the file in Amazon S3. The daily file is approximately 2 GB in size.
A data engineer concatenates the column in the file that contains customer first names and the column that contains customer last names. The data engineer needs to determine the number of distinct customers in the file.
Which solution will meet this requirement with the LEAST operational effort?

  1. Create and run an Apache Spark job in an AWS Glue notebook. Configure the job to read the S3 file and calculate the number of distinct customers.
  2. Create an AWS Glue crawler to create an AWS Glue Data Catalog of the S3 file. Run SQL queries from Amazon Athena to calculate the number of distinct customers.
  3. Create and run an Apache Spark job in Amazon EMR Serverless to calculate the number of distinct customers.
  4. Use AWS Glue DataBrew to create a recipe that uses the COUNT_DISTINCT aggregate function to calculate the number of distinct customers.

Answer(s): D
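The DataBrew recipe (answer D) requires no code, which is why it carries the least operational effort. For comparison, a rough sketch of the same distinct-count logic in code is below; the bucket, file name, and column names are hypothetical, and reading an .xls file from S3 with pandas assumes the s3fs and xlrd packages are available.

```python
# Hypothetical illustration of the calculation that the DataBrew
# COUNT_DISTINCT recipe step performs. Bucket, key, and column names are
# placeholders; reading .xls directly from S3 assumes s3fs and xlrd.
import pandas as pd

df = pd.read_excel("s3://example-bucket/daily/customers.xls")

# Concatenate first and last names, then count the distinct values.
full_name = df["first_name"].str.strip() + " " + df["last_name"].str.strip()
distinct_customers = full_name.nunique()

print(f"Distinct customers: {distinct_customers}")
```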



A healthcare company uses Amazon Kinesis Data Streams to stream real-time health data from wearable devices, hospital equipment, and patient records.
A data engineer needs to find a solution to process the streaming data. The data engineer needs to store the data in an Amazon Redshift Serverless warehouse. The solution must support near real-time analytics of the streaming data and the previous day's data.
Which solution will meet these requirements with the LEAST operational overhead?

  1. Load the data into Amazon Kinesis Data Firehose, and then load the data from Kinesis Data Firehose into Amazon Redshift.
  2. Use the streaming ingestion feature of Amazon Redshift.
  3. Load the data into Amazon S3. Use the COPY command to load the data into Amazon Redshift.
  4. Use the Amazon Aurora zero-ETL integration with Amazon Redshift.

Answer(s): B
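Redshift streaming ingestion (answer B) reads directly from the Kinesis data stream into a materialized view, with no Firehose or S3 staging layer to manage. A minimal sketch of the setup, issued here through the Redshift Data API, follows; the workgroup, database, IAM role ARN, and stream name are placeholders.

```python
# Hedged sketch: set up Amazon Redshift streaming ingestion from a Kinesis
# data stream by creating an external schema and a materialized view.
# Workgroup, database, role ARN, and stream name are placeholders.
import boto3

client = boto3.client("redshift-data")

statements = [
    # External schema that maps to the Kinesis data stream source.
    """
    CREATE EXTERNAL SCHEMA kinesis_schema
    FROM KINESIS
    IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftStreamingRole';
    """,
    # Materialized view over the stream; refreshing it makes new records
    # available for near real-time queries alongside the previous day's data.
    """
    CREATE MATERIALIZED VIEW health_events AUTO REFRESH YES AS
    SELECT approximate_arrival_timestamp,
           JSON_PARSE(kinesis_data) AS payload
    FROM kinesis_schema."wearable-health-stream";
    """,
]

for sql in statements:
    client.execute_statement(
        WorkgroupName="health-serverless-wg",  # Redshift Serverless workgroup
        Database="dev",
        Sql=sql,
    )
```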



A data engineer needs to use an Amazon QuickSight dashboard that is based on Amazon Athena queries on data that is stored in an Amazon S3 bucket. When the data engineer connects to the QuickSight dashboard, the data engineer receives an error message that indicates insufficient permissions.
Which factors could cause the permissions-related errors? (Choose two.)

  1. There is no connection between QuickSight and Athena.
  2. The Athena tables are not cataloged.
  3. QuickSight does not have access to the S3 bucket.
  4. QuickSight does not have access to decrypt S3 data.
  5. There is no IAM role assigned to QuickSight.

Answer(s): C,D
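Both correct factors concern the permissions of the QuickSight service role: it must be able to read the objects in the S3 bucket (C) and, if the objects are encrypted with a customer managed AWS KMS key, to decrypt them (D). A hedged sketch of policy statements that would address both follows; every ARN and name is a placeholder.

```python
# Hypothetical IAM policy that would resolve both factors: read access to
# the S3 bucket (C) and permission to decrypt the customer managed KMS key
# that encrypts the data (D). All ARNs and names are placeholders.
import json

import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "QuickSightS3Read",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-data-bucket",
                "arn:aws:s3:::example-data-bucket/*",
            ],
        },
        {
            "Sid": "QuickSightKmsDecrypt",
            "Effect": "Allow",
            "Action": ["kms:Decrypt"],
            "Resource": "arn:aws:kms:us-east-1:111122223333:key/example-key-id",
        },
    ],
}

# Attach the statements as an inline policy on the QuickSight service role.
iam = boto3.client("iam")
iam.put_role_policy(
    RoleName="aws-quicksight-service-role-v0",  # default role name; may differ
    PolicyName="quicksight-athena-s3-kms-access",
    PolicyDocument=json.dumps(policy),
)
```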



A company stores datasets in JSON format and .csv format in an Amazon S3 bucket. The company has Amazon RDS for Microsoft SQL Server databases, Amazon DynamoDB tables that are in provisioned capacity mode, and an Amazon Redshift cluster. A data engineering team must develop a solution that will give data scientists the ability to query all data sources by using syntax similar to SQL.
Which solution will meet these requirements with the LEAST operational overhead?

  1. Use AWS Glue to crawl the data sources. Store metadata in the AWS Glue Data Catalog. Use Amazon Athena to query the data. Use SQL for structured data sources. Use PartiQL for data that is stored in JSON format.
  2. Use AWS Glue to crawl the data sources. Store metadata in the AWS Glue Data Catalog. Use Redshift Spectrum to query the data. Use SQL for structured data sources. Use PartiQL for data that is stored in JSON format.
  3. Use AWS Glue to crawl the data sources. Store metadata in the AWS Glue Data Catalog. Use AWS Glue jobs to transform data that is in JSON format to Apache Parquet or .csv format. Store the transformed data in an S3 bucket. Use Amazon Athena to query the original and transformed data from the S3 bucket.
  4. Use AWS Lake Formation to create a data lake. Use Lake Formation jobs to transform the data from all data sources to Apache Parquet format. Store the transformed data in an S3 bucket. Use Amazon Athena or Redshift Spectrum to query the data.

Answer(s): A
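With answer A, Glue crawlers populate the Data Catalog and data scientists query everything through Athena, using standard SQL for structured sources and PartiQL-style dot notation for nested JSON. A minimal sketch of issuing such a query with boto3 follows; the database, table, columns, workgroup, and output location are hypothetical.

```python
# Hedged sketch: run an Athena query against a Glue Data Catalog table
# built from the JSON datasets in S3. The database, table, column names,
# workgroup, and result location are placeholders.
import time

import boto3

athena = boto3.client("athena")

# Dot notation lets SQL reach into nested JSON structures that the crawler
# cataloged as struct columns.
query = """
SELECT customer.id, customer.address.country, order_total
FROM sales_db.orders_json
WHERE order_total > 100
"""

execution = athena.start_query_execution(
    QueryString=query,
    WorkGroup="primary",
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes, then fetch the first page of results.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    print(f"Returned {len(rows) - 1} rows")  # first row is the header
```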






Post your Comments and Discuss Amazon AWS Certified Data Engineer - Associate DEA-C01 exam with other Community members:

Abhishek commented on December 21, 2024
It was Nice
Anonymous

saif Ali commented on October 24, 2024
For Question No. 50, the answer would be to use a Lambda UDF, as this provides automation.
INDIA

Josh commented on October 09, 2024
Team, thanks for the wonderful support. This guide helped me a lot.
UNITED STATES

Ming commented on September 19, 2024
Very cool very precise. I highly recommend this study package.
UNITED STATES

Geovani commented on September 18, 2024
Very useful content and point by point explanation. And also the payment and download process was straight forward. Good job guys.
Italy