Free DAS-C01 Exam Braindumps


A data analyst is using AWS Glue to organize, cleanse, validate, and format a 200 GB dataset. The data analyst triggered the job to run with the Standard worker type. After 3 hours, the AWS Glue job status is still RUNNING. Logs from the job run show no error codes. The data analyst wants to improve the job execution time without overprovisioning.
Which actions should the data analyst take?

  1. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the executor-cores job parameter.
  2. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter.
  3. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the spark.yarn.executor.memoryOverhead job parameter.
  4. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the num-executors job parameter.

Answer(s): B


Reference:

https://docs.aws.amazon.com/glue/latest/dg/monitor-debug-capacity.html
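
For context, answer B is the metrics-then-scale approach: job bookmarks track already-processed data between runs and do not estimate DPUs, while job metrics expose the executor demand that justifies raising capacity. A minimal sketch with boto3 follows; the job name, role ARN, script location, and the value 20.0 are hypothetical, and UpdateJob replaces the whole job definition, so the required fields are restated.

```python
import boto3

glue = boto3.client("glue")

# Enable job metrics and raise capacity for a Standard-worker (Glue 1.0) job.
# UpdateJob replaces the job definition, so Role and Command must be restated.
glue.update_job(
    JobName="my-etl-job",  # hypothetical job name
    JobUpdate={
        "Role": "arn:aws:iam::123456789012:role/GlueJobRole",        # hypothetical
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": "s3://example-bucket/scripts/job.py",  # hypothetical
        },
        "DefaultArguments": {"--enable-metrics": ""},  # publish job metrics to CloudWatch
        "MaxCapacity": 20.0,  # DPUs; size this from the profiled metrics, not a guess
    },
)
```

With metrics enabled, the CloudWatch comparison of active executors against maximum needed executors, described in the linked guide, shows how far the maximum capacity should be raised.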



A company has a business unit uploading .csv files to an Amazon S3 bucket. The company's data platform team has set up an AWS Glue crawler to perform discovery and to create tables and schemas. An AWS Glue job writes processed data from the created tables to an Amazon Redshift database. The AWS Glue job handles column mapping and creates the Amazon Redshift table appropriately. When the AWS Glue job is rerun for any reason during the day, duplicate records are introduced into the Amazon Redshift table.
Which solution will update the Redshift table without duplicates when jobs are rerun?

  1. Modify the AWS Glue job to copy the rows into a staging table. Add SQL commands to replace the existing rows in the main table as postactions in the DynamicFrameWriter class.
  2. Load the previously inserted data into a MySQL database in the AWS Glue job. Perform an upsert operation in MySQL, and copy the results to the Amazon Redshift table.
  3. Use Apache Spark's DataFrame dropDuplicates() API to eliminate duplicates and then write the data to Amazon Redshift.
  4. Use the AWS Glue ResolveChoice built-in transform to select the most recent value of the column.

Answer(s): A
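
As a sketch of answer A, the job loads each batch into a staging table and lets postactions run the merge inside Amazon Redshift in a single transaction. The table names, key column, and connection name below are hypothetical; the script assumes the usual Glue boilerplate and a processed DynamicFrame named processed_dyf.

```python
import sys

from awsglue.context import GlueContext
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME", "TempDir"])
glueContext = GlueContext(SparkContext.getOrCreate())

# ... build processed_dyf, the DynamicFrame with the mapped columns ...

# Recreate the staging table before the load, then merge after it.
pre_actions = (
    "DROP TABLE IF EXISTS public.stage;"
    "CREATE TABLE public.stage (LIKE public.target);"
)
post_actions = (
    "BEGIN;"
    "DELETE FROM public.target USING public.stage "
    "WHERE public.target.id = public.stage.id;"  # hypothetical key column
    "INSERT INTO public.target SELECT * FROM public.stage;"
    "DROP TABLE public.stage;"
    "END;"
)

glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=processed_dyf,
    catalog_connection="redshift-connection",  # hypothetical Glue connection
    connection_options={
        "database": "dev",
        "dbtable": "public.stage",             # staging table, not the main table
        "preactions": pre_actions,
        "postactions": post_actions,
    },
    redshift_tmp_dir=args["TempDir"],
)
```

Because the delete-then-insert runs in one transaction, a rerun replaces the day's rows instead of appending duplicates.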



A streaming application is reading data from Amazon Kinesis Data Streams and immediately writing the data to an Amazon S3 bucket every 10 seconds. The application is reading data from hundreds of shards. The batch interval cannot be changed due to a separate requirement. The data is being accessed by Amazon Athena. Users are seeing degradation in query performance as time progresses.
Which action can help improve query performance?

  1. Merge the files in Amazon S3 to form larger files.
  2. Increase the number of shards in Kinesis Data Streams.
  3. Add more memory and CPU capacity to the streaming application.
  4. Write the files to multiple S3 buckets.

Answer(s): A
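
Answer A works around the small-files problem without touching the 10-second batch interval: a periodic compaction job rewrites the many small objects as a few large, columnar files that Athena can scan efficiently. The sketch below assumes PySpark, JSON input, and hypothetical paths and file counts; adjust the reader to the actual record format.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-compaction").getOrCreate()

# Read the day's small objects written by the streaming application.
df = spark.read.json("s3://example-bucket/streaming/dt=2023-01-01/")  # hypothetical

# Rewrite as a handful of large Parquet files; targeting roughly 128 MB-1 GB
# per file avoids Athena's per-object listing and open overhead.
(df.coalesce(16)  # hypothetical file count; size it to the daily data volume
   .write.mode("overwrite")
   .parquet("s3://example-bucket/compacted/dt=2023-01-01/"))

spark.stop()
```

Pointing the Athena table (or the day's partition) at the compacted prefix keeps query performance from degrading as the object count grows.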



A company uses Amazon OpenSearch Service (Amazon Elasticsearch Service) to store and analyze its website clickstream data. The company ingests 1 TB of data daily using Amazon Kinesis Data Firehose and stores one day's worth of data in an Amazon ES cluster. The company sees very slow query performance on the Amazon ES index and occasionally sees errors from Kinesis Data Firehose when attempting to write to the index. The Amazon ES cluster has 10 data nodes running a single index and 3 dedicated master nodes. Each data node has 1.5 TB of Amazon EBS storage attached, and the cluster is configured with 1,000 shards. Occasionally, JVMMemoryPressure errors are found in the cluster logs.
Which solution will improve the performance of Amazon ES?

  1. Increase the memory of the Amazon ES master nodes.
  2. Decrease the number of Amazon ES data nodes.
  3. Decrease the number of Amazon ES shards for the index.
  4. Increase the number of Amazon ES shards for the index.

Answer(s): C
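
For scale: 1,000 shards over roughly 1 TB of data is about 1 GB per shard, far below the commonly recommended 10-50 GB per shard, and every shard consumes JVM heap on the data nodes, which matches the JVMMemoryPressure errors. One hedged way to apply answer C is to reindex into an index with fewer primaries; the endpoint, index names, and shard count below are hypothetical, and real requests to an Amazon ES domain typically need SigV4 signing or another authorization mechanism.

```python
import requests

ES = "https://search-example-domain.us-east-1.es.amazonaws.com"  # hypothetical

# Create a replacement index sized for ~1 TB/day: 20 primaries at ~50 GB each.
requests.put(
    f"{ES}/clickstream-v2",
    json={"settings": {"index.number_of_shards": 20,
                       "index.number_of_replicas": 1}},
)

# Copy the documents across, then repoint Kinesis Data Firehose at the new index.
requests.post(f"{ES}/_reindex", json={
    "source": {"index": "clickstream"},
    "dest": {"index": "clickstream-v2"},
})
```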





