Free DAS-C01 Exam Braindumps (page: 16)


A company is migrating its existing on-premises ETL jobs to Amazon EMR. The code consists of a series of jobs written in Java. The company needs to reduce overhead for the system administrators without changing the underlying code. Due to the sensitivity of the data, compliance requires that the company use root device volume encryption on all nodes in the cluster. Corporate standards require that environments be provisioned through AWS CloudFormation when possible.
Which solution satisfies these requirements?

  1. Install open-source Hadoop on Amazon EC2 instances with encrypted root device volumes. Configure the cluster in the CloudFormation template.
  2. Use a CloudFormation template to launch an EMR cluster. In the configuration section of the cluster, define a bootstrap action to enable TLS.
  3. Create a custom AMI with encrypted root device volumes. Configure Amazon EMR to use the custom AMI using the CustomAmiId property in the CloudFormation template.
  4. Use a CloudFormation template to launch an EMR cluster. In the configuration section of the cluster, define a bootstrap action to encrypt the root device volume of every node.

Answer(s): C


Reference:

https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-custom-ami.html
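
The property name in option 3 is the important detail: an EMR cluster can be launched from a custom, encrypted AMI by setting CustomAmiId, which the AWS::EMR::Cluster CloudFormation resource exposes directly. As a minimal illustration of the same property outside a template, the boto3 sketch below launches a cluster from a custom AMI; the AMI ID, subnet, key pair, and role names are placeholders, not values from the question.

```python
import boto3

emr = boto3.client("emr")

# Launch an EMR cluster from a custom AMI whose root device volume is encrypted.
# All identifiers below (AMI, subnet, key pair, roles) are illustrative placeholders.
response = emr.run_job_flow(
    Name="java-etl-cluster",
    ReleaseLabel="emr-6.10.0",
    CustomAmiId="ami-0123456789abcdef0",  # custom AMI with an encrypted root volume
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "Ec2SubnetId": "subnet-0123456789abcdef0",
        "Ec2KeyName": "example-keypair",
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])
```

In the CloudFormation template itself, the equivalent setting is the CustomAmiId property on the AWS::EMR::Cluster resource, which is what the linked reference documents.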



A transportation company uses IoT sensors attached to trucks to collect vehicle data for its global delivery fleet. The company currently sends the sensor data in small .csv files to Amazon S3. The files are then loaded into a 10-node Amazon Redshift cluster with two slices per node and queried using both Amazon Athena and Amazon Redshift. The company wants to optimize the files to reduce the cost of querying and also improve the speed of data loading into the Amazon Redshift cluster.
Which solution meets these requirements?

  1. Use AWS Glue to convert all the files from .csv to a single large Apache Parquet file. COPY the file into Amazon Redshift and query the file with Athena from Amazon S3.
  2. Use Amazon EMR to convert each .csv file to Apache Avro. COPY the files into Amazon Redshift and query the file with Athena from Amazon S3.
  3. Use AWS Glue to convert the files from .csv to a single large Apache ORC file. COPY the file into Amazon Redshift and query the file with Athena from Amazon S3.
  4. Use AWS Glue to convert the files from .csv to Apache Parquet to create 20 Parquet files. COPY the files into Amazon Redshift and query the files with Athena from Amazon S3.

Answer(s): D
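
The 10-node, two-slices-per-node cluster has 20 slices, so splitting the data into 20 similarly sized Parquet files lets every slice load a file in parallel during the COPY, while the columnar format cuts Athena scan costs. A minimal PySpark sketch of the conversion follows (Glue ETL jobs are typically authored in PySpark); the S3 paths and the header option are assumptions, not details from the question.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()

# Read the small .csv sensor files from S3 (bucket and prefix are placeholders).
df = spark.read.option("header", "true").csv("s3://example-bucket/raw-sensor-data/")

# Write exactly 20 Parquet files so each of the cluster's 20 slices
# (10 nodes x 2 slices) can load one file in parallel during the Redshift COPY.
(
    df.repartition(20)
      .write.mode("overwrite")
      .parquet("s3://example-bucket/sensor-data-parquet/")
)
```

The subsequent COPY can then point at the Parquet prefix with FORMAT AS PARQUET, and Athena can query the same files in place.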



An online retail company with millions of users around the globe wants to improve its ecommerce analytics capabilities. Currently, clickstream data is uploaded directly to Amazon S3 as compressed files. Several times each day, an application running on Amazon EC2 processes the data and makes search options and reports available for visualization by editors and marketers. The company wants to make website clicks and aggregated data available to editors and marketers in minutes to enable them to connect with users more effectively.
Which options will help meet these requirements in the MOST efficient way? (Choose two.)

  1. Use Amazon Kinesis Data Firehose to upload compressed and batched clickstream records to Amazon OpenSearch Service (Amazon Elasticsearch Service).
  2. Upload clickstream records to Amazon S3 as compressed files. Then use AWS Lambda to send data to Amazon OpenSearch Service (Amazon Elasticsearch Service) from Amazon S3.
  3. Use Amazon OpenSearch Service (Amazon Elasticsearch Service) deployed on Amazon EC2 to aggregate, filter, and process the data. Refresh content performance dashboards in near-real time.
  4. Use OpenSearch Dashboards (Kibana) to aggregate, filter, and visualize the data stored in Amazon OpenSearch Service (Amazon Elasticsearch Service). Refresh content performance dashboards in near-real time.
  5. Upload clickstream records from Amazon S3 to Amazon Kinesis Data Streams and use a Kinesis Data Streams consumer to send records to Amazon OpenSearch Service (Amazon Elasticsearch Service).

Answer(s): A,D
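
Firehose removes the batching and delivery work from the pipeline: it buffers, optionally compresses, and writes the records straight into the Amazon OpenSearch Service domain, where OpenSearch Dashboards (Kibana) serves the near-real-time views. As a minimal sketch of the producer side, the snippet below pushes a clickstream event into an existing delivery stream; the stream name and event fields are placeholders, and the OpenSearch destination is assumed to be configured on the delivery stream separately.

```python
import json
import boto3

firehose = boto3.client("firehose")

def send_click(event: dict) -> None:
    # Deliver one clickstream event to a Firehose delivery stream whose
    # destination is an Amazon OpenSearch Service domain (configured elsewhere).
    # The delivery stream name is an illustrative placeholder.
    firehose.put_record(
        DeliveryStreamName="clickstream-to-opensearch",
        Record={"Data": (json.dumps(event) + "\n").encode("utf-8")},
    )

send_click({"user_id": "u-123", "page": "/product/42", "ts": "2024-01-01T00:00:00Z"})
```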



A company is streaming its high-volume billing data (100 MBps) to Amazon Kinesis Data Streams. A data analyst partitioned the data on account_id to ensure that all records belonging to an account go to the same Kinesis shard and order is maintained. While building a custom consumer using the Kinesis Java SDK, the data analyst notices that, sometimes, the messages arrive out of order for account_id. Upon further investigation, the data analyst discovers the messages that are out of order seem to be arriving from different shards for the same account_id and are seen when a stream resize runs.
What is an explanation for this behavior and what is the solution?

  1. There are multiple shards in a stream and order needs to be maintained in the shard. The data analyst needs to make sure there is only a single shard in the stream and no stream resize runs.
  2. The hash key generation process for the records is not working correctly. The data analyst should generate an explicit hash key on the producer side so the records are directed to the appropriate shard accurately.
  3. The records are not being received by Kinesis Data Streams in order. The producer should use the PutRecords API call instead of the PutRecord API call with the SequenceNumberForOrdering parameter.
  4. The consumer is not processing the parent shard completely before processing the child shards after a stream resize. The data analyst should process the parent shard completely first before processing the child shards.

Answer(s): D
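
When a stream is resized, each parent shard is split into (or merged with) child shards, and Kinesis only preserves per-partition-key ordering if the consumer finishes the closed parent shard before starting its children; the Kinesis Client Library does this automatically, but a hand-rolled consumer has to enforce it. The sketch below illustrates the idea in Python with boto3 rather than the Java SDK named in the question; the stream name, the handler, and the simplified one-level parent/child ordering are all assumptions for illustration.

```python
import boto3

kinesis = boto3.client("kinesis")
STREAM = "billing-stream"  # placeholder stream name

def process(record: dict) -> None:
    # Placeholder handler; real code would apply the billing logic here.
    print(record["PartitionKey"], record["SequenceNumber"])

def drain_shard(shard_id: str) -> None:
    # Read a shard until it is fully consumed. For a closed (parent) shard,
    # GetRecords eventually returns no NextShardIterator, which signals that
    # the parent has been read completely and its children may be started.
    # (For a still-open shard this loop would keep polling indefinitely.)
    iterator = kinesis.get_shard_iterator(
        StreamName=STREAM, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
    )["ShardIterator"]
    while iterator:
        resp = kinesis.get_records(ShardIterator=iterator, Limit=1000)
        for record in resp["Records"]:
            process(record)
        iterator = resp.get("NextShardIterator")

# Drain parent shards to completion before their children so that
# per-account_id ordering survives a stream resize (simplified to one level).
shards = kinesis.list_shards(StreamName=STREAM)["Shards"]
parents = [s for s in shards if "ParentShardId" not in s]
children = [s for s in shards if "ParentShardId" in s]
for shard in parents + children:
    drain_shard(shard["ShardId"])
```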





