Free DAS-C01 Exam Braindumps (page: 18)


A university intends to use Amazon Kinesis Data Firehose to deliver JSON-formatted batches of water quality readings to Amazon S3. The readings come from 50 sensors scattered across a local lake. Students will query the stored data using Amazon Athena to observe changes in a captured metric over time, such as water temperature or acidity. Interest has grown in the study, prompting the university to reconsider how data will be stored.
Which data format and partitioning choices will MOST significantly reduce costs? (Choose two.)

  1. Store the data in Apache Avro format using Snappy compression.
  2. Partition the data by year, month, and day.
  3. Store the data in Apache ORC format using no compression.
  4. Store the data in Apache Parquet format using Snappy compression.
  5. Partition the data by sensor, year, month, and day.

Answer(s): B,D


Reference:

https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html
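
A minimal boto3 sketch of the selected approach, converting incoming JSON to Snappy-compressed Parquet and writing under year/month/day prefixes. The stream, role, bucket, and Glue database/table names are placeholders, and the sketch assumes a Glue table already describes the sensor schema:

import boto3

firehose = boto3.client("firehose")

# All ARNs and names below are illustrative placeholders, not from the question.
firehose.create_delivery_stream(
    DeliveryStreamName="lake-water-quality",
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "BucketARN": "arn:aws:s3:::water-quality-readings",
        # Partition objects by year/month/day so Athena can prune partitions.
        "Prefix": "readings/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/",
        "ErrorOutputPrefix": "errors/!{firehose:error-output-type}/",
        # Convert incoming JSON records to Parquet with Snappy compression.
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
            "OutputFormatConfiguration": {
                "Serializer": {"ParquetSerDe": {"Compression": "SNAPPY"}}
            },
            "SchemaConfiguration": {
                "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
                "DatabaseName": "water_quality_db",
                "TableName": "readings",
                "Region": "us-east-1",
            },
        },
    },
)

Partitioning by date rather than by sensor keeps the partition count manageable, and Parquet's columnar layout lets Athena scan only the metric columns a query actually touches, which is what drives the cost reduction.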



A healthcare company uses AWS data and analytics tools to collect, ingest, and store electronic health record (EHR) data about its patients. The raw EHR data is stored in Amazon S3 in JSON format partitioned by hour, day, and year and is updated every hour. The company wants to maintain the data catalog and metadata in an AWS Glue Data Catalog to be able to access the data using Amazon Athena or Amazon Redshift Spectrum for analytics.
When defining tables in the Data Catalog, the company has the following requirements:
Choose the catalog table name and do not rely on the catalog table naming algorithm.
Keep the table updated with new partitions loaded in the respective S3 bucket prefixes.
Which solution meets these requirements with minimal effort?

  1. Run an AWS Glue crawler that connects to one or more data stores, determines the data structures, and writes tables in the Data Catalog.
  2. Use the AWS Glue console to manually create a table in the Data Catalog and schedule an AWS Lambda function to update the table partitions hourly.
  3. Use the AWS Glue API CreateTable operation to create a table in the Data Catalog. Create an AWS Glue crawler and specify the table as the source.
  4. Create an Apache Hive catalog in Amazon EMR with the table schema definition in Amazon S3, and update the table partition with a scheduled job. Migrate the Hive catalog to the Data Catalog.

Answer(s): C


Reference:

https://docs.aws.amazon.com/glue/latest/dg/tables-described.html
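
A hedged boto3 sketch of option C: create the table explicitly so its name is chosen rather than generated, then point a crawler at that catalog table so it only adds new partitions. The database, table, role, S3 path, and abbreviated schema are assumptions for illustration:

import boto3

glue = boto3.client("glue")

# Create the table explicitly so the catalog table name is ours, not the
# crawler's naming algorithm. Names and the schema below are placeholders.
glue.create_table(
    DatabaseName="ehr_db",
    TableInput={
        "Name": "patient_records",
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "json"},
        "PartitionKeys": [
            {"Name": "year", "Type": "string"},
            {"Name": "month", "Type": "string"},
            {"Name": "day", "Type": "string"},
            {"Name": "hour", "Type": "string"},
        ],
        "StorageDescriptor": {
            "Location": "s3://ehr-raw-data/records/",
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {"SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"},
            "Columns": [{"Name": "patient_id", "Type": "string"}],  # abbreviated
        },
    },
)

# A crawler with a catalog target updates the existing table (new partitions)
# instead of creating or renaming tables; runs hourly to match the data feed.
glue.create_crawler(
    Name="ehr-partition-crawler",
    Role="arn:aws:iam::123456789012:role/glue-crawler-role",
    DatabaseName="ehr_db",
    Targets={"CatalogTargets": [{"DatabaseName": "ehr_db", "Tables": ["patient_records"]}]},
    SchemaChangePolicy={"UpdateBehavior": "UPDATE_IN_DATABASE", "DeleteBehavior": "LOG"},
    Schedule="cron(15 * * * ? *)",
)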



A large university has adopted a strategic goal of increasing diversity among enrolled students. The data analytics team is creating a dashboard with data visualizations to enable stakeholders to view historical trends. All access must be authenticated using Microsoft Active Directory. All data in transit and at rest must be encrypted.
Which solution meets these requirements?

  1. Amazon QuickSight Standard edition configured to perform identity federation using SAML 2.0 and the default encryption settings.
  2. Amazon QuickSight Enterprise edition configured to perform identity federation using SAML 2.0 and the default encryption settings.
  3. Amazon QuickSight Standard edition using AD Connector to authenticate using Active Directory. Configure Amazon QuickSight to use customer-provided keys imported into AWS KMS.
  4. Amazon QuickSight Enterprise edition using AD Connector to authenticate using Active Directory. Configure Amazon QuickSight to use customer-provided keys imported into AWS KMS.

Answer(s): B


Reference:

https://docs.aws.amazon.com/quicksight/latest/user/WhatsNew.html
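
Enterprise edition is required here because it encrypts data at rest in SPICE by default, while Standard edition does not; SAML 2.0 federation handles the Active Directory authentication requirement. A rough boto3 sketch of the IAM side of that federation, assuming the Active Directory identity provider's metadata XML has already been exported (the provider, role, and file names are placeholders):

import json
import boto3

iam = boto3.client("iam")

# Register the corporate IdP (e.g., AD FS) as a SAML provider. The metadata
# file name is a placeholder for the XML exported from the IdP.
with open("adfs-metadata.xml") as f:
    saml_metadata = f.read()

provider = iam.create_saml_provider(
    SAMLMetadataDocument=saml_metadata,
    Name="corporate-adfs",
)

# Federated users assume this role via SAML; QuickSight permissions are then
# attached to the role. The role name is illustrative.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Federated": provider["SAMLProviderArn"]},
            "Action": "sts:AssumeRoleWithSAML",
            "Condition": {
                "StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}
            },
        }
    ],
}

iam.create_role(
    RoleName="quicksight-federated-analyst",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)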



An airline has been collecting metrics on flight activities for analytics. A recently completed proof of concept demonstrated how the company can provide insights to data analysts to improve on-time departures. The proof of concept used objects in Amazon S3, which contained the metrics in .csv format, and used Amazon Athena for querying the data. As the amount of data increases, the data analyst wants to optimize the storage solution to improve query performance.
Which options should the data analyst use to improve performance as the data lake grows? (Choose three.)

  1. Add a randomized string to the beginning of the keys in S3 to get more throughput across partitions.
  2. Use an S3 bucket in the same account as Athena.
  3. Compress the objects to reduce the data transfer I/O.
  4. Use an S3 bucket in the same Region as Athena.
  5. Preprocess the .csv data to JSON to reduce I/O by fetching only the document keys needed by the query.
  6. Preprocess the .csv data to Apache Parquet to reduce I/O by fetching only the data blocks needed for predicates.

Answer(s): C,D,F
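
One way to apply options C, D, and F together is an Athena CTAS statement that rewrites the .csv table as Snappy-compressed Parquet. A hedged boto3 sketch; the database, table, column, and bucket names are assumed for illustration, and the buckets are taken to be in the same Region as Athena:

import boto3

athena = boto3.client("athena")

# CTAS rewrites the existing CSV-backed table as Snappy-compressed Parquet,
# partitioned by flight date. All table and bucket names are placeholders.
ctas = """
CREATE TABLE flights_parquet
WITH (
    format = 'PARQUET',
    parquet_compression = 'SNAPPY',
    external_location = 's3://airline-analytics-curated/flights_parquet/',
    partitioned_by = ARRAY['flight_date']
) AS
SELECT airline, origin, destination, departure_delay_minutes, flight_date
FROM flights_csv
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "airline_db"},
    ResultConfiguration={"OutputLocation": "s3://airline-analytics-query-results/"},
)

Columnar Parquet lets Athena fetch only the data blocks needed for a query's predicates, and compression plus a same-Region bucket reduce the bytes scanned and transferred as the data lake grows.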





