Free AWS-Certified-Big-Data-Specialty Exam Braindumps (page: 15)


Which activities can be run by AWS Data Pipeline? (Choose two)

  A. Moving data from one location to another
  B. Running Hive queries
  C. Backing up a primary database to a replica
  D. Creating Hive queries

Answer(s): A,B

Explanation:

An AWS Data Pipeline activity can be used to move data from one location to another or to run Hive queries.


Reference:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-managingpipeline.html
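As an illustration, below is a minimal boto3 sketch of a pipeline definition containing both kinds of activity. The pipeline ID, bucket names, paths, and Hive script are hypothetical placeholders, and a real definition would also attach compute resources (an Ec2Resource or EmrCluster referenced via "runsOn").

```python
import boto3

# A minimal sketch, not a complete definition: all IDs, bucket names, and
# paths below are hypothetical placeholders.
dp = boto3.client("datapipeline")

pipeline_objects = [
    # CopyActivity: moves data between two S3 data nodes.
    {"id": "InputNode", "name": "InputNode",
     "fields": [{"key": "type", "stringValue": "S3DataNode"},
                {"key": "directoryPath", "stringValue": "s3://example-bucket/input/"}]},
    {"id": "OutputNode", "name": "OutputNode",
     "fields": [{"key": "type", "stringValue": "S3DataNode"},
                {"key": "directoryPath", "stringValue": "s3://example-bucket/output/"}]},
    {"id": "MoveData", "name": "MoveData",
     "fields": [{"key": "type", "stringValue": "CopyActivity"},
                {"key": "input", "refValue": "InputNode"},
                {"key": "output", "refValue": "OutputNode"}]},
    # HiveActivity: runs a Hive query on an EMR cluster (the EmrCluster
    # resource it would run on is omitted here for brevity).
    {"id": "RunHiveQuery", "name": "RunHiveQuery",
     "fields": [{"key": "type", "stringValue": "HiveActivity"},
                {"key": "hiveScript", "stringValue": "SELECT * FROM source_table;"}]},
]

dp.put_pipeline_definition(pipelineId="df-EXAMPLE",
                           pipelineObjects=pipeline_objects)
```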



With reference to Hadoop MapReduce in Amazon EMR, which of the following best describes "a user-defined unit of processing, mapping roughly to one algorithm that manipulates the data"?

  A. A cluster map
  B. A multi-cluster
  C. A cluster store
  D. A cluster step

Answer(s): D

Explanation:

A cluster step is a user-defined unit of processing, mapping roughly to one algorithm that manipulates the data.
A step is a Hadoop MapReduce application implemented as a Java jar or a streaming program written in Java, Ruby, Perl, Python, PHP, R, or C++.
For example, to count the frequency with which words appear in a document and output them sorted by the count, the first step would be a MapReduce application that counts the occurrences of each word, and the second step would be a MapReduce application that sorts the output of the first step by those counts.


Reference:

https://aws.amazon.com/elasticmapreduce/faqs/
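The two-step word-count example can be expressed through the EMR API. The sketch below submits two Hadoop streaming steps with boto3; the cluster ID, bucket, and mapper/reducer scripts are hypothetical. EMR queues submitted steps and runs them in order, so the second step consumes the first step's output.

```python
import boto3

emr = boto3.client("emr")

# A sketch of the two-step word-count example: step 1 counts word
# occurrences, step 2 sorts step 1's output by count. The cluster ID,
# bucket, and scripts are hypothetical placeholders.
emr.add_job_flow_steps(
    JobFlowId="j-EXAMPLE",
    Steps=[
        {"Name": "CountWords",
         "ActionOnFailure": "CONTINUE",
         "HadoopJarStep": {
             "Jar": "command-runner.jar",
             "Args": ["hadoop-streaming",
                      "-files", "s3://example-bucket/scripts/count_mapper.py,"
                                "s3://example-bucket/scripts/count_reducer.py",
                      "-mapper", "count_mapper.py",
                      "-reducer", "count_reducer.py",
                      "-input", "s3://example-bucket/documents/",
                      "-output", "s3://example-bucket/counts/"]}},
        {"Name": "SortByCount",
         "ActionOnFailure": "CONTINUE",
         "HadoopJarStep": {
             "Jar": "command-runner.jar",
             "Args": ["hadoop-streaming",
                      "-files", "s3://example-bucket/scripts/sort_mapper.py,"
                                "s3://example-bucket/scripts/sort_reducer.py",
                      "-mapper", "sort_mapper.py",
                      "-reducer", "sort_reducer.py",
                      "-input", "s3://example-bucket/counts/",
                      "-output", "s3://example-bucket/sorted/"]}},
    ],
)
```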



Which of the statements below are true of AWS Data Pipeline activities? (Choose two)

  A. Data Pipeline activities are fixed to ensure no change or script error impacts processing speed
  B. Data Pipeline activities are extensible, but only up to 256 characters
  C. When you define your pipeline, you can choose to execute it on activation or create a schedule to execute it on a regular basis
  D. Data Pipeline activities are extensible, so you can run your own custom scripts to support endless combinations

Answer(s): C,D

Explanation:

Data Pipeline activities are extensible, so you can run your own custom scripts to support endless combinations. When you define your pipeline, you can choose to execute it on activation or create a schedule to execute it on a regular basis.


Reference:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-managingpipeline.html
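A brief sketch of both execution modes follows, with a hypothetical pipeline ID: on-demand activation through the API, and a Schedule object that makes the pipeline run on a recurring basis.

```python
import boto3

dp = boto3.client("datapipeline")

# Mode 1: execute on activation (on demand). The pipeline ID is a
# hypothetical placeholder.
dp.activate_pipeline(pipelineId="df-EXAMPLE")

# Mode 2: include a Schedule object in the pipeline definition so runs
# recur on a regular basis -- here once per day from a fixed start time
# (values are illustrative).
schedule = {
    "id": "DailySchedule",
    "name": "DailySchedule",
    "fields": [
        {"key": "type", "stringValue": "Schedule"},
        {"key": "period", "stringValue": "1 day"},
        {"key": "startDateTime", "stringValue": "2024-01-01T00:00:00"},
    ],
}
# Other pipeline objects opt into this schedule by carrying a
# {"key": "schedule", "refValue": "DailySchedule"} field.
```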



Which of the following data sources are supported by AWS Data Pipeline? (Choose three)

  A. Databases accessed via JDBC
  B. Amazon Redshift
  C. Amazon RDS databases
  D. Amazon ElastiCache

Answer(s): A,B,C
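AWS Data Pipeline can read from JDBC-accessible databases, Amazon Redshift, and Amazon RDS databases; ElastiCache is not a supported data source. For reference, below is a sketch of how these three source types appear as pipeline objects; the connection strings, instance IDs, and table names are hypothetical.

```python
# Sketch of Data Pipeline objects for the three supported source types.
# All identifiers, connection strings, and table names are hypothetical.
jdbc_database = {
    "id": "MyJdbcDatabase", "name": "MyJdbcDatabase",
    "fields": [
        {"key": "type", "stringValue": "JdbcDatabase"},
        {"key": "connectionString",
         "stringValue": "jdbc:mysql://example-host:3306/mydb"},
        {"key": "jdbcDriverClass", "stringValue": "com.mysql.jdbc.Driver"},
    ],
}
rds_database = {
    "id": "MyRdsDatabase", "name": "MyRdsDatabase",
    "fields": [
        {"key": "type", "stringValue": "RdsDatabase"},
        {"key": "rdsInstanceId", "stringValue": "my-rds-instance"},
    ],
}
redshift_node = {
    "id": "MyRedshiftTable", "name": "MyRedshiftTable",
    "fields": [
        {"key": "type", "stringValue": "RedshiftDataNode"},
        {"key": "tableName", "stringValue": "public.events"},
        # A RedshiftDataNode also references a RedshiftDatabase object
        # via a {"key": "database", "refValue": ...} field.
    ],
}
```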





