Free AWS-Certified-Big-Data-Specialty Exam Braindumps (page: 15)


Which activities can be run by AWS Data Pipeline? (Choose two)

  A. Moving data from one location to another
  B. Running Hive queries
  C. Backing up a primary database to a replica
  D. Creating Hive queries

Answer(s): A,B

Explanation:

An AWS Data Pipeline activity can be used to move data from one location to another or to run Hive queries.


Reference:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-managingpipeline.html
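As an illustration, below is a minimal boto3 sketch of a pipeline definition containing both kinds of activity. The pipeline ID, bucket names, paths, and Hive script are hypothetical placeholders, and a real definition would also attach compute resources (an Ec2Resource or EmrCluster referenced via "runsOn").

```python
import boto3

# A minimal sketch, not a complete definition: all IDs, bucket names, and
# paths below are hypothetical placeholders.
dp = boto3.client("datapipeline")

pipeline_objects = [
    # CopyActivity: moves data between two S3 data nodes.
    {"id": "InputNode", "name": "InputNode",
     "fields": [{"key": "type", "stringValue": "S3DataNode"},
                {"key": "directoryPath", "stringValue": "s3://example-bucket/input/"}]},
    {"id": "OutputNode", "name": "OutputNode",
     "fields": [{"key": "type", "stringValue": "S3DataNode"},
                {"key": "directoryPath", "stringValue": "s3://example-bucket/output/"}]},
    {"id": "MoveData", "name": "MoveData",
     "fields": [{"key": "type", "stringValue": "CopyActivity"},
                {"key": "input", "refValue": "InputNode"},
                {"key": "output", "refValue": "OutputNode"}]},
    # HiveActivity: runs a Hive query on an EMR cluster (the EmrCluster
    # resource it would run on is omitted here for brevity).
    {"id": "RunHiveQuery", "name": "RunHiveQuery",
     "fields": [{"key": "type", "stringValue": "HiveActivity"},
                {"key": "hiveScript", "stringValue": "SELECT * FROM source_table;"}]},
]

dp.put_pipeline_definition(pipelineId="df-EXAMPLE",
                           pipelineObjects=pipeline_objects)
```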



With reference to Hadoop MapReduce in Amazon EMR, which of the following best describes "a user-defined unit of processing, mapping roughly to one algorithm that manipulates the data"?

  A. A cluster map
  B. A multi-cluster
  C. A cluster store
  D. A cluster step

Answer(s): D

Explanation:

A cluster step is a user-defined unit of processing, mapping roughly to one algorithm that manipulates the data.
A step is a Hadoop MapReduce application implemented as a Java jar or a streaming program written in Java, Ruby, Perl, Python, PHP, R, or C++.
For example, to count the frequency with which words appear in a document and output them sorted by the count, the first step would be a MapReduce application that counts the occurrences of each word, and the second step would be a MapReduce application that sorts the output of the first step by those counts.


Reference:

https://aws.amazon.com/elasticmapreduce/faqs/
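The two-step word-count example can be expressed through the EMR API. The sketch below submits two Hadoop streaming steps with boto3; the cluster ID, bucket, and mapper/reducer scripts are hypothetical. EMR queues submitted steps and runs them in order, so the second step consumes the first step's output.

```python
import boto3

emr = boto3.client("emr")

# A sketch of the two-step word-count example: step 1 counts word
# occurrences, step 2 sorts step 1's output by count. The cluster ID,
# bucket, and scripts are hypothetical placeholders.
emr.add_job_flow_steps(
    JobFlowId="j-EXAMPLE",
    Steps=[
        {"Name": "CountWords",
         "ActionOnFailure": "CONTINUE",
         "HadoopJarStep": {
             "Jar": "command-runner.jar",
             "Args": ["hadoop-streaming",
                      "-files", "s3://example-bucket/scripts/count_mapper.py,"
                                "s3://example-bucket/scripts/count_reducer.py",
                      "-mapper", "count_mapper.py",
                      "-reducer", "count_reducer.py",
                      "-input", "s3://example-bucket/documents/",
                      "-output", "s3://example-bucket/counts/"]}},
        {"Name": "SortByCount",
         "ActionOnFailure": "CONTINUE",
         "HadoopJarStep": {
             "Jar": "command-runner.jar",
             "Args": ["hadoop-streaming",
                      "-files", "s3://example-bucket/scripts/sort_mapper.py,"
                                "s3://example-bucket/scripts/sort_reducer.py",
                      "-mapper", "sort_mapper.py",
                      "-reducer", "sort_reducer.py",
                      "-input", "s3://example-bucket/counts/",
                      "-output", "s3://example-bucket/sorted/"]}},
    ],
)
```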



Which of the statements below are true of AWS Data Pipeline activities? (Choose two)

  A. Data Pipeline activities are fixed to ensure no change or script error impacts processing speed
  B. Data Pipeline activities are extensible, but only up to 256 characters
  C. When you define your pipeline, you can choose to execute it on activation or create a schedule to execute it on a regular basis
  D. Data Pipeline activities are extensible, so you can run your own custom scripts to support endless combinations

Answer(s): C,D

Explanation:

Data Pipeline activities are extensible, so you can run your own custom scripts to support endless combinations. When you define your pipeline, you can choose to execute it on activation or create a schedule to execute it on a regular basis.


Reference:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-managingpipeline.html
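A brief sketch of both execution modes follows, with a hypothetical pipeline ID: on-demand activation through the API, and a Schedule object that makes the pipeline run on a recurring basis.

```python
import boto3

dp = boto3.client("datapipeline")

# Mode 1: execute on activation (on demand). The pipeline ID is a
# hypothetical placeholder.
dp.activate_pipeline(pipelineId="df-EXAMPLE")

# Mode 2: include a Schedule object in the pipeline definition so runs
# recur on a regular basis -- here once per day from a fixed start time
# (values are illustrative).
schedule = {
    "id": "DailySchedule",
    "name": "DailySchedule",
    "fields": [
        {"key": "type", "stringValue": "Schedule"},
        {"key": "period", "stringValue": "1 day"},
        {"key": "startDateTime", "stringValue": "2024-01-01T00:00:00"},
    ],
}
# Other pipeline objects opt into this schedule by carrying a
# {"key": "schedule", "refValue": "DailySchedule"} field.
```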



Which of the following data sources are supported by AWS Data Pipeline? (Choose three)

  A. Databases accessed via JDBC
  B. Amazon Redshift
  C. Amazon RDS databases
  D. Amazon ElastiCache

Answer(s): A,B,C
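AWS Data Pipeline can read from JDBC-accessible databases, Amazon Redshift, and Amazon RDS databases; ElastiCache is not a supported data source. For reference, below is a sketch of how these three source types appear as pipeline objects; the connection strings, instance IDs, and table names are hypothetical.

```python
# Sketch of Data Pipeline objects for the three supported source types.
# All identifiers, connection strings, and table names are hypothetical.
jdbc_database = {
    "id": "MyJdbcDatabase", "name": "MyJdbcDatabase",
    "fields": [
        {"key": "type", "stringValue": "JdbcDatabase"},
        {"key": "connectionString",
         "stringValue": "jdbc:mysql://example-host:3306/mydb"},
        {"key": "jdbcDriverClass", "stringValue": "com.mysql.jdbc.Driver"},
    ],
}
rds_database = {
    "id": "MyRdsDatabase", "name": "MyRdsDatabase",
    "fields": [
        {"key": "type", "stringValue": "RdsDatabase"},
        {"key": "rdsInstanceId", "stringValue": "my-rds-instance"},
    ],
}
redshift_node = {
    "id": "MyRedshiftTable", "name": "MyRedshiftTable",
    "fields": [
        {"key": "type", "stringValue": "RedshiftDataNode"},
        {"key": "tableName", "stringValue": "public.events"},
        # A RedshiftDataNode also references a RedshiftDatabase object
        # via a {"key": "database", "refValue": ...} field.
    ],
}
```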





