Free AWS-Certified-Big-Data-Specialty Exam Braindumps (page: 17)


Does the EMR Hadoop input connector for Kinesis enable continuous stream processing?

  A. Only in some regions
  B. Yes
  C. No
  D. Only if the iteration process succeeds

Answer(s): C

Explanation:

The Hadoop MapReduce framework is a batch processing system. As such, it does not support continuous queries. However, there is an emerging set of Hadoop ecosystem frameworks, such as Twitter Storm and Spark Streaming, that enable developers to build applications for continuous stream processing. A Storm connector for Kinesis is available on GitHub, and AWS provides a tutorial explaining how to set up Spark Streaming on EMR and run continuous queries.
Additionally, developers can use the Kinesis Client Library (KCL) to develop real-time stream processing applications.
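As a minimal sketch of continuous processing against Kinesis outside of MapReduce, the loop below polls a shard with boto3 (a simplified version of what the KCL manages for you). The stream name is a placeholder, and the AWS-calling function is not meant to run without credentials:

```python
# Sketch: continuous Kinesis consumption with boto3, unlike a finite
# MapReduce batch job. "my-stream" is a hypothetical stream name.
import json

def process_records(records):
    """Decode Kinesis record payloads; each record's Data field is bytes."""
    events = []
    for record in records:
        payload = record["Data"]
        if isinstance(payload, (bytes, bytearray)):
            payload = payload.decode("utf-8")
        events.append(json.loads(payload))
    return events

def consume_forever(stream_name="my-stream"):
    # Requires AWS credentials; shown for illustration only.
    import time
    import boto3
    kinesis = boto3.client("kinesis")
    shard_id = kinesis.describe_stream(StreamName=stream_name)[
        "StreamDescription"]["Shards"][0]["ShardId"]
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name, ShardId=shard_id,
        ShardIteratorType="LATEST")["ShardIterator"]
    while True:  # a continuous query loop: it never "completes"
        resp = kinesis.get_records(ShardIterator=iterator, Limit=100)
        for event in process_records(resp["Records"]):
            print(event)
        iterator = resp["NextShardIterator"]
        time.sleep(1)
```

In production, the KCL additionally handles checkpointing, shard rebalancing, and multi-shard coordination, which this single-shard sketch omits.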


Reference:

https://aws.amazon.com/elasticmapreduce/faqs/



In AWS Data Pipeline, which preconditions (pipeline components containing conditional statements) must be true before an activity can run? (choose three)

  A. Check whether an Amazon S3 key is present
  B. Check whether source data is present before a pipeline activity attempts to copy it
  C. Check if the Hive script has compile errors in it
  D. Check whether a database table exists

Answer(s): A,B,D

Explanation:

The following preconditions must be true before an AWS Data Pipeline activity will run: check whether source data is present before a pipeline activity attempts to copy it, check whether a database table exists, and check whether an Amazon S3 key is present.
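As a sketch, these checks correspond to built-in precondition object types in a pipeline definition (the format accepted by boto3's put_pipeline_definition). The bucket, key, and table names below are placeholders, and the database check is illustrated with the DynamoDB variant:

```python
# Sketch of precondition objects in Data Pipeline's object/field format.
# All resource names are hypothetical placeholders.
pipeline_objects = [
    {"id": "S3KeyCheck", "name": "S3KeyCheck", "fields": [
        {"key": "type", "stringValue": "S3KeyExists"},          # S3 key present
        {"key": "s3Key", "stringValue": "s3://example-bucket/input/data.csv"}]},
    {"id": "SourceDataCheck", "name": "SourceDataCheck", "fields": [
        {"key": "type", "stringValue": "Exists"}]},             # source data node exists
    {"id": "TableCheck", "name": "TableCheck", "fields": [
        {"key": "type", "stringValue": "DynamoDBTableExists"},  # table exists
        {"key": "tableName", "stringValue": "example-table"}]},
]

# Each object's "type" field names the built-in conditional check.
types = [f["stringValue"] for obj in pipeline_objects
         for f in obj["fields"] if f["key"] == "type"]
print(types)
```

An activity then references a precondition via a `precondition` field, and Data Pipeline evaluates it before scheduling the activity.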


Reference:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-managingpipeline.html



What are the supported ways you can use Task Runner to process your AWS Data Pipeline tasks? (choose three)

  A. Install Task Runner on a long-running EC2 instance.
  B. Install Task Runner on a computational resource that you manage.
  C. Install Task Runner on an AWS Database Migration Service instance.
  D. Enable AWS Data Pipeline to install Task Runner for you on resources that are launched and managed by the AWS Data Pipeline web service.

Answer(s): A,B,D

Explanation:

Task Runner supports two use cases: enable AWS Data Pipeline to install Task Runner for you on resources that are launched and managed by the AWS Data Pipeline web service, or install Task Runner on a computational resource that you manage, such as a long-running EC2 instance or an on-premises server.
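For the self-managed case, Task Runner is essentially a polling worker. A minimal sketch of that loop using boto3's Data Pipeline API is below; the worker group name and the `do_work` helper are hypothetical, and the AWS-calling function is not meant to run without credentials:

```python
# Sketch of the polling loop Task Runner performs on a resource you
# manage. "my-worker-group" and do_work() are placeholders.
def task_status_for(exit_code):
    """Map a task's exit code to a Data Pipeline task status."""
    return "FINISHED" if exit_code == 0 else "FAILED"

def run_worker(worker_group="my-worker-group"):
    # Requires AWS credentials; shown for illustration only.
    import boto3
    dp = boto3.client("datapipeline")
    while True:
        resp = dp.poll_for_task(workerGroup=worker_group)
        task = resp.get("taskObject")
        if not task:
            continue  # long poll returned no work; poll again
        exit_code = do_work(task)  # placeholder for your task logic
        dp.set_task_status(taskId=task["taskId"],
                           taskStatus=task_status_for(exit_code))
```

The real Task Runner adds heartbeating via report_task_progress and richer error reporting, but the poll/execute/report cycle is the core of both deployment modes.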


Reference:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-managingpipeline.html



In AWS Data Pipeline, data nodes are used for (choose two)

  A. Loading data to the target
  B. Accessing data from the source
  C. Processing data transformations
  D. Storing logs

Answer(s): A,B

Explanation:

In AWS Data Pipeline, data nodes are used for accessing data from the source and loading data to the target.
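As a sketch of both roles, the definition below declares two S3DataNode objects, with a CopyActivity reading from one (source) and writing to the other (target). All paths and ids are placeholders:

```python
# Sketch: data nodes as source and target of a CopyActivity, in Data
# Pipeline's object/field format. Paths are hypothetical placeholders.
pipeline_objects = [
    {"id": "InputNode", "name": "InputNode", "fields": [
        {"key": "type", "stringValue": "S3DataNode"},
        {"key": "filePath", "stringValue": "s3://example-bucket/source/data.csv"}]},
    {"id": "OutputNode", "name": "OutputNode", "fields": [
        {"key": "type", "stringValue": "S3DataNode"},
        {"key": "filePath", "stringValue": "s3://example-bucket/target/data.csv"}]},
    {"id": "CopyData", "name": "CopyData", "fields": [
        {"key": "type", "stringValue": "CopyActivity"},
        {"key": "input", "refValue": "InputNode"},    # data node: access source
        {"key": "output", "refValue": "OutputNode"},  # data node: load target
    ]},
]

# The activity transforms/moves data; the data nodes only describe
# where data lives, which is why options C and D are wrong.
node_ids = [o["id"] for o in pipeline_objects
            if {"key": "type", "stringValue": "S3DataNode"} in o["fields"]]
print(node_ids)
```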


Reference:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-concepts-datanodes.html





