Free CCA175 Exam Braindumps

Problem Scenario 16 : You have been given the following MySQL database details as well as other info.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the assignment below.

1. Create a table in hive as below.
create table departments_hive(department_id int, department_name string);
2. Now import data from the MySQL table departments to this Hive table. Please make sure the data is visible using the below Hive command:
select * from departments_hive

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1: Create the Hive table as specified.
hive
show tables;
create table departments_hive(department_id int, department_name string);
Step 2: The important point here is that when we create a table without specifying field delimiters, Hive's default field delimiter is ^A (\001). Hence, while importing the data we have to provide the matching delimiter.
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--table departments \
--hive-home /user/hive/warehouse \
--hive-import \
--hive-overwrite \
--hive-table departments_hive \
--fields-terminated-by '\001'
Step 3: Check the data in the directory.
hdfs dfs -ls /user/hive/warehouse/departments_hive
hdfs dfs -cat /user/hive/warehouse/departments_hive/part*
Check the data in the Hive table.
select * from departments_hive;
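You can also confirm that the ^A delimiter landed correctly by reading the warehouse files from pyspark (a minimal sketch; it assumes the default warehouse path used above):
rows = sc.textFile("/user/hive/warehouse/departments_hive")
rows.map(lambda line: line.split("\001")).take(5) # each element should be [department_id, department_name]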



Problem Scenario 68 : You have been given a file as below.
spark75/file1.txt
The file contains some text, as given below:
Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common and should be automatically handled by the framework
The core of Apache Hadoop consists of a storage part known as Hadoop Distributed File System (HDFS) and a processing part called MapReduce. Hadoop splits files into large blocks and distributes them across nodes in a cluster. To process data, Hadoop transfers packaged code for nodes to process in parallel based on the data that needs to be processed.
This approach takes advantage of data locality, nodes manipulating the data they have access to, to allow the dataset to be processed faster and more efficiently than it would be in a more conventional supercomputer architecture that relies on a parallel file system where computation and data are distributed via high-speed networking.
For a slightly more complicated task, let's look into splitting up sentences from our documents into word bigrams. A bigram is a pair of successive tokens in some sequence. We will look at building bigrams from the sequences of words in each sentence, and then try to find the most frequently occurring ones.
The first problem is that values in each partition of our initial RDD describe lines from the file rather than sentences. Sentences may be split over multiple lines. The glom() RDD method is used to create a single entry for each document containing the list of all lines; we can then join the lines up, then resplit them into sentences using "." as the separator, using flatMap so that every object in our RDD is now a sentence.
A bigram is a pair of successive tokens in some sequence. Please build bigrams from the sequences of words in each sentence, and then try to find the most frequently occurring ones.

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1: Create the file in HDFS (we will do this using Hue). However, you can first create it in the local filesystem and then upload it to HDFS.
Step 2: The first problem is that values in each partition of our initial RDD describe lines from the file rather than sentences. Sentences may be split over multiple lines. The glom() RDD method is used to create a single entry for each document containing the list of all lines; we can then join the lines up, then resplit them into sentences using "." as the separator, using flatMap so that every object in our RDD is now a sentence.
sentences = sc.textFile("spark75/file1.txt") \
.glom() \
.map(lambda x: " ".join(x)) \
.flatMap(lambda x: x.split("."))
Step 3: Now that we have isolated each sentence, we can split it into a list of words and extract the word bigrams from it. Our new RDD contains tuples containing the word bigram (itself a tuple containing the first and second word) as the first value and the number 1 as the second value.
bigrams = sentences.map(lambda x: x.split()) \
.flatMap(lambda x: [((x[i], x[i+1]), 1) for i in range(0, len(x)-1)])
Step 4: Finally we can apply the same reduceByKey and sort steps that we used in the wordcount example, to count up the bigrams and sort them in order of descending frequency. In reduceByKey the key is not an individual word but a bigram.
freq_bigrams = bigrams.reduceByKey(lambda x, y: x + y) \
.map(lambda x: (x[1], x[0])) \
.sortByKey(False)
freq_bigrams.take(10)
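Note that glom() turns each partition of an RDD into a single list of its elements, which is what lets the lines be re-joined before resplitting on ".". A quick illustration with hypothetical toy data:
sc.parallelize(["a", "b", "c", "d"], 2).glom().collect()
# returns [['a', 'b'], ['c', 'd']] - one list per partition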



Problem Scenario 79 : You have been given a MySQL DB with the following details.
user=retail_dba
password=cloudera
database=retail_db
table=retail_db.orders
table=retail_db.order_items
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Columns of products table : (product_id | product_category_id | product_name | product_description | product_price | product_image )
Please accomplish following activities.

1. Copy "retaildb.products" table to hdfs in a directory p93_products
2. Filter out all the empty prices
3. Sort all the products based on price in both ascending as well as descending order.
4. Sort all the products based on price as well as product_id in descending order.
5. Use the below functions to do data ordering or ranking and fetch the top 10 elements: top(), takeOrdered(), sortByKey()

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1: Import a single table.
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=products --target-dir=p93_products -m 1
Note: Make sure you don't have a space before or after the '=' sign. Sqoop uses the MapReduce framework to copy data from the RDBMS to HDFS.
Step 2: Read the data from one of the partitions created using the above command.
hadoop fs -cat p93_products/part-m-00000
Step 3: Load this directory as an RDD using Spark and Python (open a pyspark terminal and do the following).
productsRDD = sc.textFile("p93_products")
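Before filtering, it can help to eyeball one raw record as a sanity check (the field positions are the ones given in the problem statement):
print(productsRDD.first()) # the 5th comma-separated field (index 4) should be product_price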
Step 4: Filter out empty prices, if any exist.
# filter out lines with empty prices
nonempty_lines = productsRDD.filter(lambda x: len(x.split(",")[4]) > 0)
Step 5: Now sort the data based on product_price in ascending order.
sortedPriceProducts = nonempty_lines.map(lambda line: (float(line.split(",")[4]), line.split(",")[2])).sortByKey()
for line in sortedPriceProducts.collect(): print(line)
Step 6: Now sort data based on product_price in descending order.
sortedPriceProducts = nonempty_lines.map(lambda line: (float(line.split(",")[4]), line.split(",")[2])).sortByKey(False)
for line in sortedPriceProducts.collect(): print(line)
Step 7: Get the highest-priced product's name.
sortedPriceProducts = nonempty_lines.map(lambda line: (float(line.split(",")[4]), line.split(",")[2])).sortByKey(False).take(1)
print(sortedPriceProducts)
Step 8: Now sort the data based on product_price as well as product_id in descending order.
# Don't forget to cast the strings
# Tuple as key ((price, id), name)
sortedPriceProducts = nonempty_lines.map(lambda line: ((float(line.split(",")[4]), int(line.split(",")[0])), line.split(",")[2])).sortByKey(False)
for line in sortedPriceProducts.collect(): print(line)
Step 9: Now sort the data based on product_price as well as product_id in descending order, using the top() function.
# Don't forget to cast the strings
# Tuple as key ((price, id), name)
sortedPriceProducts = nonempty_lines.map(lambda line: ((float(line.split(",")[4]), int(line.split(",")[0])), line.split(",")[2])).top(10)
print(sortedPriceProducts)
Step 10: Now sort the data based on product_price ascending and product_id ascending, using the takeOrdered() function.
# Don't forget to cast the strings
# Tuple as key ((price, id), name)
sortedPriceProducts = nonempty_lines.map(lambda line: ((float(line.split(",")[4]), int(line.split(",")[0])), line.split(",")[2])).takeOrdered(10, lambda tuple: (tuple[0][0], tuple[0][1]))
Step 11: Now sort the data based on product_price descending and product_id ascending, using the takeOrdered() function.
# Don't forget to cast the strings
# Tuple as key ((price, id), name)
# Using a minus (-) on the key gives you descending ordering, but only for numeric values.
sortedPriceProducts = nonempty_lines.map(lambda line: ((float(line.split(",")[4]), int(line.split(",")[0])), line.split(",")[2])).takeOrdered(10, lambda tuple: (-tuple[0][0], tuple[0][1]))
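The difference between the two ranking helpers is worth remembering: top(n) returns the n largest elements (by natural or key ordering), while takeOrdered(n, key) returns the n smallest by the key unless you invert it. A quick illustration with hypothetical toy data:
pairs = sc.parallelize([(10.0, 1), (5.0, 2), (10.0, 3)])
pairs.top(2) # [(10.0, 3), (10.0, 1)] - largest first
pairs.takeOrdered(2) # [(5.0, 2), (10.0, 1)] - smallest first
pairs.takeOrdered(2, lambda t: (-t[0], t[1])) # first value descending, second ascending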



Problem Scenario 49 : You have been given the below code snippet (do a sum of values by key), with intermediate output.
val keysWithValuesList = Array("foo=A", "foo=A", "foo=A", "foo=A", "foo=B", "bar=C", "bar=D", "bar=D")
val data = sc.parallelize(keysWithValuesList)
//Create key value pairs
val kv = data.map(_.split("=")).map(v => (v(0), v(1))).cache()
val initialCount = 0;
val countByKey = kv.aggregateByKey(initialCount)(addToCounts, sumPartitionCounts)
Now define two functions (addToCounts, sumPartitionCounts) that will produce the following results.
Output 1
countByKey.collect
res3: Array[(String, Int)] = Array((foo, 5), (bar, 3))
import scala.collection._
val initialSet = scala.collection.mutable.HashSet.empty[String]
val uniqueByKey = kv.aggregateByKey(initialSet)(addToSet, mergePartitionSets)
Now define two functions (addToSet, mergePartitionSets) that will produce the following results.
Output 2:
uniqueByKey.collect
res4: Array[(String, scala.collection.mutable.HashSet[String])] = Array((foo, Set(B, A)), (bar, Set(C, D)))

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
val addToCounts = (n: Int, v: String) => n + 1
val sumPartitionCounts = (p1: Int, p2: Int) => p1 + p2
val addToSet = (s: mutable.HashSet[String], v: String) => s += v
val mergePartitionSets = (p1: mutable.HashSet[String], p2: mutable.HashSet[String]) => p1 ++= p2
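For comparison, here is the same pair of aggregations in pyspark (a hedged sketch, not part of the original Scala solution; the names mirror the Scala ones):
kv = sc.parallelize(["foo=A", "foo=A", "foo=A", "foo=A", "foo=B", "bar=C", "bar=D", "bar=D"]) \
.map(lambda s: s.split("=")).map(lambda v: (v[0], v[1]))
# seqOp adds 1 for each value seen; combOp sums the per-partition counts
countByKey = kv.aggregateByKey(0, lambda n, v: n + 1, lambda p1, p2: p1 + p2)
# seqOp adds each value to a set; combOp unions the per-partition sets
uniqueByKey = kv.aggregateByKey(set(), lambda s, v: s | {v}, lambda p1, p2: p1 | p2)
countByKey.collect() # e.g. [('foo', 5), ('bar', 3)]
uniqueByKey.collect() # e.g. [('foo', {'A', 'B'}), ('bar', {'C', 'D'})]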





