Free CCA175 Exam Braindumps (page: 5)

Problem Scenario 29 : Please accomplish the following exercises using HDFS command line options.

1. Create a directory in HDFS named hdfs_commands.
2. Create a file named data.txt in the hdfs_commands directory in HDFS.
3. Copy data.txt to the local filesystem, making sure that file properties such as
file permissions are not changed during the copy.
4. Create a file named data_local.txt in a local directory and move it to the
hdfs_commands directory in HDFS.
5. Create a file named data_hdfs.txt in the hdfs_commands directory and copy it to the local filesystem.
6. Create a file named file1.txt on the local filesystem and put it into HDFS.

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1: Create the directory.
hdfs dfs -mkdir hdfs_commands
Step 2: Create a file named data.txt in hdfs_commands.
hdfs dfs -touchz hdfs_commands/data.txt
Step 3: Copy data.txt to the local filesystem. The -p option preserves file properties such as permissions.
hdfs dfs -copyToLocal -p hdfs_commands/data.txt /home/cloudera/Desktop/HadoopExam
Step 4: Create a local file named data_local.txt and move it to the hdfs_commands directory in HDFS.
touch /home/cloudera/Desktop/HadoopExam/data_local.txt
hdfs dfs -moveFromLocal /home/cloudera/Desktop/HadoopExam/data_local.txt hdfs_commands/
Step 5: Create a file named data_hdfs.txt in the hdfs_commands directory and copy it to the local filesystem.
hdfs dfs -touchz hdfs_commands/data_hdfs.txt
hdfs dfs -get hdfs_commands/data_hdfs.txt /home/cloudera/Desktop/HadoopExam/
Step 6: Create a file named file1.txt on the local filesystem and put it into HDFS.
touch /home/cloudera/Desktop/HadoopExam/file1.txt
hdfs dfs -put /home/cloudera/Desktop/HadoopExam/file1.txt hdfs_commands/
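
The copy in Step 3 is meant to keep file properties such as permissions intact. One way to get a feel for what that means is to try the same idea on the local filesystem only. The sketch below is a local stand-in, not an HDFS command: it uses cp -p instead of hdfs dfs, and the temporary directory, the 640 mode, and the mode_of helper are illustrative assumptions that do not appear in the original solution.

```shell
#!/bin/sh
# Local-only illustration of "copy while preserving permissions".
# Assumption: cp -p stands in for a property-preserving copy; no Hadoop is needed.
workdir=$(mktemp -d)
mkdir "$workdir/copy"
src="$workdir/data.txt"
dst="$workdir/copy/data.txt"

: > "$src"            # create an empty file, similar in spirit to -touchz
chmod 640 "$src"      # pick a non-default mode so a lost mode would be visible
cp -p "$src" "$dst"   # -p keeps mode and timestamps on the copy

# Hypothetical helper: print the 10-character permission string of a file.
mode_of() { ls -l "$1" | cut -c1-10; }

src_mode=$(mode_of "$src")
dst_mode=$(mode_of "$dst")
echo "source: $src_mode  copy: $dst_mode"

rm -rf "$workdir"
```

The same comparison could be made after the HDFS copy by looking at the permission column of hdfs dfs -ls for the source and ls -l for the local copy.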



Problem Scenario 86 : In continuation of the previous question, please accomplish the following activities.

1. Select the maximum, minimum, average, and standard deviation of price, and the total quantity.
2. Select the minimum and maximum price for each product code.
3. Select the maximum, minimum, average, and standard deviation of price, and the total quantity, for each
product code; however, make sure the average and standard deviation have at most two
decimal places.
4. Select the product code and average price only where the product count is greater than or
equal to 3.
5. Select the maximum, minimum, average, and total of all the products for each code. Also
produce the same across all the products.

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1: Select the maximum, minimum, average, and standard deviation of price, and the total quantity.
val results = sqlContext.sql("""SELECT MAX(price) AS MAX, MIN(price) AS MIN, AVG(price) AS Average, STD(price) AS STD, SUM(quantity) AS total_products FROM products""")
results.show()
Step 2: Select the minimum and maximum price for each product code.
val results = sqlContext.sql("""SELECT code, MAX(price) AS `Highest Price`, MIN(price) AS `Lowest Price`
FROM products GROUP BY code""")
results.show()
Step 3: Select the maximum, minimum, average, and standard deviation of price, and the total quantity, for each product code; however, make sure the average and standard deviation have at most two decimal places.
val results = sqlContext.sql("""SELECT code, MAX(price), MIN(price), CAST(AVG(price) AS DECIMAL(7, 2)) AS `Average`, CAST(STD(price) AS DECIMAL(7, 2)) AS `Std Dev`, SUM(quantity) FROM products
GROUP BY code""")
results.show()
Step 4: Select the product code and average price only where the product count is greater than or equal to 3.
val results = sqlContext.sql("""SELECT code AS `Product Code`, COUNT(*) AS `Count`,
CAST(AVG(price) AS DECIMAL(7, 2)) AS `Average` FROM products GROUP BY code HAVING COUNT(*) >= 3""")
results.show()
Step 5: Select the maximum, minimum, average, and total of all the products for each code.
Also produce the same across all the products.
val results = sqlContext.sql("""SELECT
code,
MAX(price),
MIN(price),
CAST(AVG(price) AS DECIMAL(7, 2)) AS `Average`,
SUM(quantity)
FROM products
GROUP BY code
WITH ROLLUP""")
results.show()
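
To make the effect of GROUP BY code WITH ROLLUP in Step 5 more concrete, the following sketch reproduces the idea outside Spark: one aggregate row per code, plus one extra row computed across all rows. It is only an approximation written with awk. The three-column rows (code, quantity, price) are hypothetical sample data, not the real products table, and the pipe-separated output format is an assumption made for readability.

```shell
#!/bin/sh
# Hypothetical sample rows: code,quantity,price (NOT the real products table).
rows='PEN,5000,1.23
PEN,8000,1.25
PEN,2000,1.25
PEC,10000,0.48
PEC,8000,0.50'

rollup=$(printf '%s\n' "$rows" | awk -F, '
{
    code = $1; qty = $2 + 0; price = $3 + 0
    # Per-code aggregates: the GROUP BY code part.
    if (!(code in n) || price > max[code]) max[code] = price
    if (!(code in n) || price < min[code]) min[code] = price
    n[code]++; psum[code] += price; qsum[code] += qty
    # Aggregates over every row: the extra row that ROLLUP adds.
    if (NR == 1 || price > gmax) gmax = price
    if (NR == 1 || price < gmin) gmin = price
    gpsum += price; gqsum += qty
}
END {
    # Columns: code | max price | min price | average price (2 decimals) | total quantity
    for (c in n)
        printf "%s|%.2f|%.2f|%.2f|%d\n", c, max[c], min[c], psum[c] / n[c], qsum[c]
    printf "NULL|%.2f|%.2f|%.2f|%d\n", gmax, gmin, gpsum / NR, gqsum
}' | LC_ALL=C sort)

printf '%s\n' "$rollup"
```

The row whose code is shown as NULL plays the role of the grand-total row that the rollup query appends after the per-code rows.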



Problem Scenario 9 : You have been given the following MySQL database details as well as other information.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the following.

1. Import the departments table into a directory.
2. Import the departments table again into the same directory. (The directory already exists, so the
import should not overwrite it and should append the results instead.)
3. Also make sure the result fields are terminated by '|' and lines are terminated by '\n'.

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Clean up HDFS by removing the following directories if they already exist.
hadoop fs -rm -R departments
hadoop fs -rm -R categories
hadoop fs -rm -R products
hadoop fs -rm -R orders
hadoop fs -rm -R order_items
hadoop fs -rm -R customers
Step 2: Import the departments table as required.
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--table departments \
--target-dir=departments \
--fields-terminated-by '|' \
--lines-terminated-by '\n' \
-m 1
Step 3: Check the imported data.
hdfs dfs -ls departments
hdfs dfs -cat departments/part-m-00000
Step 4: Import the data again; this time it must be appended to the existing directory.
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--table departments \
--target-dir departments \
--append \
--fields-terminated-by '|' \
--lines-terminated-by '\n' \
-m 1
Step 5: Check the results again.
hdfs dfs -ls departments
hdfs dfs -cat departments/part-m-00001
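
Beyond reading the part files by eye, the delimiter and append requirements can also be checked mechanically once the files are available locally (for example after an hdfs dfs -get). The sketch below shows one possible check and runs without Hadoop: the two part files are simulated in a temporary directory, and the three sample rows are hypothetical placeholders rather than actual import output.

```shell
#!/bin/sh
# Simulated part files; in practice these would be fetched from the departments directory.
workdir=$(mktemp -d)
# Hypothetical sample rows in the expected id|name layout.
printf '2|Fitness\n3|Footwear\n4|Apparel\n' > "$workdir/part-m-00000"
# The appended import writes the same rows again into a second part file.
cp "$workdir/part-m-00000" "$workdir/part-m-00001"

# Count lines that do NOT split into exactly two '|' separated fields.
bad_lines=$(cat "$workdir"/part-m-* | awk -F'[|]' 'NF != 2' | wc -l | tr -d ' ')
# After an append, the total should be twice the row count of a single import.
total_lines=$(cat "$workdir"/part-m-* | wc -l | tr -d ' ')

echo "lines with unexpected field count: $bad_lines"
echo "total lines across part files: $total_lines"

rm -rf "$workdir"
```

A non-zero count of bad lines would suggest that the field delimiter was not applied as requested.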



Problem Scenario 51 : You have been given the code snippet below.
val a = sc.parallelize(List(1, 2, 1, 3), 1)
val b = a.map((_, "b"))
val c = a.map((_, "c"))
Operation_xyz
Write a correct code snippet for Operation_xyz which will produce the output below.
Output:
Array[(Int, (Iterable[String], Iterable[String]))] = Array(
(2, (ArrayBuffer(b), ArrayBuffer(c))),
(3, (ArrayBuffer(b), ArrayBuffer(c))),
(1, (ArrayBuffer(b, b), ArrayBuffer(c, c)))
)

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
b.cogroup(c).collect
cogroup [Pair], groupWith [Pair]
A very powerful set of functions that allow grouping up to 3 key-value RDDs together using their keys.
Another example:
val x = sc.parallelize(List((1, "apple"), (2, "banana"), (3, "orange"), (4, "kiwi")), 2)
val y = sc.parallelize(List((5, "computer"), (1, "laptop"), (1, "desktop"), (4, "iPad")), 2)
x.cogroup(y).collect
Array[(Int, (Iterable[String], Iterable[String]))] = Array(
(4, (ArrayBuffer(kiwi), ArrayBuffer(iPad))),
(2, (ArrayBuffer(banana), ArrayBuffer())),
(3, (ArrayBuffer(orange), ArrayBuffer())),
(1, (ArrayBuffer(apple), ArrayBuffer(laptop, desktop))),
(5, (ArrayBuffer(), ArrayBuffer(computer)))
)





