Free CCA175 Exam Braindumps (page: 11)


Problem Scenario 3: You have been given a MySQL DB with the following details.
user=retail_dba
password=cloudera
database=retail_db
table=retail_db.categories
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the following activities.

1. Import data from the categories table where category_id = 22 (data should be stored in categories_subset)
2. Import data from the categories table where category_id > 22 (data should be stored in categories_subset_2)
3. Import data from the categories table where category_id is between 1 and 22 (data should be stored in categories_subset_3)
4. While importing categories data, change the delimiter to '|' (data should be stored in categories_subset_6)
5. Import data from the categories table, restricting the import to the category_name and category_id columns only, with '|' as the delimiter
6. Add null values in the table using the SQL statements below:
ALTER TABLE categories MODIFY category_department_id int(11);
INSERT INTO categories VALUES (60, NULL, 'TESTING');
7. Import data from the categories table (into the categories_subset_17 directory) using the '|' delimiter and category_id between 1 and 61, and encode null values for both string and non-string columns
8. Import the entire retail_db schema into the directory categories_subset_all_tables

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:

Step 1: Import a single table (subset of data). Note: the quote character used around column names below is the backtick (`), found on the same key as ~.
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset --where "\`category_id\`=22" -m 1
Step 2: Check the output partition
hdfs dfs -cat categories_subset/categories/part-m-00000
Step 3: Change the selection criteria (subset of data).
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_2 --where "\`category_id\` > 22" -m 1
Step 4: Check the output partition
hdfs dfs -cat categories_subset_2/categories/part-m-00000
Step 5: Use between clause (Subset data)
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_3 --where "\`category_id\` between 1 and 22" -m 1
Step 6: Check the output partition
hdfs dfs -cat categories_subset_3/categories/part-m-00000
Step 7: Changing the delimiter during import.
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_6 --where "\`category_id\` between 1 and 22" --fields-terminated-by='|' -m 1
Step 8: Check the output partition
hdfs dfs -cat categories_subset_6/categories/part-m-00000
Step 9: Selecting subset columns
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_col --where "\`category_id\` between 1 and 22" --fields-terminated-by='|' --columns=category_name,category_id -m 1
Step 10: Check the output partition
hdfs dfs -cat categories_subset_col/categories/part-m-00000
Step 11: Insert a record with null values (using MySQL).
ALTER TABLE categories MODIFY category_department_id int(11);
INSERT INTO categories VALUES (60, NULL, 'TESTING');
select * from categories;
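If no mysql client is at hand, the same statements can also be run through sqoop eval; this is a workflow suggestion, not part of the original solution:
sqoop eval --connect jdbc:mysql://quickstart:3306/retail_db --username retail_dba --password cloudera --query "ALTER TABLE categories MODIFY category_department_id int(11)"
sqoop eval --connect jdbc:mysql://quickstart:3306/retail_db --username retail_dba --password cloudera --query "INSERT INTO categories VALUES (60, NULL, 'TESTING')"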
Step 12: Encode null values for both string and non-string columns.
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_17 --where "\`category_id\` between 1 and 61" --fields-terminated-by='|' --null-string='N' --null-non-string='N' -m 1
Step 13: View the content
hdfs dfs -cat categories_subset_17/categories/part-m-00000
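With this encoding, the row inserted in Step 11 should appear in the output roughly as follows (assuming the reconstructed insert values above):
60|N|TESTING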
Step 14: Import all the tables from a schema (this step will take a little time).
sqoop import-all-tables --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --warehouse-dir=categories_subset_all_tables
Step 15: View the contents
hdfs dfs -ls categories_subset_all_tables
Step 16: Clean up, restoring the original table definition.
delete from categories where category_id in (59, 60);
ALTER TABLE categories MODIFY category_department_id int(11) NOT NULL;
ALTER TABLE categories MODIFY category_name varchar(45) NOT NULL;
desc categories;
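As a side note, the same kind of subset import can be expressed with a free-form query instead of --table/--where; a minimal sketch (the target directory name here is illustrative, not part of the original solution):
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username retail_dba --password cloudera --query "SELECT category_id, category_department_id, category_name FROM categories WHERE category_id BETWEEN 1 AND 22 AND \$CONDITIONS" --target-dir categories_subset_query -m 1
With --query, Sqoop requires the literal $CONDITIONS token in the WHERE clause and an explicit --target-dir; since -m 1 is used, no --split-by is needed.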



Problem Scenario 66: You have been given the below code snippet.
val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "spider", "eagle"), 2)
val b = a.keyBy(_.length)
val c = sc.parallelize(List("ant", "falcon", "squid"), 2)
val d = c.keyBy(_.length)
operation1

Write a correct code snippet for operation1 which will produce the desired output, shown below.
Array[(Int, String)] = Array((4,lion))

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
b.subtractByKey(d).collect
subtractByKey [Pair]: Very similar to subtract, but instead of supplying a function, the key component of each pair is automatically used as the criterion for removing items from the first RDD.
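To see this in spark-shell: b's keys are 3, 5, 4, 3, 6, 5 and d's keys are 3, 6, 5, so subtractByKey keeps only the pair whose key is absent from d. The plain subtract call is shown only for contrast and is not part of the answer:
b.subtractByKey(d).collect   // Array((4,lion)) -- key 4 never occurs in d
b.subtract(d).collect        // returns all six pairs of b: no exact (Int, String) pair of b appears in d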



Problem Scenario 89: You have been given the below patient data in CSV format:
patientID, name, dateOfBirth, lastVisitDate
1001, Ah Teck, 1991-12-31, 2012-01-20
1002, Kumar, 2011-10-29, 2012-09-20
1003, Ali, 2011-01-30, 2012-10-21
Accomplish the following activities.

1. Find all the patients whose lastVisitDate is between the current time and '2012-09-15'
2. Find all the patients who were born in 2011
3. Find all the patients' ages
4. List patients whose last visit was more than 60 days ago
5. Select patients 18 years old or younger

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :

Step 1:
hdfs dfs -mkdir sparksql3
hdfs dfs -put patients.csv sparksql3/
Step 2: Now in spark shell
// SQLContext entry point for working with structured data
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// this is used to implicitly convert an RDD to a DataFrame
import sqlContext.implicits._
// Import Spark SQL data types and Row
import org.apache.spark.sql._
// load the data into a new RDD
val patients = sc.textFile("sparksql3/patients.csv")
// Return the first element in this RDD
patients.first()
//define the schema using a case class
case class Patient(patientid: Integer, name: String, dateOfBirth:String , lastVisitDate:
String)
// create an RDD of Patient objects
val patRDD = patients.map(_.split(",")).map(p => Patient(p(0).toInt, p(1), p(2), p(3)))
patRDD.first()
patRDD.count()
// change the RDD of Patient objects to a DataFrame
val patDF = patRDD.toDF()
// register the DataFrame as a temp table
patDF.registerTempTable("patients")
// Select data from the table
val results = sqlContext.sql("""SELECT * FROM patients""")
// display the DataFrame in a tabular format
results.show()
// Find all the patients whose lastVisitDate is between the current time and '2012-09-15'
val results = sqlContext.sql("""SELECT * FROM patients WHERE TO_DATE(CAST(UNIX_TIMESTAMP(lastVisitDate, 'yyyy-MM-dd') AS TIMESTAMP)) BETWEEN '2012-09-15' AND current_timestamp() ORDER BY lastVisitDate""")
results.show()
// Find all the patients who were born in 2011
val results = sqlContext.sql("""SELECT * FROM patients WHERE YEAR(TO_DATE(CAST(UNIX_TIMESTAMP(dateOfBirth, 'yyyy-MM-dd') AS TIMESTAMP))) = 2011""")
results.show()
// Find all the patients' ages
val results = sqlContext.sql("""SELECT name, dateOfBirth, datediff(current_date(), TO_DATE(CAST(UNIX_TIMESTAMP(dateOfBirth, 'yyyy-MM-dd') AS TIMESTAMP)))/365 AS age FROM patients""")
results.show()
// List patients whose last visit was more than 60 days ago
val results = sqlContext.sql("""SELECT name, lastVisitDate FROM patients WHERE datediff(current_date(), TO_DATE(CAST(UNIX_TIMESTAMP(lastVisitDate, 'yyyy-MM-dd') AS TIMESTAMP))) > 60""")
results.show()
// Select patients 18 years old or younger
// (In MySQL this could be written as DATE_SUB(current_date(), INTERVAL 18 YEAR); Spark SQL's DATE_SUB takes a number of days, hence 18*365 below.)
val results = sqlContext.sql("""SELECT * FROM patients WHERE TO_DATE(CAST(UNIX_TIMESTAMP(dateOfBirth, 'yyyy-MM-dd') AS TIMESTAMP)) > DATE_SUB(current_date(), 18*365)""")
results.show()
val results = sqlContext.sql("""SELECT DATE_SUB(current_date(), 18*365) FROM patients""")
results.show()
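The same filters can also be written with the DataFrame API instead of SQL strings; a minimal sketch, not part of the original solution, reusing the patDF defined above:
import org.apache.spark.sql.functions._
// patients born in 2011
patDF.filter(year(to_date(unix_timestamp($"dateOfBirth", "yyyy-MM-dd").cast("timestamp"))) === 2011).show()
// patients whose last visit was more than 60 days ago
patDF.filter(datediff(current_date(), to_date(unix_timestamp($"lastVisitDate", "yyyy-MM-dd").cast("timestamp"))) > 60).show()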



Problem Scenario 95: You have to run your Spark application on YARN, with each executor's maximum heap size set to 512MB and one processor core allocated per executor. Your main application requires three input arguments: V1 V2 V3.

Please replace XXX, YYY, ZZZ
./bin/spark-submit --class com.hadoopexam.MyTask --master yarn-cluster --num-executors 3 --driver-memory 512m XXX YYY lib/hadoopexam.jar ZZZ

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:

XXX: --executor-memory 512m
YYY: --executor-cores 1
ZZZ: V1 V2 V3
Notes: spark-submit options on YARN:
--archives: Comma-separated list of archives to be extracted into the working directory of each executor. The path must be globally visible inside your cluster; see Advanced Dependency Management.
--executor-cores: Number of processor cores to allocate on each executor. Alternatively, you can use the spark.executor.cores property.
--executor-memory: Maximum heap size to allocate to each executor. Alternatively, you can use the spark.executor.memory property.
--num-executors: Total number of YARN containers to allocate for this application. Alternatively, you can use the spark.executor.instances property.
--queue: YARN queue to submit to. For more information, see Assigning Applications and Queries to Resource Pools. Default: default.
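With the answers substituted, the completed command reads:
./bin/spark-submit --class com.hadoopexam.MyTask --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/hadoopexam.jar V1 V2 V3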





