Free DP-203 Exam Braindumps (page: 54)


You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 receives new data once every 24 hours.

You have the following function.


You have the following query.


The query is executed once every 15 minutes and the @parameter value is set to the current date.
You need to minimize the time it takes for the query to return results.

Which two actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

  1. Create an index on the avg_f column.
  2. Convert the avg_c column into a calculated column.
  3. Create an index on the sensorid column.
  4. Enable result set caching.
  5. Change the table distribution to replicate.

Answer(s): B, D

Explanation:

B: Converting avg_c into a calculated column moves the conversion to load time. Because Pool1 receives new data only once every 24 hours while the query runs every 15 minutes, precomputing the value avoids re-evaluating the function on every execution.

D: When result set caching is enabled, a dedicated SQL pool automatically caches query results in the user database for repetitive use. Subsequent executions of the same query get their results directly from the persisted cache, so recomputation is not needed. Result set caching improves query performance and reduces compute resource usage. In addition, queries that use cached result sets do not consume any concurrency slots and therefore do not count against existing concurrency limits.
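Result set caching is turned on at the database level and can also be toggled per session. A minimal T-SQL sketch, assuming the pool is named Pool1 as in the scenario:

SQL
-- Run in the master database to enable result set caching for the pool
ALTER DATABASE [Pool1] SET RESULT_SET_CACHING ON;

-- Inside the user database, caching can be toggled for the current session
SET RESULT_SET_CACHING OFF;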

Incorrect:
Not A, not C: The query contains no joins, so adding indexes would not help.
Not E: What is a replicated table?
A replicated table has a full copy of the table accessible on each Compute node. Replicating a table removes the need to transfer data among Compute nodes before a join or aggregation. Since the table has multiple copies, replicated tables work best when the table size is less than 2 GB compressed. 2 GB is not a hard limit. If the data is static and does not change, you can replicate larger tables.
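For comparison, replicate distribution is chosen when the table is created. A minimal T-SQL sketch, with hypothetical table and column names:

SQL
-- Hypothetical small dimension table; a full copy is cached on every Compute node
CREATE TABLE dbo.DimSensor
(
    SensorId   INT NOT NULL,
    SensorName NVARCHAR(50)
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED COLUMNSTORE INDEX
);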


Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/performance-tuning-result-set-caching



You need to design a solution that will process streaming data from an Azure Event Hub and output the data to Azure Data Lake Storage. The solution must ensure that analysts can interactively query the streaming data.

What should you use?

  1. Azure Stream Analytics and Azure Synapse notebooks
  2. Structured Streaming in Azure Databricks
  3. event triggers in Azure Data Factory
  4. Azure Queue storage and read-access geo-redundant storage (RA-GRS)

Answer(s): B

Explanation:

Apache Spark Structured Streaming is a fast, scalable, and fault-tolerant stream processing API. You can use it to perform analytics on your streaming data in near real-time.
With Structured Streaming, you can use SQL queries to process streaming data in the same way that you would process static data.

Azure Event Hubs is a scalable real-time data ingestion service that can ingest millions of events per second. It can receive large amounts of data from multiple sources and stream the prepared data to Azure Data Lake Storage or Azure Blob storage.

Azure Event Hubs can be integrated with Spark Structured Streaming to process messages in near real-time. You can query and analyze the processed data as it arrives by using a Structured Streaming query and Spark SQL.
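As an illustration, a Structured Streaming job in an Azure Databricks notebook can read from Event Hubs and write to Data Lake Storage roughly as follows. This is a sketch that assumes the azure-eventhubs-spark connector is attached to the cluster; the connection string and storage paths are placeholders:

Python
# Sketch: stream events from Azure Event Hubs into Azure Data Lake Storage Gen2.
# Assumes a Databricks notebook (spark and sc are predefined) with the
# azure-eventhubs-spark connector installed; replace placeholders with real values.
conn_str = "Endpoint=sb://<namespace>.servicebus.windows.net/;..."  # placeholder
eh_conf = {
    "eventhubs.connectionString":
        sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(conn_str)
}

stream_df = spark.readStream.format("eventhubs").options(**eh_conf).load()

# Persist the event body as Delta so analysts can query it interactively with Spark SQL
query = (stream_df
    .selectExpr("CAST(body AS STRING) AS body", "enqueuedTime")
    .writeStream
    .format("delta")
    .option("checkpointLocation", "abfss://<container>@<account>.dfs.core.windows.net/checkpoints/events")
    .start("abfss://<container>@<account>.dfs.core.windows.net/data/events"))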


Reference:

https://k21academy.com/microsoft-azure/data-engineer/structured-streaming-with-azure-event-hubs/



You are creating an Apache Spark job in Azure Databricks that will ingest JSON-formatted data.
You need to convert a nested JSON string into a DataFrame that will contain multiple rows.
Which Spark SQL function should you use?

  1. explode
  2. filter
  3. coalesce
  4. extract

Answer(s): A

Explanation:

Convert nested JSON to a flattened DataFrame
You can flatten nested JSON by using only the $"column.*" selector and the explode method.

Note: Extract and flatten
Use $"column.*" and explode methods to flatten the struct and array types before displaying the flattened DataFrame.

Scala
display(DF.select($"id" as "main_id",$"name",$"batters",$"ppu",explode($"topping")) // Exploding the topping column using explode as it is an array type
.withColumn("topping_id",$"col.id") // Extracting topping_id from col using DOT form
.withColumn("topping_type",$"col.type") // Extracting topping_tytpe from col using DOT form
.drop($"col")
.select($"*",$"batters.*") // Flattened the struct type batters tto array type which is batter
.drop($"batters")
.select($"*",explode($"batter"))
.drop($"batter")
.withColumn("batter_id",$"col.id") // Extracting batter_id from col using DOT form
.withColumn("battter_type",$"col.type") // Extracting battter_type from col using DOT form
.drop($"col")
)
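The same flattening can be done from PySpark. A minimal sketch (the sample JSON string and column names are hypothetical) showing how explode turns one nested JSON string into multiple rows:

Python
# Parse a nested JSON string and use explode to emit one row per array element.
from pyspark.sql.functions import col, explode

json_strings = ['{"id": 1, "toppings": [{"type": "glazed"}, {"type": "sugar"}]}']
df = spark.read.json(spark.sparkContext.parallelize(json_strings))

# One input row becomes two output rows, one per element of the toppings array
flattened = df.select(col("id"), explode(col("toppings")).alias("topping"))
flattened.select("id", col("topping.type").alias("topping_type")).show()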


Reference:

https://learn.microsoft.com/en-us/azure/databricks/kb/scala/flatten-nested-columns-dynamically



DRAG DROP
You have an Azure subscription that contains an Azure Databricks workspace. The workspace contains a notebook named Notebook1.

In Notebook1, you create an Apache Spark DataFrame named df_sales that contains the following columns:

-Customer
-SalesPerson
-Region
-Amount

You need to identify the three top performing salespersons by amount for a region named HQ.

How should you complete the query? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: groupby(col(‘SalesPerson’))
Group by the SalesPerson.

Similar to the SQL GROUP BY clause, the PySpark groupBy() function collects identical data into groups on a DataFrame so that aggregate functions such as count, sum, avg, min, and max can be applied to each group.

Box 2: orderBy(desc(‘TotalAmount’))
Order the aggregated amounts in descending order so that the three top performers appear first and can be selected.

You can use either the sort() or orderBy() function of a PySpark DataFrame to sort it in ascending or descending order based on one or more columns; you can also sort by using the PySpark SQL sorting functions.

Sort by Descending (DESC)
To sort a DataFrame in descending order, use the desc method of the Column class. The following examples apply desc to the state column.

df.sort(df.department.asc(),df.state.desc()).show(truncate=False)
df.sort(col("department").asc(),col("state").desc()).show(truncate=False)
df.orderBy(col("department").asc(),col("state").desc()).show(truncate=False)


Reference:

https://sparkbyexamples.com/pyspark/pyspark-groupby-explained-with-example/
https://sparkbyexamples.com/pyspark/pyspark-orderby-and-sort-explained/


