Free DP-203 Exam Braindumps (page: 35)

HOTSPOT (Drag and Drop is not supported)
You have an Apache Spark DataFrame named temperatures. A sample of the data is shown in the following table.

You need to produce the following table by using a Spark SQL query.


How should you complete the query? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.
Select and Place:

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: PIVOT
PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output. PIVOT also runs aggregations where they are required on any remaining column values that are wanted in the final output.

Incorrect Answers:
UNPIVOT carries out the opposite operation to PIVOT by rotating columns of a table-valued expression into column values.
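
To make the rotation concrete, here is a minimal sketch in plain Python of what a PIVOT with an average aggregation does. The sample rows, the column names (year, month, temperature), and the helper pivot_avg are all assumptions made for illustration, because the tables from the question are not reproduced here. In Spark SQL the corresponding clause would have roughly the shape PIVOT (avg(temp) FOR month IN ('JAN', 'FEB', 'MAR')), again with assumed column names.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (year, month, temperature).
# The real table from the question is not shown in the source.
rows = [
    (2019, "JAN", 2.0),
    (2019, "JAN", 4.0),
    (2019, "FEB", 5.0),
    (2020, "JAN", 3.0),
    (2020, "FEB", 6.0),
    (2020, "FEB", 8.0),
]


def pivot_avg(rows, pivot_values):
    """Turn each value in pivot_values into a column holding the mean per year."""
    buckets = defaultdict(lambda: defaultdict(list))
    for year, month, temp in rows:
        buckets[year][month].append(temp)

    result = []
    for year in sorted(buckets):
        record = {"year": year}
        for month in pivot_values:
            temps = buckets[year].get(month)
            # A pivot value with no matching rows yields NULL in SQL (None here).
            record[month] = mean(temps) if temps else None
        result.append(record)
    return result


pivoted = pivot_avg(rows, ["JAN", "FEB", "MAR"])
for record in pivoted:
    print(record)
```

Each distinct month becomes its own column, and the temperatures that fall into the same year and month are collapsed by the aggregation, which is the behavior the explanation above describes.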

Box 2: CAST
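To convert a value such as an integer to a DECIMAL data type, use the CAST() function. The example below comes from a SQL Server reference, but Spark SQL accepts the same CAST(expr AS DECIMAL(p, s)) syntax.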

Example:
SELECT CAST(12 AS DECIMAL(7,2)) AS decimal_value;

Here is the result:
decimal_value
12.00


Reference:

https://learnsql.com/cookbook/how-to-convert-an-integer-to-a-decimal-in-sql-server/
https://docs.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot



You have an Azure Data Factory that contains 10 pipelines.

You need to label each pipeline with its main purpose of either ingest, transform, or load. The labels must be available for grouping and filtering when using the monitoring experience in Data Factory.

What should you add to each pipeline?

  1. a resource tag
  2. a correlation ID
  3. a run group ID
  4. an annotation

Answer(s): D

Explanation:

Annotations are additional, informative tags that you can add to specific factory resources: pipelines, datasets, linked services, and triggers. By adding annotations, you can easily filter and search for specific factory resources.
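
As a rough illustration of why annotations suit this requirement, the sketch below groups pipelines by annotation in plain Python. It assumes the pipeline definitions have already been loaded as dictionaries shaped like Data Factory pipeline JSON, where annotations are stored as a list under "properties". The pipeline names and the helper group_by_annotation are hypothetical, and the monitoring experience does this grouping for you; the code only mirrors the idea.

```python
from collections import defaultdict

# Hypothetical pipelines, reduced to the fields that matter for this example.
pipelines = [
    {"name": "CopySalesFiles", "properties": {"annotations": ["ingest"]}},
    {"name": "CleanSalesData", "properties": {"annotations": ["transform"]}},
    {"name": "LoadWarehouse", "properties": {"annotations": ["load"]}},
    {"name": "CopyWebLogs", "properties": {"annotations": ["ingest"]}},
]


def group_by_annotation(pipelines):
    """Map each annotation to the names of the pipelines that carry it."""
    groups = defaultdict(list)
    for pipeline in pipelines:
        for annotation in pipeline["properties"].get("annotations", []):
            groups[annotation].append(pipeline["name"])
    return dict(groups)


groups = group_by_annotation(pipelines)
print(groups["ingest"])
```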


Reference:

https://www.cathrinewilhelmsen.net/annotations-user-properties-azure-data-factory/



HOTSPOT (Drag and Drop is not supported)
The following code segment is used to create an Azure Databricks cluster.


For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: Yes
A cluster mode of ‘High Concurrency’ is selected, unlike all the others which are ‘Standard’. This results in a worker type of Standard_DS13_v2.

Box 2: No
When you run a job on a new cluster, the job is treated as a data engineering (job) workload subject to the job workload pricing. When you run a job on an existing cluster, the job is treated as a data analytics (all-purpose) workload subject to all-purpose workload pricing.
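
The distinction can be sketched as a small Python check over job settings. This assumes settings shaped like a Databricks Jobs API request body, where a job specifies either new_cluster or existing_cluster_id. The helper workload_type, the returned labels, and all sample values (runtime version, node type, cluster ID) are hypothetical and only illustrate the rule stated above.

```python
def workload_type(job_settings):
    """Classify a job by the kind of cluster it runs on."""
    if "new_cluster" in job_settings:
        # A cluster created for the run: job (data engineering) pricing.
        return "job"
    if "existing_cluster_id" in job_settings:
        # A cluster that already exists: all-purpose (data analytics) pricing.
        return "all-purpose"
    raise ValueError("job settings must name a new or an existing cluster")


# Hypothetical job that creates its own cluster for each run.
job_on_new_cluster = {
    "name": "nightly-load",
    "new_cluster": {
        "spark_version": "7.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 2,
    },
}

# Hypothetical job that reuses a running interactive cluster.
job_on_existing_cluster = {
    "name": "adhoc-report",
    "existing_cluster_id": "0123-456789-abcde123",
}

print(workload_type(job_on_new_cluster))
print(workload_type(job_on_existing_cluster))
```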

Box 3: Yes
Delta Lake on Databricks allows you to configure Delta Lake based on your workload patterns.


Reference:

https://adatis.co.uk/databricks-cluster-sizing/
https://docs.microsoft.com/en-us/azure/databricks/jobs
https://docs.databricks.com/administration-guide/capacity-planning/cmbp.html
https://docs.databricks.com/delta/index.html



You are designing a statistical analysis solution that will use custom proprietary Python functions on near real-time data from Azure Event Hubs. You need to recommend which Azure service to use to perform the statistical analysis. The solution must minimize latency.
What should you recommend?

  1. Azure Synapse Analytics
  2. Azure Databricks
  3. Azure Stream Analytics
  4. Azure SQL Database

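Answer(s): B

Explanation:

Azure Databricks can run custom Python code against streaming data read from Azure Event Hubs by using Spark Structured Streaming. Azure Stream Analytics does not run arbitrary Python functions natively, and Azure SQL Database is not a stream processing service.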





