Free DP-203 Exam Braindumps (page: 4)


You have an Azure Synapse workspace named MyWorkspace that contains an Apache Spark database named mytestdb. You run the following command in an Azure Synapse Analytics Spark pool in MyWorkspace.

CREATE TABLE mytestdb.myParquetTable(
    EmployeeID int,
    EmployeeName string,
    EmployeeStartDate date)
USING Parquet

You then use Spark to insert a row into mytestdb.myParquetTable. The row contains the following data.

(Exhibit: the inserted row, with EmployeeID 24 and EmployeeName Alice.)

One minute later, you execute the following query from a serverless SQL pool in MyWorkspace.

SELECT EmployeeID
FROM mytestdb.dbo.myParquetTable
WHERE name = 'Alice';

What will be returned by the query?

  1. 24
  2. an error
  3. a null value

Answer(s): A

Explanation:

Once a database has been created by a Spark job, you can create tables in it with Spark that use Parquet as the storage format. Table names will be converted to lower case and need to be queried using the lower case name. These tables will immediately become available for querying by any of the Azure Synapse workspace Spark pools. They can also be used from any of the Spark jobs, subject to permissions. Spark-created managed and external tables are also made available as external tables with the same name in the corresponding synchronized database in the serverless SQL pool, which is why the query above succeeds.

Note: For external tables, since they are synchronized to serverless SQL pool asynchronously, there will be a delay until they appear.
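
For context, a minimal sketch of the round trip; the start date is an assumed value for illustration, and the serverless query uses the lower-cased table name and the actual EmployeeName column:

-- Spark pool (Spark SQL): insert the row from the exhibit
INSERT INTO mytestdb.myParquetTable
VALUES (24, 'Alice', DATE'2020-01-01');  -- date value is a hypothetical assumption

-- Serverless SQL pool (T-SQL): the Spark table is exposed under the dbo schema
SELECT EmployeeID
FROM mytestdb.dbo.myparquettable
WHERE EmployeeName = 'Alice';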


Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/metadata/table



DRAG DROP (Drag and Drop is not supported)
You have a table named SalesFact in an enterprise data warehouse in Azure Synapse Analytics. SalesFact contains sales data from the past 36 months and has the following characteristics:

- Is partitioned by month
- Contains one billion rows
- Has clustered columnstore indexes

At the beginning of each month, you need to remove data from SalesFact that is older than 36 months as quickly as possible.

Which three actions should you perform in sequence in a stored procedure? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Select and Place:

  1. See Explanation section for answer.

Answer(s): A

Explanation:




Step 1: Create an empty table named SalesFact_Work that has the same schema as SalesFact.
Step 2: Switch the partition containing the stale data from SalesFact to SalesFact_Work.
Step 3: Drop the SalesFact_Work table.

SQL Data Warehouse supports partition splitting, merging, and switching. To switch partitions between two tables, you must ensure that the partitions align on their respective boundaries and that the table definitions match.

Loading data into partitions by using partition switching is also a convenient way to stage new data in a table that is not visible to users and then switch the new data in. Because a partition switch is a metadata-only operation, it removes the stale month far faster than a DELETE would.
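
A hedged T-SQL sketch of the three steps, assuming a RANGE RIGHT monthly partition scheme on an integer date key; the column names, distribution, boundary values, and partition number are illustrative assumptions, not part of the question:

-- Step 1: empty table with a matching schema (CTAS with a false predicate)
CREATE TABLE dbo.SalesFact_Work
WITH
(
    DISTRIBUTION = HASH(CustomerKey),   -- must match SalesFact
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION (OrderDateKey RANGE RIGHT FOR VALUES
        (20200101, 20200201 /* ...one boundary per month... */))
)
AS
SELECT * FROM dbo.SalesFact WHERE 1 = 2;

-- Step 2: switch out the partition holding the stale data (metadata-only, fast)
ALTER TABLE dbo.SalesFact SWITCH PARTITION 2 TO dbo.SalesFact_Work PARTITION 2;

-- Step 3: drop the work table, discarding the stale rows
DROP TABLE dbo.SalesFact_Work;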


Reference:

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-partition



You have files and folders in Azure Data Lake Storage Gen2 for an Azure Synapse workspace as shown in the following exhibit.

(Exhibit: File1.csv and File4.csv are directly inside topfolder; File2.csv and File3.csv are in subfolders of topfolder.)

You create an external table named ExtTable that has LOCATION='/topfolder/'.
When you query ExtTable by using an Azure Synapse Analytics serverless SQL pool, which files are returned?

  1. File2.csv and File3.csv only
  2. File1.csv and File4.csv only
  3. File1.csv, File2.csv, File3.csv, and File4.csv
  4. File1.csv only

Answer(s): B
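
Answer B reflects that a serverless SQL pool external table defined on a folder returns only the files directly in that folder; subfolder contents are not traversed. For context, a hedged sketch of the table definition, in which the data source, file format, and column are hypothetical since the exhibit shows no schema:

CREATE EXTERNAL TABLE ExtTable
(
    Col1 VARCHAR(100)            -- hypothetical column
)
WITH
(
    LOCATION = '/topfolder/',
    DATA_SOURCE = MyDataLake,    -- hypothetical EXTERNAL DATA SOURCE over the ADLS Gen2 account
    FILE_FORMAT = CsvFormat      -- hypothetical DELIMITEDTEXT EXTERNAL FILE FORMAT
);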



HOTSPOT (Drag and Drop is not supported)
You are planning the deployment of Azure Data Lake Storage Gen2. You have the following two reports that will access the data lake:

- Report1: Reads three columns from a file that contains 50 columns.
- Report2: Queries a single record based on a timestamp.

You need to recommend in which format to store the data in the data lake to support the reports. The solution must minimize read times.

What should you recommend for each report? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:




Report1: CSV
CSV: The destination writes records as delimited data.

Report2: AVRO
AVRO supports timestamps.
Not Parquet, TSV: these are not offered as data format options by the Azure Data Lake Storage Gen2 destination in the StreamSets documentation referenced below. (Azure Data Lake Storage Gen2 itself can store files of any format.)
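
For context, a minimal serverless SQL sketch of the Report1 access pattern, reading three of the 50 columns from delimited files; the storage URL and the column names and types are hypothetical:

SELECT C1, C2, C3
FROM OPENROWSET(
    BULK 'https://mylake.dfs.core.windows.net/data/report1/*.csv',  -- hypothetical path
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    FIRSTROW = 2
)
WITH (
    C1 VARCHAR(50) 1,   -- map only the needed columns by ordinal position
    C2 VARCHAR(50) 2,
    C3 VARCHAR(50) 3
) AS rows;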


Reference:

https://streamsets.com/documentation/datacollector/latest/help/datacollector/UserGuide/Destinations/ADLS-G2-D.html





