Free DP-203 Exam Braindumps (page: 11)


You are designing a financial transactions table in an Azure Synapse Analytics dedicated SQL pool. The table will have a clustered columnstore index and will include the following columns:
-TransactionType: 40 million rows per transaction type
-CustomerSegment: 4 million rows per customer segment
-TransactionMonth: 65 million rows per month
-AccountType: 500 million rows per account type

You have the following query requirements:
-Analysts will most commonly analyze transactions for a given month.
-Transaction analysis will typically summarize transactions by transaction type, customer segment, and/or account type.

You need to recommend a partition strategy for the table to minimize query times. On which column should you recommend partitioning the table?

  A. CustomerSegment
  B. AccountType
  C. TransactionType
  D. TransactionMonth

Answer(s): D

Explanation:

For optimal compression and performance of clustered columnstore tables, a minimum of 1 million rows per distribution and partition is needed. Before any partitions are created, a dedicated SQL pool already divides each table into 60 distributions.

Example: Any partitioning added to a table is in addition to the distributions created behind the scenes. Using this example, if the sales fact table contained 36 monthly partitions, and given that a dedicated SQL pool has 60 distributions, then the sales fact table should contain 60 million rows per month, or 2.1 billion rows when all months are populated. If a table contains fewer than the recommended minimum number of rows per partition, consider using fewer partitions in order to increase the number of rows per partition.
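A minimal T-SQL sketch of such a table follows (column names are from the question; the data types, the TransactionId distribution column, and the boundary dates are assumptions for illustration). Partitioning on TransactionMonth aligns partitions with the most common query filter, and each monthly partition of 65 million rows yields roughly 1.08 million rows per distribution, just above the 1-million-row guideline:

```sql
-- Sketch only: types, distribution column, and boundary values are assumed.
CREATE TABLE dbo.FactTransactions
(
    TransactionId    BIGINT        NOT NULL,
    TransactionType  INT           NOT NULL,
    CustomerSegment  INT           NOT NULL,
    TransactionMonth DATE          NOT NULL,
    AccountType      INT           NOT NULL,
    Amount           DECIMAL(19,4) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(TransactionId),
    CLUSTERED COLUMNSTORE INDEX,
    -- One partition per month; queries filtering on a month touch one partition.
    PARTITION ( TransactionMonth RANGE RIGHT FOR VALUES
        ('2023-01-01', '2023-02-01', '2023-03-01') )
);
```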



HOTSPOT (Drag and Drop is not supported)
You have an Azure Data Lake Storage Gen2 account named account1 that stores logs as shown in the following table.


You do not expect that the logs will be accessed during the retention periods.
You need to recommend a solution for account1 that meets the following requirements:
-Automatically deletes the logs at the end of each retention period
-Minimizes storage costs

What should you include in the recommendation? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:


Box 1: Store the infrastructure logs in the Cool access tier and the application logs in the Archive access tier.
For infrastructure logs: the Cool tier is an online tier optimized for storing data that is infrequently accessed or modified. Data in the cool tier should be stored for a minimum of 30 days. The cool tier has lower storage costs and higher access costs compared to the hot tier.
For application logs: the Archive tier is an offline tier optimized for storing data that is rarely accessed and that has flexible latency requirements, on the order of hours. Data in the archive tier should be stored for a minimum of 180 days.

Box 2: Azure Blob storage lifecycle management rules
Blob storage lifecycle management offers a rule-based policy that you can use to transition your data to the desired access tier when your specified conditions are met. You can also use lifecycle management to expire data at the end of its life.
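As a sketch, a lifecycle management policy that deletes blobs at the end of a retention period might look like the following (the container prefix and the 60-day retention value are illustrative assumptions, not values from the question):

```json
{
  "rules": [
    {
      "name": "delete-infrastructure-logs",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "infrastructure/" ]
        },
        "actions": {
          "baseBlob": {
            "delete": { "daysAfterModificationGreaterThan": 60 }
          }
        }
      }
    }
  ]
}
```

A separate rule with its own prefix and retention value would cover the application logs.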


Reference:

https://docs.microsoft.com/en-us/azure/storage/blobs/access-tiers-overview



You plan to ingest streaming social media data by using Azure Stream Analytics. The data will be stored in files in Azure Data Lake Storage, and then consumed by using Azure Databricks and PolyBase in Azure Synapse Analytics.
You need to recommend a Stream Analytics data output format to ensure that the queries from Databricks and PolyBase against the files encounter the fewest possible errors. The solution must ensure that the files can be queried quickly and that the data type information is retained.
What should you recommend?

  A. JSON
  B. Parquet
  C. CSV
  D. Avro

Answer(s): B

Explanation:

Parquet is the only listed format that both Azure Databricks and PolyBase support natively. As a columnar format, it also preserves data type information in the file metadata and can be queried quickly.
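For example, PolyBase reads Parquet files through an external file format definition such as the following (the format name and compression codec are illustrative choices):

```sql
-- Sketch: a PolyBase external file format for Parquet.
-- The name and compression codec are assumptions for illustration.
CREATE EXTERNAL FILE FORMAT ParquetFileFormat
WITH (
    FORMAT_TYPE = PARQUET,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);
```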


Reference:

https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-file-format-transact-sql



You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a partitioned fact table named dbo.Sales and a staging table named stg.Sales that has the matching table and partition definitions.
You need to overwrite the content of the first partition in dbo.Sales with the content of the same partition in stg.Sales. The solution must minimize load times.
What should you do?

  A. Insert the data from stg.Sales into dbo.Sales.
  B. Switch the first partition from dbo.Sales to stg.Sales.
  C. Switch the first partition from stg.Sales to dbo.Sales.
  D. Update dbo.Sales from stg.Sales.

Answer(s): C
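Partition switching is a metadata-only operation, so it minimizes load time; switching from stg.Sales into dbo.Sales replaces the target partition with the staged data. A sketch of the statement, assuming the first partition is partition number 1 (the TRUNCATE_TARGET option, supported in dedicated SQL pools, lets the switch overwrite a non-empty target partition):

```sql
-- Sketch: overwrite partition 1 of dbo.Sales with partition 1 of stg.Sales.
-- Metadata-only, so it is far faster than INSERT or UPDATE.
ALTER TABLE stg.Sales SWITCH PARTITION 1 TO dbo.Sales PARTITION 1
    WITH (TRUNCATE_TARGET = ON);
```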





