Free Certified Data Engineer Professional Exam Braindumps (page: 25)


A data architect has heard about Delta Lake’s built-in versioning and time travel capabilities. For auditing purposes, they have a requirement to maintain a full record of all valid street addresses as they appear in the customers table.

The architect is interested in implementing a Type 1 table, overwriting existing records with new values and relying on Delta Lake time travel to support long-term auditing. A data engineer on the project feels that a Type 2 table will provide better performance and scalability.

Which piece of information is critical to this decision?

  A. Data corruption can occur if a query fails in a partially completed state because Type 2 tables require setting multiple fields in a single update.
  B. Shallow clones can be combined with Type 1 tables to accelerate historic queries for long-term versioning.
  C. Delta Lake time travel cannot be used to query previous versions of these tables because Type 1 changes modify data files in place.
  D. Delta Lake time travel does not scale well in cost or latency to provide a long-term versioning solution.
  E. Delta Lake only supports Type 0 tables; once records are inserted to a Delta Lake table, they cannot be modified.

Answer(s): D
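
For context on answer D, a minimal sketch of the trade-off (column names such as valid_from and valid_to are illustrative assumptions, not part of the question):

    # Type 1 + time travel: every audit query must reconstruct an old table
    # version from retained data files, so cost and latency grow with history.
    spark.sql("SELECT * FROM customers VERSION AS OF 1024")

    # Type 2: address history lives in the current table as ordinary rows,
    # so a long-term audit is just a plain query against the latest version.
    spark.sql("""
        SELECT street_address, valid_from, valid_to
        FROM customers
        WHERE customer_id = 42
        ORDER BY valid_from
    """)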



A table named user_ltv is used to create a view for data analysts on various teams. Users in the workspace are organized into groups, which are used to manage data access through ACLs.

The user_ltv table has the following schema:
email STRING, age INT, ltv INT

The following view definition is executed:
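
The original screenshot of the definition is not reproduced here; a plausible reconstruction, consistent with answer A and the common is_member() conditional row filter, is:

    # Hedged reconstruction: auditing members see all rows; everyone else
    # sees only rows where age >= 18.
    spark.sql("""
        CREATE VIEW user_ltv_no_minors AS
        SELECT *
        FROM user_ltv
        WHERE CASE
                WHEN is_member('auditing') THEN TRUE
                ELSE age >= 18
              END
    """)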


An analyst who is not a member of the auditing group executes the following query:
SELECT * FROM user_ltv_no_minors

Which statement describes the results returned by this query?

  A. All columns will be displayed normally for those records that have an age greater than 17; records not meeting this condition will be omitted.
  B. All age values less than 18 will be returned as null values, all other columns will be returned with the values in user_ltv.
  C. All values for the age column will be returned as null values, all other columns will be returned with the values in user_ltv.
  D. All records from all columns will be displayed with the values in user_ltv.
  E. All columns will be displayed normally for those records that have an age greater than 18; records not meeting this condition will be omitted.

Answer(s): A



The data governance team is reviewing code used for deleting records for compliance with GDPR. The following logic has been implemented to propagate delete requests from the user_lookup table to the user_aggregates table.
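
The logic itself is not reproduced here; a plausible sketch, assuming an anti-join delete keyed on user_id (table names taken from the question), is:

    # Remove aggregate rows for any user_id no longer present in user_lookup.
    spark.sql("""
        DELETE FROM user_aggregates
        WHERE user_id NOT IN (SELECT user_id FROM user_lookup)
    """)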

Assuming that user_id is a unique identifying key and that all users that have requested deletion have been removed from the user_lookup table, which statement describes whether successfully executing the above logic guarantees that the records to be deleted from the user_aggregates table are no longer accessible and why?

  A. No; the Delta Lake DELETE command only provides ACID guarantees when combined with the MERGE INTO command.
  B. No; files containing deleted records may still be accessible with time travel until a VACUUM command is used to remove invalidated data files.
  C. Yes; the change data feed uses foreign keys to ensure delete consistency throughout the Lakehouse.
  D. Yes; Delta Lake ACID guarantees provide assurance that the DELETE command succeeded fully and permanently purged these records.
  E. No; the change data feed only tracks inserts and updates, not deleted records.

Answer(s): B
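
Answer B turns on physical file removal: a Delta DELETE only rewrites the table's current version, so the deleted rows remain readable via time travel (for example, SELECT * FROM user_aggregates VERSION AS OF 5) until VACUUM removes the invalidated files. A minimal sketch (the retention value is illustrative; the default threshold is 7 days):

    # Physically remove data files invalidated more than 168 hours ago;
    # after this, time travel to those versions is no longer possible.
    spark.sql("VACUUM user_aggregates RETAIN 168 HOURS")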



The data engineering team has been tasked with configuring connections to an external database that does not have a native Databricks connector. The external database already has data security configured by group membership. These groups map directly to user groups already created in Databricks that represent various teams within the company.

A new login credential has been created for each group in the external database. The Databricks Utilities Secrets module will be used to make these credentials available to Databricks users.

Assuming that all the credentials are configured correctly on the external database and group membership is properly configured on Databricks, which statement describes how teams can be granted the minimum necessary access to these credentials?

  1. "Manage" permissions should be set on a secret key mapped to those credentials that will be used by a given team.
  2. "Read" permissions should be set on a secret key mapped to those credentials that will be used by a given team.
  3. "Read" permissions should be set on a secret scope containing only those credentials that will be used by a given team.
  4. "Manage" permissions should be set on a secret scope containing only those credentials that will be used by a given team.
    No additional configuration is necessary as long as all users are configured as administrators in the workspace where secrets have been added.

Answer(s): C
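
To illustrate answer C, a hedged sketch of how a team would consume its scope after being granted "Read" on it (scope, key, and connection details are illustrative; the ACL itself can be granted with the legacy Databricks CLI, e.g. databricks secrets put-acl --scope team_a_scope --principal team_a --permission READ):

    # With READ on the scope, members of team_a can retrieve its secrets;
    # MANAGE would additionally allow them to change ACLs.
    username = dbutils.secrets.get(scope="team_a_scope", key="username")
    password = dbutils.secrets.get(scope="team_a_scope", key="password")

    # Example: use the credentials for a JDBC read from the external database.
    df = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://<host>:5432/<database>")
          .option("dbtable", "some_schema.some_table")
          .option("user", username)
          .option("password", password)
          .load())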





