Databricks Certified Data Engineer Associate Exam Questions

Updated On: 15-Feb-2026

Which of the following code blocks will remove the rows where the value in column age is greater than 25 from the existing Delta table my_table and save the updated table?

  A. SELECT * FROM my_table WHERE age > 25;
  B. UPDATE my_table WHERE age > 25;
  C. DELETE FROM my_table WHERE age > 25;
  D. UPDATE my_table WHERE age <= 25;
  E. DELETE FROM my_table WHERE age <= 25;

Answer(s): C
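
The difference between reading rows with SELECT and removing them with DELETE can be checked with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database as a stand-in for the Delta table, since the DELETE statement itself is standard SQL; the column name "name" and the sample rows are hypothetical and only there for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Delta table my_table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("Ana", 19), ("Ben", 25), ("Cy", 26), ("Di", 40)],  # hypothetical sample rows
)

# SELECT only reads the matching rows; the table itself is unchanged.
selected = conn.execute(
    "SELECT name FROM my_table WHERE age > 25 ORDER BY name"
).fetchall()
count_after_select = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]

# DELETE removes the rows that match the predicate and persists the change.
conn.execute("DELETE FROM my_table WHERE age > 25")
conn.commit()
remaining = conn.execute("SELECT name, age FROM my_table ORDER BY age").fetchall()

print(selected)            # [('Cy',), ('Di',)]
print(count_after_select)  # 4
print(remaining)           # [('Ana', 19), ('Ben', 25)]
```

Note that the predicate names the rows to remove, not the rows to keep: a condition of age <= 25 would delete the opposite set.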



A data engineer has realized that they made a mistake when making a daily update to a table. They need to use Delta time travel to restore the table to a version that is 3 days old. However, when the data engineer attempts to time travel to the older version, they are unable to restore the data because the data files have been deleted.

Which of the following explains why the data files are no longer present?

  A. The VACUUM command was run on the table
  B. The TIME TRAVEL command was run on the table
  C. The DELETE HISTORY command was run on the table
  D. The OPTIMIZE command was run on the table
  E. The HISTORY command was run on the table

Answer(s): A

Explanation:

The data files are most likely missing because the VACUUM command was run on the table. VACUUM permanently deletes data files that the current version of the Delta table no longer references and that are older than the retention threshold (7 days by default). Time travel to an older version needs exactly those files, so once VACUUM has removed them, that version can no longer be queried or restored. Because the target version here is only 3 days old, VACUUM was presumably run with a retention period shorter than the default.
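
The interaction between VACUUM and time travel can be illustrated with a toy model. This is a simplified sketch, not Delta Lake's actual implementation: the names versions, unreferenced_since, vacuum, and can_time_travel are all hypothetical, and it assumes a file's age is counted from the day it stopped being referenced by the latest version.

```python
# Toy model of Delta-style versioning; not Delta Lake's actual implementation.
# Each daily overwrite creates a new version backed by a new data file.
versions = {
    0: {"file_day0"},  # written 3 days ago
    1: {"file_day1"},
    2: {"file_day2"},
    3: {"file_day3"},  # current version, written today
}
current_version = 3
today = 3

# Day on which each superseded file stopped being referenced by the latest version.
unreferenced_since = {"file_day0": 1, "file_day1": 2, "file_day2": 3}


def vacuum(files_on_disk, retention_days):
    """Return the files that survive a vacuum with the given retention."""
    live = versions[current_version]
    return {
        f for f in files_on_disk
        if f in live or today - unreferenced_since[f] < retention_days
    }


def can_time_travel(version, files_on_disk):
    """A version is readable only if every file it needs still exists."""
    return versions[version] <= files_on_disk


all_files = set().union(*versions.values())

after_default = vacuum(all_files, retention_days=7)  # 7-day default retention
after_short = vacuum(all_files, retention_days=0)    # aggressive retention

print(can_time_travel(0, after_default))  # True
print(can_time_travel(0, after_short))    # False
```

In this model the 3-day-old version stays restorable under a 7-day retention and is lost only when the retention is shortened, which matches the scenario in the question.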



Which of the following Git operations must be performed outside of Databricks Repos?

  A. Commit
  B. Pull
  C. Push
  D. Clone
  E. Merge

Answer(s): E

Explanation:

For the following tasks, work in your Git provider:

Create a pull request.
Resolve merge conflicts.
Merge or delete branches.
Rebase a branch.


Reference:

https://docs.databricks.com/repos/index.html



Which of the following data lakehouse features results in improved data quality over a traditional data lake?

  A. A data lakehouse provides storage solutions for structured and unstructured data.
  B. A data lakehouse supports ACID-compliant transactions.
  C. A data lakehouse allows the use of SQL queries to examine data.
  D. A data lakehouse stores data in open formats.
  E. A data lakehouse enables machine learning and artificial intelligence workloads.

Answer(s): B

Explanation:

ACID-compliant transactions keep data consistent and reliable even when a write fails partway through or several writers run concurrently. By supporting ACID transactions, a data lakehouse can provide improved data quality over a traditional data lake, because each update is either completed in its entirety or not applied at all, which reduces the risk of partial writes, data corruption, or errors.
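
The all-or-nothing behavior described above can be seen in a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a lakehouse table, so it illustrates atomicity in general rather than Delta Lake's transaction log; the table layout and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, age INTEGER NOT NULL)")
conn.execute("INSERT INTO my_table VALUES (1, 30)")
conn.commit()

try:
    # The connection as a context manager commits on success and
    # rolls back the whole transaction if an exception escapes.
    with conn:
        conn.execute("INSERT INTO my_table VALUES (2, 41)")    # valid row
        conn.execute("INSERT INTO my_table VALUES (3, NULL)")  # violates NOT NULL
except sqlite3.IntegrityError:
    pass  # the failed batch leaves no partial write behind

rows = conn.execute("SELECT id, age FROM my_table ORDER BY id").fetchall()
print(rows)  # [(1, 30)]
```

The valid row with id 2 is discarded together with the invalid one, so readers never see a half-applied update.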



A data engineer needs to determine whether to use the built-in Databricks Notebooks versioning or version their project using Databricks Repos.

Which of the following is an advantage of using Databricks Repos over the Databricks Notebooks versioning?

  A. Databricks Repos automatically saves development progress
  B. Databricks Repos supports the use of multiple branches
  C. Databricks Repos allows users to revert to previous versions of a notebook
  D. Databricks Repos provides the ability to comment on specific changes
  E. Databricks Repos is wholly housed within the Databricks Lakehouse Platform

Answer(s): B





