Databricks Certified Data Engineer Associate Exam

Updated On: 19-Jan-2026

A data engineering team has two tables. The first table march_transactions is a collection of all retail transactions in the month of March. The second table april_transactions is a collection of all retail transactions in the month of April. There are no duplicate records between the tables.

Which of the following commands should be run to create a new table all_transactions that contains all records from march_transactions and april_transactions without duplicate records?


  A. CREATE TABLE all_transactions AS
    SELECT * FROM march_transactions
    INNER JOIN SELECT * FROM april_transactions;

  B. CREATE TABLE all_transactions AS
    SELECT * FROM march_transactions
    UNION SELECT * FROM april_transactions;

  C. CREATE TABLE all_transactions AS
    SELECT * FROM march_transactions
    OUTER JOIN SELECT * FROM april_transactions;

  D. CREATE TABLE all_transactions AS
    SELECT * FROM march_transactions
    INTERSECT SELECT * FROM april_transactions;

  E. CREATE TABLE all_transactions AS
    SELECT * FROM march_transactions
    MERGE SELECT * FROM april_transactions;

Answer(s): B
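
The behavior of UNION can be checked with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL, which is an assumption made for portability; the id and amount columns and the sample rows are hypothetical. UNION stacks the rows of both tables and drops duplicates, while INTERSECT keeps only rows present in both tables, which here is none.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Spark SQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows; the tables share no records.
    CREATE TABLE march_transactions (id INTEGER, amount REAL);
    CREATE TABLE april_transactions (id INTEGER, amount REAL);
    INSERT INTO march_transactions VALUES (1, 10.0), (2, 20.0);
    INSERT INTO april_transactions VALUES (3, 30.0), (4, 40.0);

    -- The statement from the correct option.
    CREATE TABLE all_transactions AS
        SELECT * FROM march_transactions
        UNION SELECT * FROM april_transactions;
""")

# UNION keeps every distinct row from both inputs.
union_count = conn.execute(
    "SELECT COUNT(*) FROM all_transactions"
).fetchone()[0]

# INTERSECT keeps only rows that appear in both inputs.
intersect_count = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT * FROM march_transactions
        INTERSECT SELECT * FROM april_transactions
    )
""").fetchone()[0]

print(union_count, intersect_count)
conn.close()
```

The JOIN and MERGE variants in the other options are not valid ways to stack two result sets: a join combines columns on a condition, and MERGE is an upsert statement rather than a set operator.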



A data engineer only wants to execute the final block of a Python program if the Python variable day_of_week is equal to 1 and the Python variable review_period is True.

Which of the following control flow statements should the data engineer use to begin this conditionally executed code block?

  A. if day_of_week = 1 and review_period:
  B. if day_of_week = 1 and review_period = "True":
  C. if day_of_week == 1 and review_period == "True":
  D. if day_of_week == 1 and review_period:
  E. if day_of_week = 1 &review_period: = "True":

Answer(s): D

Explanation:

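In Python, values are compared with the double equals sign (==); a single equals sign (=) is assignment and is a syntax error inside an if condition. Because review_period is already a Boolean, it can be used directly as a condition without comparing it to True. Putting quotes around True would compare against the string "True", which is never equal to the Boolean value True.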
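
A short sketch of the points in the explanation. The helper function runs_final_block is hypothetical and exists only so the condition can be exercised with different inputs.

```python
def runs_final_block(day_of_week, review_period):
    """Return True when the guarded block would execute (hypothetical helper)."""
    # == compares values; the Boolean is used directly as a condition.
    if day_of_week == 1 and review_period:
        return True
    return False


monday_in_review = runs_final_block(1, True)    # both conditions hold
monday_no_review = runs_final_block(1, False)   # review_period is False
tuesday_in_review = runs_final_block(2, True)   # day_of_week is not 1

# Comparing a Boolean with the string "True" is a bool-vs-str comparison.
bool_equals_string = (True == "True")

print(monday_in_review, monday_no_review, tuesday_in_review, bool_equals_string)
```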



A data engineer is attempting to drop a Spark SQL table my_table. The data engineer wants to delete all table metadata and data.

They run the following command:
DROP TABLE IF EXISTS my_table

While the object no longer appears when they run SHOW TABLES, the data files still exist.
Which of the following describes why the data files still exist and the metadata files were deleted?

  A. The table’s data was larger than 10 GB
  B. The table’s data was smaller than 10 GB
  C. The table was external
  D. The table did not have a location
  E. The table was managed

Answer(s): C



A data engineer wants to create a data entity from a couple of tables. The data entity must be used by other data engineers in other sessions. It also must be saved to a physical location.
Which of the following data entities should the data engineer create?

  A. Database
  B. Function
  C. View
  D. Temporary view
  E. Table

Answer(s): E
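
The difference between a table and a temporary view can be illustrated with a rough analogy. The sketch below uses Python's built-in sqlite3 module rather than Spark, which is an assumption, and treats each database connection as a "session"; the orders and customers tables and their rows are hypothetical. A table built from two source tables is written to storage and is readable from a second session, while a temporary view disappears with the session that created it.

```python
import os
import sqlite3
import tempfile

# A file-backed database so that data can outlive a single connection.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Session A: build a table and a temporary view from two source tables.
session_a = sqlite3.connect(db_path)
session_a.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 100);
    INSERT INTO customers VALUES (100, 'Ada');

    CREATE TEMP VIEW order_names_tmp AS
        SELECT o.id, c.name FROM orders o JOIN customers c USING (customer_id);

    CREATE TABLE order_names AS
        SELECT o.id, c.name FROM orders o JOIN customers c USING (customer_id);
""")
session_a.commit()
session_a.close()

# Session B: a separate connection, standing in for another engineer's session.
session_b = sqlite3.connect(db_path)
table_rows = session_b.execute("SELECT id, name FROM order_names").fetchall()

try:
    session_b.execute("SELECT * FROM order_names_tmp")
    temp_view_visible = True
except sqlite3.OperationalError:
    # The temporary view was scoped to session A and no longer exists.
    temp_view_visible = False
session_b.close()

print(table_rows, temp_view_visible)
```

A regular (non-temporary) view would also be visible from other sessions, but it stores only a query definition, not data, so it does not meet the requirement of being saved to a physical location.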





