Databricks Databricks-Certified-Professional-Data-Engineer Exam Questions
Certified Data Engineer Professional (Page 8)

Updated On: 21-Feb-2026

The cluster configurations below are identical in that each cluster has 400 GB of total RAM, 160 total cores, and one Executor per VM.
Given a job with at least one wide transformation, which of the following cluster configurations will result in maximum performance?


  1. • Total VMs: 1
    • 400 GB per Executor
    • 160 Cores / Executor

  2. • Total VMs: 8
    • 50 GB per Executor
    • 20 Cores / Executor

  3. • Total VMs: 16
    • 25 GB per Executor
    • 10 Cores / Executor

  4. • Total VMs: 4
    • 100 GB per Executor
    • 40 Cores / Executor

  5. • Total VMs: 2
    • 200 GB per Executor
    • 80 Cores / Executor

Answer(s): A
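Why a single large executor wins here can be sketched with a toy calculation (plain Python, not part of the exam material): a wide transformation forces a shuffle, and with E identically sized executors on separate VMs, roughly (E-1)/E of a uniformly shuffled dataset must cross the network. One executor keeps the entire shuffle in-process.

```python
def remote_shuffle_fraction(num_executors: int) -> float:
    """Fraction of uniformly shuffled data that must travel between
    executors (i.e., over the network when each executor is its own VM)."""
    if num_executors < 1:
        raise ValueError("need at least one executor")
    return (num_executors - 1) / num_executors

# One 400 GB / 160-core executor: the whole shuffle stays local.
print(remote_shuffle_fraction(1))   # 0.0
# Sixteen 25 GB / 10-core executors: ~94% of shuffle data crosses the network.
print(remote_shuffle_fraction(16))  # 0.9375
```

This is only a first-order model (it ignores disk, GC, and parallelism effects), but it captures why shuffle-heavy jobs favor fewer, larger executors when total resources are fixed.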



A junior data engineer on your team has implemented the following code block.


The view new_events contains a batch of records with the same schema as the events Delta table. The event_id field serves as a unique key for this table.
When this query is executed, what will happen with new records that have the same event_id as an existing record?

  1. They are merged.
  2. They are ignored.
  3. They are updated.
  4. They are inserted.
  5. They are deleted.

Answer(s): B
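The merge code itself is not reproduced above, but answer B corresponds to an insert-only merge: a MERGE on event_id with a WHEN NOT MATCHED THEN INSERT clause and no WHEN MATCHED clause. A minimal pure-Python model of that semantics (hypothetical, no Spark required):

```python
def insert_only_merge(target: dict, batch: list) -> dict:
    """Model of a Delta MERGE that has only a when-not-matched-insert
    clause: new event_ids are inserted; matching event_ids are ignored."""
    for row in batch:
        key = row["event_id"]
        if key not in target:          # no WHEN MATCHED clause -> ignore
            target[key] = row
    return target

events = {1: {"event_id": 1, "value": "old"}}
new_events = [{"event_id": 1, "value": "new"},    # same key: ignored
              {"event_id": 2, "value": "fresh"}]  # new key: inserted
insert_only_merge(events, new_events)
print(events[1]["value"])  # "old" -- the existing record is untouched
```

The key observation: without a WHEN MATCHED clause, records whose event_id already exists in the target are simply skipped, which is why they are "ignored" rather than updated.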



A junior data engineer seeks to leverage Delta Lake's Change Data Feed functionality to create a Type 1 table representing all of the values that have ever been valid for all rows in a bronze table created with the property delta.enableChangeDataFeed = true. They plan to execute the following code as a daily job:


Which statement describes the execution and results of running the above query multiple times?

  1. Each time the job is executed, newly updated records will be merged into the target table, overwriting previous values with the same primary keys.
  2. Each time the job is executed, the entire available history of inserted or updated records will be appended to the target table, resulting in many duplicate entries.
  3. Each time the job is executed, the target table will be overwritten using the entire history of inserted or updated records, giving the desired result.
  4. Each time the job is executed, the differences between the original and current versions are calculated; this may result in duplicate entries for some records.
  5. Each time the job is executed, only those records that have been inserted or updated since the last execution will be appended to the target table, giving the desired result.

Answer(s): B
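The query is not shown above, but answer B implies it reads the change feed from the beginning (e.g. a batch read with startingVersion 0) and writes in append mode. A small pure-Python model of why that duplicates records on every run (assumed reading pattern, not the actual exam code):

```python
# Full change-data-feed history of the bronze table (grows over time).
cdf_history = [("k1", "v1"), ("k2", "v2")]

target = []  # target table written with append mode

def daily_job():
    """Each run reads the ENTIRE available CDF history and appends it,
    with no watermark, merge, or starting-version bookkeeping."""
    target.extend(cdf_history)

daily_job()
daily_job()          # second run re-appends everything read by the first
print(len(target))   # 4 rows from only 2 distinct changes
```

Reading incrementally (a streaming read with a checkpoint) or merging on the key would avoid the duplicates; a full-history batch read plus append cannot.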



A new data engineer notices that a critical field was omitted from an application that writes its Kafka source to Delta Lake. The field was present in the Kafka source, yet it is missing from the data written to dependent, long-term storage. The retention threshold on the Kafka service is seven days, and the pipeline has been in production for three months.
Which describes how Delta Lake can help to avoid data loss of this nature in the future?

  1. The Delta log and Structured Streaming checkpoints record the full history of the Kafka producer.
  2. Delta Lake schema evolution can retroactively calculate the correct value for newly added fields, as long as the data was in the original source.
  3. Delta Lake automatically checks that all fields present in the source data are included in the ingestion layer.
  4. Data can never be permanently dropped or deleted from Delta Lake, so data loss is not possible under any circumstance.
  5. Ingesting all raw data and metadata from Kafka to a bronze Delta table creates a permanent, replayable history of the data state.

Answer(s): E



A nightly job ingests data into a Delta Lake table using the following code:


The next step in the pipeline requires a function that returns an object that can be used to manipulate new records that have not yet been processed to the next table in the pipeline.
Which code snippet completes this function definition?
def new_records():

  1. return spark.readStream.table("bronze")
  2. return spark.readStream.load("bronze")

  3. return spark.read.option("readChangeFeed", "true").table("bronze")

Answer(s): E
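Note that the option list above appears truncated (the answer key references an option not shown). The mechanism the question tests is that a streaming read of a Delta table, combined with a checkpoint, delivers only records not yet processed. A rough pure-Python analogue of that checkpointed, incremental behavior (hypothetical names, no Spark):

```python
class IncrementalReader:
    """Toy analogue of a Structured Streaming source: the persisted
    offset ensures each poll returns only records not yet processed."""
    def __init__(self, table: list):
        self.table = table
        self.offset = 0  # what a streaming checkpoint would persist

    def new_records(self) -> list:
        batch = self.table[self.offset:]
        self.offset = len(self.table)
        return batch

bronze = ["r1", "r2"]
reader = IncrementalReader(bronze)
print(reader.new_records())  # ['r1', 'r2']
bronze.append("r3")
print(reader.new_records())  # ['r3'] -- only the unprocessed record
```

A plain batch read (spark.read) has no such offset and would return the full table on every call, which is why the streaming read is what satisfies "new records that have not yet been processed."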





