Databricks Certified Data Engineer Professional Exam Questions
Certified Data Engineer Professional (Page 7)

Updated On: 23-Apr-2026

The DevOps team has configured a production workload as a collection of notebooks scheduled to run daily using the Jobs UI. A new data engineering hire is onboarding to the team and has requested access to one of these notebooks to review the production logic.

What are the maximum notebook permissions that can be granted to the user without allowing accidental changes to production code or data?

  A. Can Manage
  B. Can Edit
  C. No permissions
  D. Can Read
  E. Can Run

Answer(s): D



A table named user_ltv is being used to create a view that will be used by data analysts on various teams. Users in the workspace are configured into groups, which are used for setting up data access using ACLs.

The user_ltv table has the following schema:

email STRING, age INT, ltv INT

The following view definition is executed:
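The original screenshot of the view definition is not reproduced here. A definition consistent with the listed answer, using Databricks' is_member() function to check group membership (this reconstruction is an assumption, not the original code), would be:

```sql
CREATE VIEW email_ltv AS
SELECT
  CASE
    WHEN is_member('marketing') THEN email
    ELSE 'REDACTED'
  END AS email,
  ltv
FROM user_ltv;
```

For users outside the marketing group, the CASE expression replaces every email value with the literal string "REDACTED", while the ltv column is returned unchanged.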



An analyst who is not a member of the marketing group executes the following query:

SELECT * FROM email_ltv

Which statement describes the results returned by this query?

  A. Three columns will be returned, but one column will be named "REDACTED" and contain only null values.
  B. Only the email and ltv columns will be returned; the email column will contain all null values.
  C. The email and ltv columns will be returned with the values in user_ltv.
  D. The email, age, and ltv columns will be returned with the values in user_ltv.
  E. Only the email and ltv columns will be returned; the email column will contain the string "REDACTED" in each row.

Answer(s): E



The data governance team has instituted a requirement that all tables containing Personally Identifiable Information (PII) must be clearly annotated. This includes adding column comments, table comments, and setting the custom table property "contains_pii" = true.

The following SQL DDL statement is executed to create a new table:



Which command allows manual confirmation that these three requirements have been met?

  A. DESCRIBE EXTENDED dev.pii_test
  B. DESCRIBE DETAIL dev.pii_test
  C. SHOW TBLPROPERTIES dev.pii_test
  D. DESCRIBE HISTORY dev.pii_test
  E. SHOW TABLES dev

Answer(s): A
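The DDL screenshot is not reproduced here, but the governance pattern it implies can be sketched as follows (the column name and comment text are illustrative assumptions). DESCRIBE EXTENDED is the one command that surfaces all three annotations: column comments in the schema listing, plus the table comment and table properties in the detailed metadata section.

```sql
-- Illustrative DDL meeting the three governance requirements:
CREATE TABLE dev.pii_test (
  name STRING COMMENT 'PII: customer name'
)
COMMENT 'Contains PII'
TBLPROPERTIES ('contains_pii' = true);

-- Shows column comments, the table comment, and table properties:
DESCRIBE EXTENDED dev.pii_test;
```

By contrast, SHOW TBLPROPERTIES surfaces only the properties, and DESCRIBE DETAIL omits column comments, so neither confirms all three requirements on its own.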



The data governance team is reviewing code used for deleting records for compliance with GDPR. They note the following logic is used to delete records from the Delta Lake table named users.



Assuming that user_id is a unique identifying key and that delete_requests contains all users who have requested deletion, which statement describes whether successfully executing the above logic guarantees that the records to be deleted are no longer accessible, and why?

  A. Yes; Delta Lake ACID guarantees provide assurance that the DELETE command succeeded fully and permanently purged these records.
  B. No; the Delta cache may return records from previous versions of the table until the cluster is restarted.
  C. Yes; the Delta cache immediately updates to reflect the latest data files recorded to disk.
  D. No; the Delta Lake DELETE command only provides ACID guarantees when combined with the MERGE INTO command.
  E. No; files containing deleted records may still be accessible with time travel until a VACUUM command is used to remove invalidated data files.

Answer(s): E
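The deletion logic referenced above is not shown; a typical form (a reconstruction, not the original code), followed by the VACUUM step needed to physically remove the invalidated files, would be:

```sql
DELETE FROM users
WHERE user_id IN (SELECT user_id FROM delete_requests);

-- The DELETE only writes a new table version; the old data files
-- remain on disk and are reachable via time travel until purged:
VACUUM users RETAIN 168 HOURS;  -- retention period is illustrative
```

This is why answer E holds: a Delta Lake DELETE is a logical operation against the transaction log, and GDPR-style physical removal requires a subsequent VACUUM once the retention window allows it.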



An external object storage container has been mounted to the location /mnt/finance_eda_bucket.

The following logic was executed to create a database for the finance team:



After the database was successfully created and permissions configured, a member of the finance team runs the following code:



If all users on the finance team are members of the finance group, which statement describes how the tx_sales table will be created?

  A. A logical table will persist the query plan to the Hive Metastore in the Databricks control plane.
  B. An external table will be created in the storage container mounted to /mnt/finance_eda_bucket.
  C. A logical table will persist the physical plan to the Hive Metastore in the Databricks control plane.
  D. A managed table will be created in the storage container mounted to /mnt/finance_eda_bucket.
  E. A managed table will be created in the DBFS root storage container.

Answer(s): D
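The referenced code screenshots are not shown; the pattern implied by the answer is a database created with an explicit LOCATION under the mount, so that tables created without their own LOCATION clause are managed tables stored in the mounted container rather than the DBFS root. A sketch (database, table, and source names are illustrative assumptions):

```sql
CREATE DATABASE finance_eda_db
LOCATION '/mnt/finance_eda_bucket/finance_eda_db.db';

USE finance_eda_db;

-- No LOCATION clause: this is a managed table, but its files land
-- under the database's location in the mounted storage container.
CREATE TABLE tx_sales AS
SELECT * FROM tx_raw;  -- source table name is illustrative
```

Managed versus external is determined by whether the table declares its own LOCATION, not by where the database location points.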



Although the Databricks Utilities Secrets module provides tools to store sensitive credentials and avoid accidentally displaying them in plain text, users should still be careful about which credentials are stored there and which users have access to those secrets.

Which statement describes a limitation of Databricks Secrets?

  A. Because the SHA256 hash is used to obfuscate stored secrets, reversing this hash will display the value in plain text.
  B. Account administrators can see all secrets in plain text by logging on to the Databricks Accounts console.
  C. Secrets are stored in an administrators-only table within the Hive Metastore; database administrators have permission to query this table by default.
  D. Iterating through a stored secret and printing each character will display secret contents in plain text.
  E. The Databricks REST API can be used to list secrets in plain text if the personal access token has proper credentials.

Answer(s): D
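The limitation in answer D can be illustrated with a small simulation: the notebook console redacts a secret value when printed whole, but because redaction is string-based, emitting the characters one at a time bypasses it. The display() function below is a stand-in for Databricks' console behavior, not its real implementation.

```python
SECRET = "s3cr3t"  # stand-in for a value fetched via dbutils.secrets.get()

def display(value: str) -> str:
    # Simulates the notebook console: the exact secret string is redacted.
    return "[REDACTED]" if value == SECRET else value

print(display(SECRET))                       # prints [REDACTED]
print(" ".join(display(c) for c in SECRET))  # prints s 3 c r 3 t
```

In a real notebook the same effect is achieved by looping over the secret returned by dbutils.secrets.get() and printing each character, which is why access to secret scopes must be tightly controlled.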



Which statement is true regarding the retention of job run history?

  A. It is retained until you export or delete job run logs
  B. It is retained for 30 days, during which time you can deliver job run logs to DBFS or S3
  C. It is retained for 60 days, during which you can export notebook run results to HTML
  D. It is retained for 60 days, after which logs are archived
  E. It is retained for 90 days or until the run-id is re-used through custom run configuration

Answer(s): C



A data engineer, User A, has promoted a new pipeline to production by using the REST API to programmatically create several jobs. A DevOps engineer, User B, has configured an external orchestration tool to trigger job runs through the REST API. Both users authorized the REST API calls using their personal access tokens.

Which statement describes the contents of the workspace audit logs concerning these events?

  A. Because the REST API was used for job creation and triggering runs, a Service Principal will be automatically used to identify these events.
  B. Because User B last configured the jobs, their identity will be associated with both the job creation events and the job run events.
  C. Because these events are managed separately, User A will have their identity associated with the job creation events and User B will have their identity associated with the job run events.
  D. Because the REST API was used for job creation and triggering runs, user identity will not be captured in the audit logs.
  E. Because User A created the jobs, their identity will be associated with both the job creation events and the job run events.

Answer(s): C




