The data engineering team is migrating an enterprise system with thousands of tables and views into the Lakehouse. They plan to implement the target architecture using a series of bronze, silver, and gold tables. Bronze tables will almost exclusively be used by production data engineering workloads, while silver tables will be used to support both data engineering and machine learning workloads. Gold tables will largely serve business intelligence and reporting purposes. While personal identifying information (PII) exists in all tiers of data, pseudonymization and anonymization rules are in place for all data at the silver and gold levels. The organization is interested in reducing security concerns while maximizing the ability to collaborate across diverse teams. Which statement exemplifies best practices for implementing this system?
Answer(s): A
The data architect has mandated that all tables in the Lakehouse should be configured as external Delta Lake tables. Which approach will ensure that this requirement is met?
Answer(s): C
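As background for the question above: in Spark SQL on Databricks, a table created with an explicit LOCATION clause is registered as an external (unmanaged) table, so dropping it leaves the underlying files in place. The sketch below only builds such a statement as a string. The helper name `external_table_ddl`, the table name, the columns, and the storage path are all hypothetical and chosen purely for illustration.

```python
def external_table_ddl(table_name, columns, location):
    """Build a CREATE TABLE statement for an external Delta table.

    `columns` is a list of (name, type) pairs. Every value passed in the
    example below is a made-up placeholder, not taken from the question.
    """
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    # The LOCATION clause is what makes the table external rather than managed.
    return f"CREATE TABLE {table_name} ({cols}) USING DELTA LOCATION '{location}'"


ddl = external_table_ddl(
    "sales_bronze",                      # hypothetical table name
    [("id", "BIGINT"), ("amount", "DOUBLE")],
    "/mnt/lake/bronze/sales",            # hypothetical storage path
)
print(ddl)
```

On a cluster, the resulting string could be passed to `spark.sql(ddl)`; that step is left out here so the sketch runs without a Spark session.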
To reduce storage and compute costs, the data engineering team has been tasked with curating a series of aggregate tables leveraged by business intelligence dashboards, customer-facing applications, production machine learning models, and ad hoc analytical queries. The data engineering team has been made aware of new requirements from a customer-facing application, which is the only downstream workload they manage entirely. As a result, an aggregate table used by numerous teams across the organization will need to have a number of fields renamed, and additional fields will also be added. Which of the solutions addresses the situation while minimally interrupting other teams in the organization without increasing the number of tables that need to be managed?
Answer(s): B
A Delta Lake table representing metadata about content posts from users has the following schema:
user_id LONG, post_text STRING, post_id STRING, longitude FLOAT, latitude FLOAT, post_time TIMESTAMP, date DATE
This table is partitioned by the date column. A query is run with the following filter:
longitude < 20 & longitude > -20
Which statement describes how data will be filtered?
Answer(s): D
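As background for the question above: Delta Lake records per-file column statistics (such as minimum and maximum values) in the transaction log, and the engine can use them to skip files that cannot contain matching rows, even when the filter column is not the partition column. The following is a minimal pure-Python sketch of that idea, not Delta Lake's actual implementation. The `FileStats` class, the file paths, and the statistic values are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical stand-in for the per-file min/max statistics in the Delta log."""
    path: str
    min_longitude: float
    max_longitude: float


def files_to_scan(files, lower, upper):
    """Keep files whose [min, max] longitude range may overlap the open interval (lower, upper)."""
    return [
        f for f in files
        if f.max_longitude > lower and f.min_longitude < upper
    ]


# Invented statistics for four data files spread over two date partitions.
files = [
    FileStats("date=2024-01-01/part-0.parquet", -120.0, -75.0),
    FileStats("date=2024-01-01/part-1.parquet", -30.0, 10.0),
    FileStats("date=2024-01-02/part-0.parquet", 15.0, 140.0),
    FileStats("date=2024-01-02/part-1.parquet", 25.0, 60.0),
]

# Filter from the question: longitude < 20 and longitude > -20.
candidates = files_to_scan(files, lower=-20.0, upper=20.0)
print([f.path for f in candidates])
```

Note that the date partitioning does not help here because the filter is on longitude. Files are pruned only by their statistics, and a surviving file may still contain rows outside the range, so its rows must be filtered after reading.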
A small company based in the United States has recently contracted a consulting firm in India to implement several new data engineering pipelines to power artificial intelligence applications. All the company's data is stored in regional cloud storage in the United States. The workspace administrator at the company is uncertain about where the Databricks workspace used by the contractors should be deployed. Assuming that all data governance considerations are accounted for, which statement accurately informs this decision?
The downstream consumers of a Delta Lake table have been complaining about data quality issues impacting performance in their applications. Specifically, they have complained that invalid latitude and longitude values in the activity_details table have been breaking their ability to use other geolocation processes. A junior engineer has written the following code to add CHECK constraints to the Delta Lake table: A senior engineer has confirmed the above logic is correct and the valid ranges for latitude and longitude are provided, but the code fails when executed. Which statement explains the cause of this failure?
Which of the following is true of Delta Lake and the Lakehouse?
The view updates represents an incremental batch of all newly ingested data to be inserted or updated in the customers table. The following logic is used to process these records. Which statement describes this implementation?