The business intelligence team has a dashboard configured to track various summary metrics for retail stores. This includes total sales for the previous day alongside totals and averages for a variety of time periods. The fields required to populate this dashboard have the following schema:
For demand forecasting, the Lakehouse contains a validated table of all itemized sales updated incrementally in near real-time. This table, named products_per_order, includes the following fields:
Because reporting on long-term sales trends is less volatile, analysts using the new dashboard only require data to be refreshed once daily. Because the dashboard will be queried interactively by many users throughout a normal business day, it should return results quickly and reduce total compute associated with each materialization.
Which solution meets the expectations of the end users while controlling and limiting possible costs?
- Populate the dashboard by configuring a nightly batch job to save the required values as a table overwritten with each update.
- Use Structured Streaming to configure a live dashboard against the products_per_order table within a Databricks notebook.
- Configure a webhook to execute an incremental read against products_per_order each time the dashboard is refreshed.
- Use the Delta Cache to persist the products_per_order table in memory to quickly update the dashboard with each query.
- Define a view against the products_per_order table and define the dashboard against this view.
Reveal Solution Next Question