Free Associate-Data-Practitioner Exam Braindumps (page: 5)


You need to create a new data pipeline. You want a serverless solution that meets the following requirements:

· Data is streamed from Pub/Sub and is processed in real time.

· Data is transformed before being stored.

· Data is stored in a location that will allow it to be analyzed with SQL using Looker.



Which Google Cloud services should you recommend for the pipeline?

  A. Dataproc Serverless and Bigtable
  B. Cloud Composer and Cloud SQL for MySQL
  C. BigQuery and Analytics Hub
  D. Dataflow and BigQuery

Answer(s): D

Explanation:

To build a serverless data pipeline that processes data in real-time from Pub/Sub, transforms it, and stores it for SQL-based analysis using Looker, the best solution is to use Dataflow and BigQuery. Dataflow is a fully managed service for real-time data processing and transformation, while BigQuery is a serverless data warehouse that supports SQL-based querying and integrates seamlessly with Looker for data analysis and visualization. This combination meets the requirements for real-time streaming, transformation, and efficient storage for analytical queries.
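As an illustration only, a streaming pipeline of this shape could be written with the Apache Beam Python SDK and run on Dataflow. This is a minimal sketch: the project, region, topic, bucket, table name, and message schema below are placeholders, not part of the exam scenario.

```python
# Minimal sketch of a Pub/Sub -> transform -> BigQuery streaming pipeline run
# on Dataflow with the Apache Beam Python SDK. All resource names are placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions


def to_row(message: bytes) -> dict:
    """Parse a JSON Pub/Sub payload and reshape it for BigQuery."""
    record = json.loads(message.decode("utf-8"))
    return {"order_id": record["order_id"], "amount": float(record["amount"])}


options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)
options.view_as(StandardOptions).streaming = True  # process messages in real time

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/sales-events")
        | "Transform" >> beam.Map(to_row)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.sales",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

Looker would then connect to the resulting BigQuery table (here the placeholder analytics.sales) as a standard BigQuery connection for SQL-based analysis.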



Your team wants to create a monthly report to analyze inventory data that is updated daily. You need to aggregate the inventory counts by using only the most recent month of data, and save the results to be used in a Looker Studio dashboard.
What should you do?

  A. Create a materialized view in BigQuery that uses the SUM() function and the DATE_SUB() function.
  B. Create a saved query in the BigQuery console that uses the SUM() function and the DATE_SUB() function. Re-run the saved query every month, and save the results to a BigQuery table.
  C. Create a BigQuery table that uses the SUM() function and the _PARTITIONDATE filter.
  D. Create a BigQuery table that uses the SUM() function and the DATE_DIFF() function.

Answer(s): A

Explanation:

Creating a materialized view in BigQuery with the SUM() function and the DATE_SUB() function is the best approach. Materialized views allow you to pre-aggregate and cache query results, making them efficient for repeated access, such as monthly reporting. By using the DATE_SUB() function, you can filter the inventory data to include only the most recent month. This approach ensures that the aggregation is up-to-date with minimal latency and provides efficient integration with Looker Studio for dashboarding.
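One way to combine the two functions, sketched with the BigQuery Python client, is to pre-aggregate the daily counts with SUM() in the materialized view and apply the DATE_SUB() filter in the query that backs the Looker Studio data source. The project, dataset, table, and column names below are assumptions.

```python
# Hedged sketch: dataset, table, view, and column names are assumptions.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Pre-aggregate daily inventory counts into a materialized view.
client.query("""
    CREATE MATERIALIZED VIEW IF NOT EXISTS `my-project.inventory.daily_totals` AS
    SELECT snapshot_date, item_id, SUM(inventory_count) AS total_count
    FROM `my-project.inventory.daily_counts`
    GROUP BY snapshot_date, item_id
""").result()

# Monthly report query: keep only the most recent month of data. BigQuery can
# answer this from the materialized view; point the Looker Studio data source
# at this query (or save its results to a table).
report_sql = """
    SELECT item_id, SUM(total_count) AS monthly_count
    FROM `my-project.inventory.daily_totals`
    WHERE snapshot_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH)
    GROUP BY item_id
"""
for row in client.query(report_sql).result():
    print(row.item_id, row.monthly_count)
```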



You have a BigQuery dataset containing sales data. This data is actively queried for the first 6 months. After that, the data is not queried but needs to be retained for 3 years for compliance reasons. You need to implement a data management strategy that meets access and compliance requirements, while keeping cost and administrative overhead to a minimum.
What should you do?

  A. Use BigQuery long-term storage for the entire dataset. Set up a Cloud Run function to delete the data from BigQuery after 3 years.
  B. Partition a BigQuery table by month. After 6 months, export the data to Coldline storage. Implement a lifecycle policy to delete the data from Cloud Storage after 3 years.
  C. Set up a scheduled query to export the data to Cloud Storage after 6 months. Write a stored procedure to delete the data from BigQuery after 3 years.
  D. Store all data in a single BigQuery table without partitioning or lifecycle policies.

Answer(s): B

Explanation:

Partitioning the BigQuery table by month allows efficient querying of recent data for the first 6 months, reducing query costs. After 6 months, exporting the data to Coldline storage minimizes storage costs for data that is rarely accessed but needs to be retained for compliance. Implementing a lifecycle policy in Cloud Storage automates the deletion of the data after 3 years, ensuring compliance while reducing administrative overhead. This approach balances cost efficiency and compliance requirements effectively.
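The archival side of this answer could look like the following Python sketch: export aged data to a Cloud Storage bucket (assumed to be created with the Coldline storage class) and attach a lifecycle rule that deletes objects after roughly 3 years. The project, dataset, table, and bucket names are placeholders, and the export would normally run from a scheduled job.

```python
# Hedged sketch of exporting aged BigQuery data and setting a deletion lifecycle
# rule on the Coldline archive bucket. All resource names are placeholders.
from google.cloud import bigquery, storage

PROJECT = "my-project"

# Export the table (in practice an individual month partition, e.g. addressed
# with a "table$YYYYMM" partition decorator) to the archive bucket as Avro.
bq = bigquery.Client(project=PROJECT)
job_config = bigquery.ExtractJobConfig()
job_config.destination_format = bigquery.DestinationFormat.AVRO
bq.extract_table(
    "my-project.sales.transactions",
    "gs://sales-archive/transactions/2024-05/*.avro",
    job_config=job_config,
).result()  # wait for the export to finish

# Lifecycle rule on the archive bucket: delete objects after about 3 years.
gcs = storage.Client(project=PROJECT)
bucket = gcs.get_bucket("sales-archive")
bucket.add_lifecycle_delete_rule(age=3 * 365)
bucket.patch()
```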



You have created a LookML model and dashboard that shows daily sales metrics for five regional managers to use. You want to ensure that the regional managers can only see sales metrics specific to their region. You need an easy-to-implement solution.
What should you do?

  A. Create a sales_region user attribute, and assign each manager's region as the value of their user attribute. Add an access_filter Explore filter on the region_name dimension by using the sales_region user attribute.
  B. Create five different Explores with the sql_always_filter Explore filter applied on the region_name dimension. Set each region_name value to the corresponding region for each manager.
  C. Create separate Looker dashboards for each regional manager. Set the default dashboard filter to the corresponding region for each manager.
  D. Create separate Looker instances for each regional manager. Copy the LookML model and dashboard to each instance. Provision viewer access to the corresponding manager.

Answer(s): A

Explanation:

Using a sales_region user attribute is the best solution because it allows you to dynamically filter data based on each manager's assigned region. By adding an access_filter Explore filter on the region_name dimension that references the sales_region user attribute, each manager sees only the sales metrics specific to their region. This approach is easy to implement, scalable, and avoids duplicating dashboards or Explores, making it both efficient and maintainable.
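In LookML, the Explore declares an access_filter that pairs the region_name dimension with the sales_region user attribute; the per-manager values can be set in the Looker Admin panel or through the Looker API. Below is a hedged sketch of the API route using the Looker Python SDK (looker-sdk): the method and model names are assumed to match the API 4.0 surface, and the user IDs, attribute ID, and region values are placeholders.

```python
# Hedged sketch: assign each manager's region as the value of their
# sales_region user attribute via the Looker Python SDK. Credentials are read
# from a looker.ini file or environment variables; all IDs are placeholders.
import looker_sdk
from looker_sdk import models40

sdk = looker_sdk.init40()

manager_regions = {"42": "EMEA", "43": "APAC"}   # user_id -> region (illustrative)
SALES_REGION_ATTRIBUTE_ID = "17"                 # id of the sales_region user attribute

for user_id, region in manager_regions.items():
    sdk.set_user_attribute_user_value(
        user_id=user_id,
        user_attribute_id=SALES_REGION_ATTRIBUTE_ID,
        body=models40.WriteUserAttributeWithValue(value=region),
    )
```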





