Free Google Professional Data Engineer Exam Braindumps (page: 18)

What are two methods that can be used to denormalize tables in BigQuery?

  1. 1) Split table into multiple tables; 2) Use a partitioned table
  2. 1) Join tables into one table; 2) Use nested repeated fields
  3. 1) Use a partitioned table; 2) Join tables into one table
  4. 1) Use nested repeated fields; 2) Use a partitioned table

Answer(s): B

Explanation:

The conventional method of denormalizing data involves simply writing a fact, along with all its dimensions, into a flat table structure. For example, if you are dealing with sales transactions, you would write each individual fact to a record, along with the accompanying dimensions such as order and customer information.

The other method for denormalizing data takes advantage of BigQuery's native support for nested and repeated structures in JSON or Avro input data. Expressing records using nested and repeated structures can provide a more natural representation of the underlying data. In the case of the sales order, the outer part of a JSON structure would contain the order and customer information, and the inner part of the structure would contain the individual line items of the order, which would be represented as nested, repeated elements.
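The nested, repeated representation described above can be sketched as a JSON record. This is an illustrative example with made-up field names (`order_id`, `customer`, `line_items`), showing the shape of newline-delimited JSON that BigQuery accepts as input:

```python
import json

# A hypothetical sales order expressed with nested, repeated fields.
# The outer level holds the order and customer information; the
# repeated "line_items" array holds the individual items, mirroring
# the nested structure BigQuery supports in JSON (or Avro) input.
order = {
    "order_id": "SO-1001",  # illustrative field names
    "customer": {"id": "C-42", "name": "Acme Corp"},
    "line_items": [  # nested, repeated element
        {"sku": "WIDGET-1", "qty": 3, "unit_price": 9.99},
        {"sku": "GADGET-7", "qty": 1, "unit_price": 24.50},
    ],
}

# BigQuery loads JSON as newline-delimited records, one order per line.
ndjson_line = json.dumps(order)
print(ndjson_line)
```

Each order becomes a single denormalized record, so no join is needed at query time to connect line items to their order.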


Reference:

https://cloud.google.com/solutions/bigquery-data-warehouse#denormalizing_data



Which of these is not a supported method of putting data into a partitioned table?

  1. If you have existing data in a separate file for each day, then create a partitioned table and upload each file into the appropriate partition.
  2. Run a query to get the records for a specific day from an existing table and for the destination table, specify a partitioned table ending with the day in the format "$YYYYMMDD".
  3. Create a partitioned table and stream new records to it every day.
  4. Use ORDER BY to put a table's rows into chronological order and then change the table's type to "Partitioned".

Answer(s): D

Explanation:

You cannot change an existing table into a partitioned table. You must create a partitioned table from scratch. Then you can either stream data into it every day and the data will automatically be put in the right partition, or you can load data into a specific partition by using "$YYYYMMDD" at the end of the table name.
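The "$YYYYMMDD" partition decorator mentioned above is simply appended to the table name when targeting a specific daily partition. A minimal sketch of building that destination string (an illustrative helper, not a Google client-library function; the table name is made up):

```python
from datetime import date

def partition_decorator(table: str, day: date) -> str:
    """Build the table$YYYYMMDD decorator used when loading data
    into a specific daily partition of a partitioned table."""
    return f"{table}${day:%Y%m%d}"

# Loading the records for March 1, 2024 would target this destination:
dest = partition_decorator("mydataset.sales", date(2024, 3, 1))
print(dest)  # mydataset.sales$20240301
```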


Reference:

https://cloud.google.com/bigquery/docs/partitioned-tables



Which of these operations can you perform from the BigQuery Web UI?

  1. Upload a file in SQL format.
  2. Load data with nested and repeated fields.
  3. Upload a 20 MB file.
  4. Upload multiple files using a wildcard.

Answer(s): B

Explanation:

You can load data with nested and repeated fields using the Web UI.

You cannot use the Web UI to:

- Upload a file greater than 10 MB in size

- Upload multiple files at the same time

- Upload a file in SQL format

All three of the above operations can be performed using the "bq" command.


Reference:

https://cloud.google.com/bigquery/loading-data



Which methods can be used to reduce the number of rows processed by BigQuery?

  1. Splitting tables into multiple tables; putting data in partitions
  2. Splitting tables into multiple tables; putting data in partitions; using the LIMIT clause
  3. Putting data in partitions; using the LIMIT clause
  4. Splitting tables into multiple tables; using the LIMIT clause

Answer(s): A

Explanation:

If you split a table into multiple tables (such as one table for each day), then you can limit your query to the data in specific tables (such as for particular days). A better method is to use a partitioned table, provided your data can be separated by day.

If you use the LIMIT clause, BigQuery will still process the entire table.
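The distinction above can be shown with a toy model (an illustration only, not how the BigQuery engine actually works): a LIMIT clause truncates the result after every row has been read, while partition pruning skips whole partitions before any rows are read.

```python
# Toy dataset: one row per day for 30 days of a month.
rows = [{"day": f"2024-03-{d:02d}", "amount": d} for d in range(1, 31)]

def scan_with_limit(table, limit):
    processed = len(table)  # every row is still read
    return table[:limit], processed

def scan_partition(partitions, day):
    pruned = partitions.get(day, [])  # only one partition is read
    return pruned, len(pruned)

# Group the rows into daily partitions.
partitions = {}
for r in rows:
    partitions.setdefault(r["day"], []).append(r)

_, limit_cost = scan_with_limit(rows, 5)
_, part_cost = scan_partition(partitions, "2024-03-05")
print(limit_cost, part_cost)  # 30 1
```

In the toy model, LIMIT 5 still "processes" all 30 rows, whereas querying a single partition processes only the rows in that partition.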


Reference:

https://cloud.google.com/bigquery/docs/partitioned-tables





