Free Professional Data Engineer Exam Braindumps (page: 15)


You are choosing a NoSQL database to handle telemetry data submitted from millions of Internet-of-Things (IoT) devices. The volume of data is growing at 100 TB per year, and each data entry has about 100 attributes. The data processing pipeline does not require atomicity, consistency, isolation, and durability (ACID). However, high availability and low latency are required.

You need to analyze the data by querying against individual fields.
Which three databases meet your requirements? (Choose three.)

  A. Redis
  B. HBase
  C. MySQL
  D. MongoDB
  E. Cassandra
  F. HDFS with Hive

Answer(s): B,D,F

Explanation:



Suppose you have a table that includes a nested column called "city" inside a column called "person", but when you try to submit the following query in BigQuery, it gives you an error.

SELECT person FROM `project1.example.table1` WHERE city = "London"

How would you correct the error?

  A. Add ", UNNEST(person)" before the WHERE clause.
  B. Change "person" to "person.city".
  C. Change "person" to "city.person".
  D. Add ", UNNEST(city)" before the WHERE clause.

Answer(s): A

Explanation:

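To access the person.city column, you need to apply UNNEST(person) and join the result to table1 using a comma, which BigQuery standard SQL treats as an implicit CROSS JOIN. Each element of person then becomes its own row, so its city field can be referenced in the WHERE clause.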
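The effect of the comma join with UNNEST can be modeled in plain Python. This is only an illustrative sketch, not BigQuery code: the rows in `table1` and the helper `unnest_join` are hypothetical, and it assumes `person` is a repeated record (an array of structs), which is the case where UNNEST is needed.

```python
# Hypothetical rows shaped like table1; "person" is assumed to be a repeated record.
table1 = [
    {"id": 1, "person": [{"name": "Ann", "city": "London"},
                         {"name": "Bob", "city": "Paris"}]},
    {"id": 2, "person": [{"name": "Cy", "city": "Berlin"}]},
    {"id": 3, "person": []},  # an empty array yields no joined rows
]


def unnest_join(rows, column):
    """Pair each row with every element of its repeated column.

    This mimics the correlated cross join of `FROM table1, UNNEST(person)`.
    """
    for row in rows:
        for element in row[column]:
            yield row, element


# Roughly: SELECT id FROM table1, UNNEST(person) WHERE city = "London"
matches = [row["id"] for row, p in unnest_join(table1, "person")
           if p["city"] == "London"]
print(matches)  # [1]
```

Without the unnesting step there is no row-level `city` value to filter on, which is why the original query fails.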


Reference:

https://cloud.google.com/bigquery/docs/reference/standard-sql/migrating-from-legacy-sql#nested_repeated_results



What are two of the benefits of using denormalized data structures in BigQuery?

  A. Reduces the amount of data processed, reduces the amount of storage required
  B. Increases query speed, makes queries simpler
  C. Reduces the amount of storage required, increases query speed
  D. Reduces the amount of data processed, increases query speed

Answer(s): B

Explanation:

Denormalization increases query speed for tables with billions of rows because BigQuery's performance degrades when doing JOINs on large tables, but with a denormalized data structure, you don't have to use JOINs, since all of the data has been combined into one table. Denormalization also makes queries simpler because you do not have to use JOIN clauses.

Denormalization increases the amount of data processed and the amount of storage required because it creates redundant data.
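The trade-off can be seen on a toy dataset. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, so it shows the shape of the queries and the redundancy, not BigQuery performance; the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'London'), (2, 'Paris');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);

    -- Denormalized copy: the customer's city is repeated on every order row.
    CREATE TABLE orders_denorm AS
        SELECT o.order_id, o.amount, c.city
        FROM orders AS o
        JOIN customers AS c ON o.customer_id = c.customer_id;
""")

# Normalized layout: the query needs a JOIN.
normalized_query = """
    SELECT c.city, SUM(o.amount)
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.customer_id
    GROUP BY c.city ORDER BY c.city
"""

# Denormalized layout: same answer from a single table, no JOIN.
denormalized_query = """
    SELECT city, SUM(amount) FROM orders_denorm GROUP BY city ORDER BY city
"""

normalized_result = conn.execute(normalized_query).fetchall()
denormalized_result = conn.execute(denormalized_query).fetchall()

# Redundancy: the city is stored once per customer vs. once per order.
city_values_normalized = conn.execute(
    "SELECT COUNT(city) FROM customers").fetchone()[0]
city_values_denorm = conn.execute(
    "SELECT COUNT(city) FROM orders_denorm").fetchone()[0]

print(normalized_result == denormalized_result)    # True
print(city_values_normalized, city_values_denorm)  # 2 3
```

Both queries return the same totals, but the denormalized table stores the city three times instead of twice, which is the extra storage and processed data the explanation refers to.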


Reference:

https://cloud.google.com/solutions/bigquery-data-warehouse#denormalizing_data



Which of these statements about exporting data from BigQuery is false?

  A. To export more than 1 GB of data, you need to put a wildcard in the destination filename.
  B. The only supported export destination is Google Cloud Storage.
  C. Data can only be exported in JSON or Avro format.
  D. The only compression option available is GZIP.

Answer(s): C

Explanation:

Data can be exported in CSV, JSON, or Avro format. If you are exporting nested or repeated data, then CSV format is not supported.
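The CSV restriction can be expressed as a small check over a table schema. This is a minimal sketch, assuming the schema is given as a list of dictionaries with "name", "type", and "mode" keys in the style of a BigQuery JSON schema; the helper `csv_export_supported` and the sample schemas are hypothetical and not part of any BigQuery client library.

```python
def csv_export_supported(schema):
    """Return False if any top-level field is nested or repeated.

    Nested fields are assumed to have type RECORD (or STRUCT);
    repeated fields are assumed to have mode REPEATED.
    """
    for field in schema:
        is_nested = field.get("type") in ("RECORD", "STRUCT")
        is_repeated = field.get("mode") == "REPEATED"
        if is_nested or is_repeated:
            return False
    return True


# Hypothetical flat schema: CSV export is possible.
flat_schema = [
    {"name": "device_id", "type": "STRING", "mode": "REQUIRED"},
    {"name": "reading", "type": "FLOAT", "mode": "NULLABLE"},
]

# Hypothetical schema with a repeated record: use JSON or Avro instead.
nested_schema = [
    {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "person", "type": "RECORD", "mode": "REPEATED"},
]

print(csv_export_supported(flat_schema))    # True
print(csv_export_supported(nested_schema))  # False
```

Only top-level fields are inspected, since any nested field implies a top-level RECORD column.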


Reference:

https://cloud.google.com/bigquery/docs/exporting-data





