Free Professional Data Engineer Exam Braindumps (page: 26)

Page 26 of 68

Which of these are examples of a value in a sparse vector? (Select 2 answers.)

  A. [0, 5, 0, 0, 0, 0]
  B. [0, 0, 0, 1, 0, 0, 1]
  C. [0, 1]
  D. [1, 0, 0, 0, 0, 0, 0]

Answer(s): C,D

Explanation:

Categorical features in linear models are typically translated into a sparse vector in which each possible value has a corresponding index or id. For example, if there are only three possible eye colors, you can represent 'eye_color' as a length-3 vector: 'brown' becomes [1, 0, 0], 'blue' becomes [0, 1, 0], and 'green' becomes [0, 0, 1]. These vectors are called "sparse" because they may be very long, with many zeros, when the set of possible values is very large (such as all English words).
In this encoding, each vector represents a single value, so it contains only 0s and exactly one 1. [0, 0, 0, 1, 0, 0, 1] does not qualify because it has two 1s in it, and [0, 5, 0, 0, 0, 0] does not qualify because it contains a 5.
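The encoding described above can be sketched in a few lines of Python. This is only an illustration: the helper names `one_hot` and `is_one_hot` and the `EYE_COLORS` list are made up for this example and are not part of TensorFlow or any other library.

```python
# Hypothetical vocabulary; the order fixes the index of each category.
EYE_COLORS = ["brown", "blue", "green"]


def one_hot(value, vocabulary):
    """Return a list with a 1 at the index of `value` and 0s elsewhere."""
    if value not in vocabulary:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if item == value else 0 for item in vocabulary]


def is_one_hot(vector):
    """Check the rule used in the explanation: only 0s and 1s, exactly one 1."""
    return all(x in (0, 1) for x in vector) and sum(vector) == 1


if __name__ == "__main__":
    for color in EYE_COLORS:
        print(color, one_hot(color, EYE_COLORS))
    # Apply the rule to the four answer options from the question.
    for option in ([0, 5, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 1], [0, 1], [1, 0, 0, 0, 0, 0, 0]):
        print(option, is_one_hot(option))
```

Under this check, only the options with a single 1 and no other non-zero entries pass, which matches answers C and D.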


Reference:

https://www.tensorflow.org/tutorials/linear#feature_columns_and_transformations



Which software libraries are supported by Cloud Machine Learning Engine?

  A. Theano and TensorFlow
  B. Theano and Torch
  C. TensorFlow
  D. TensorFlow and Torch

Answer(s): C

Explanation:

Cloud ML Engine mainly does two things:
  1. Enables you to train machine learning models at scale by running TensorFlow training applications in the cloud.
  2. Hosts those trained models for you in the cloud so that you can use them to get predictions about new data.


Reference:

https://cloud.google.com/ml-engine/docs/technical-overview#what_it_does



Which of the following statements about Legacy SQL and Standard SQL is not true?

  A. Standard SQL is the preferred query language for BigQuery.
  B. If you write a query in Legacy SQL, it might generate an error if you try to run it with Standard SQL.
  C. One difference between the two query languages is how you specify fully-qualified table names (i.e. table names that include their associated project name).
  D. You need to set a query language for each dataset and the default is Standard SQL.

Answer(s): D

Explanation:

You do not set a query language for each dataset. It is set each time you run a query, and the default query language is Legacy SQL.
Standard SQL has been the preferred query language since BigQuery 2.0 was released. In Legacy SQL, to query a table with a project-qualified name, you use a colon (:) as the separator between the project and the dataset. In Standard SQL, you use a period (.) instead. Because of syntax differences such as this one, a query written in Legacy SQL might generate an error if you try to run it with Standard SQL.
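To make the separator difference concrete, here is a minimal Python sketch that rewrites a Legacy SQL table reference of the form `[project:dataset.table]` into the backtick-quoted Standard SQL form. The function name `legacy_to_standard_table` is hypothetical and not part of any BigQuery client library, and the pattern assumes a simple project id with no colon in it.

```python
import re

# Matches a Legacy SQL reference such as [my-project:my_dataset.my_table].
# Assumption: the project id itself contains no colon.
_LEGACY_REF = re.compile(
    r"^\[(?P<project>[^:\]]+):(?P<dataset>[^.\]]+)\.(?P<table>[^\]]+)\]$"
)


def legacy_to_standard_table(ref):
    """Convert [project:dataset.table] to `project.dataset.table`."""
    match = _LEGACY_REF.match(ref)
    if match is None:
        raise ValueError(f"not a project-qualified Legacy SQL reference: {ref!r}")
    return "`{project}.{dataset}.{table}`".format(**match.groupdict())


if __name__ == "__main__":
    legacy = "[bigquery-public-data:samples.shakespeare]"
    print("Legacy SQL:  ", f"SELECT word FROM {legacy} LIMIT 5")
    print("Standard SQL:", f"SELECT word FROM {legacy_to_standard_table(legacy)} LIMIT 5")
```

Running the Legacy SQL form under Standard SQL (or the reverse) is the kind of mismatch that produces the error mentioned in option B.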


Reference:

https://cloud.google.com/bigquery/docs/reference/standard-sql/migrating-from-legacy-sql



Which of these is not a supported method of putting data into a partitioned table?

  A. If you have existing data in a separate file for each day, then create a partitioned table and upload each file into the appropriate partition.
  B. Run a query to get the records for a specific day from an existing table and for the destination table, specify a partitioned table ending with the day in the format "$YYYYMMDD".
  C. Create a partitioned table and stream new records to it every day.
  D. Use ORDER BY to put a table's rows into chronological order and then change the table's type to "Partitioned".

Answer(s): D

Explanation:

You cannot change an existing table into a partitioned table. You must create a partitioned table from scratch. Then you can either stream data into it every day and the data will automatically be put in the right partition, or you can load data into a specific partition by using "$YYYYMMDD" at the end of the table name.
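As a small illustration of the "$YYYYMMDD" suffix, the following Python sketch builds the name of a specific partition from a table name and a date. The helper `partition_decorator` and the table name `mydataset.events` are hypothetical and chosen only for this example; the code only formats the string and does not call BigQuery.

```python
from datetime import date


def partition_decorator(table, day):
    """Append the $YYYYMMDD suffix that addresses one daily partition."""
    return f"{table}${day.strftime('%Y%m%d')}"


if __name__ == "__main__":
    # Hypothetical table name; substitute your own dataset and table.
    target = partition_decorator("mydataset.events", date(2023, 6, 16))
    print(target)
```

The resulting string is what you would pass as the destination table when loading a file or writing query results into that day's partition.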


Reference:

https://cloud.google.com/bigquery/docs/partitioned-tables


