Databricks DATABRICKS-CERTIFIED-PROFESSIONAL-DATA-SCIENTIST Exam Questions
Databricks Certified Professional Data Scientist Exam (Page 4)

Updated On: 21-Feb-2026

Select the case for which regression algorithms are not the best fit.

  1. The dimensions of an object
  2. The weight of a person
  3. The temperature in the atmosphere
  4. Employee status

Answer(s): D

Explanation:

Regression algorithms are usually employed when the data points are inherently numerical (such as the dimensions of an object, the weight of a person, or the temperature in the atmosphere). Unlike Bayesian algorithms, they are not well suited to categorical data (such as employee status or a credit score description).
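
The distinction above can be sketched with a small ordinary least squares fit. This is a minimal illustration only: the height and weight values are made-up toy data, and `fit_line` is a hypothetical helper name, not part of any exam material or library.

```python
# Hypothetical toy data: heights (cm) and weights (kg), both numerical.
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [50.0, 58.0, 66.0, 74.0, 82.0]

# A categorical variable has labels with no numeric scale or ordering,
# so fitting a line through it is not meaningful.
employee_status = ["full-time", "contractor", "intern"]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(heights, weights)

# Predict a numerical target (weight) for an unseen height.
predicted = slope * 175.0 + intercept
print(round(predicted, 1))
```

The numerical target gives the fitted line something to interpolate between, whereas the status labels would first need an encoding, at which point a classifier is usually the more natural choice.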



Question-13.
Which of the following is not a classification algorithm?

  1. Logistic Regression
  2. Support Vector Machine
  3. Neural Network
  4. Hidden Markov Models
  5. None of the above

Answer(s): E

Explanation:

Logistic regression
Logistic regression is a model used to predict the probability that an event occurs. It makes use of several predictor variables, which may be either numerical or categorical.

Support Vector Machines
As with naive Bayes, Support Vector Machines (SVMs) can be used to assign objects to classes, but the way they solve this task is completely different from the naive Bayes approach.

Neural Network
Neural networks are a means of classifying multidimensional objects.

Hidden Markov Models
Hidden Markov Models are used in multiple areas of machine learning, such as speech recognition, handwritten letter recognition, and natural language processing.

Since each of the listed techniques can be used for classification, none of them is the exception.
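
To make the first of these concrete, here is a minimal sketch of logistic regression used as a binary classifier, trained with plain gradient descent. The data (hours studied versus pass/fail), the learning rate, and the function names are all assumptions chosen for illustration; a real project would more likely use a library implementation.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of log-loss w.r.t. the score
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Assign class 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical toy data: hours studied -> failed (0) or passed (1).
hours = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
passed = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(hours, passed)
print(predict(w, b, 1.0), predict(w, b, 4.0))
```

The model outputs a probability, and thresholding that probability is what turns it into a class assignment.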



Suppose a man told you he had a nice conversation with someone on the train. Not knowing anything about this conversation, the probability that he was speaking to a woman is 50% (assuming the train had an equal number of men and women and the speaker was as likely to strike up a conversation with a man as with a woman). Now suppose he also told you that his conversational partner had long hair. It is now more likely he was speaking to a woman, since women are more likely to have long hair than men. ____________ can be used to calculate the probability that the person was a woman.

  1. SVM
  2. MLE
  3. Bayes' theorem
  4. Logistic Regression

Answer(s): C

Explanation:

To see how this is done, let W represent the event that the conversation was held with a woman, and L denote the event that the conversation was held with a long-haired person. It can be assumed that women constitute half the population for this example. So, not knowing anything else, the probability that W occurs is P(W) = 0.5. Suppose it is also known that 75% of women have long hair, which we denote as P(L | W) = 0.75 (read: the probability of event L given event W is 0.75, meaning that the probability of a person having long hair, given that we already know the person is a woman, is 75%). Likewise, suppose it is known that 15% of men have long hair, or P(L | M) = 0.15, where M is the complementary event of W, i.e., the event that the conversation was held with a man (assuming that every human is either a man or a woman), so that P(M) = 0.5. Our goal is to calculate the probability that the conversation was held with a woman, given the fact that the person had long hair, or, in our notation, P(W | L). Using the formula for Bayes' theorem, we have:

$$P(W \mid L) = \frac{P(L \mid W)\,P(W)}{P(L)} = \frac{P(L \mid W)\,P(W)}{P(L \mid W)\,P(W) + P(L \mid M)\,P(M)}$$

where we have used the law of total probability to expand P(L). The numeric answer can be obtained by substituting the above values into this formula (multiplication is written with the centered dot). This yields

$$P(W \mid L) = \frac{0.75 \cdot 0.50}{0.75 \cdot 0.50 + 0.15 \cdot 0.50} = \frac{0.375}{0.45} \approx 0.83$$

i.e., the probability that the conversation was held with a woman, given that the person had long hair, is about 83%.
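
The same calculation can be checked in a few lines of Python. This is only a sketch: the function and parameter names are made up for illustration, and the inputs are the probabilities stated in the explanation (P(W) = 0.5, P(L | W) = 0.75, P(L | M) = 0.15).

```python
def posterior_woman(prior_w, p_long_given_w, p_long_given_m):
    """Bayes' theorem for P(W | L), with M the complement of W."""
    prior_m = 1.0 - prior_w
    # Law of total probability: P(L) = P(L|W) P(W) + P(L|M) P(M)
    p_long = p_long_given_w * prior_w + p_long_given_m * prior_m
    return p_long_given_w * prior_w / p_long


p = posterior_woman(prior_w=0.5, p_long_given_w=0.75, p_long_given_m=0.15)
print(f"{p:.4f}")  # prints 0.8333
```

Changing the prior shows how much the answer depends on it: with fewer women on the train, the same long-hair evidence yields a lower posterior.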



Which of the following could be features?

  1. Words in the document
  2. Symptoms of a disease
  3. Characteristics of an unidentified object
  4. Only 1 and 2
  5. All of 1, 2, and 3 are possible

Answer(s): E

Explanation:

A classifier can work with any dataset that can be turned into lists of features. A feature is simply something that is either present or absent for a given item. In the case of documents, the features are the words in the document, but they could also be characteristics of an unidentified object, symptoms of a disease, or anything else that can be said to be present or absent.
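
As a rough sketch of the document case, the snippet below turns a text into a set of word features, where membership in the set means "present". The `get_features` name, the tokenization rule (runs of ASCII letters, lowercased), and the sample text are assumptions made for illustration.

```python
import re


def get_features(text):
    """Return the set of distinct lowercase words found in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return set(words)


# Hypothetical sample document.
doc = "Cheap pills, cheap offers"
features = get_features(doc)

# Each feature is either present or absent for this item.
print("cheap" in features)    # prints True
print("meeting" in features)  # prints False
```

Symptoms of a disease or characteristics of an object could be represented the same way, as a set of labels that are present for a given item.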



Refer to the image below.

  1. Option A
  2. Option B
  3. Option C
  4. Option D

Answer(s): A

Explanation:





