Free Databricks Databricks-Certified-Professional-Data-Scientist Practice Exam with Questions & Answers | Set: 2

Questions 11

You are creating a classification process whose inputs are the income, education, and current debt of a customer. What could be a possible output of this process?

Options:
A.

Probability that the customer defaults on loan repayment

B.

Percentage of the customer's loan repayment capability

C.

Percentage indicating whether the customer should be given a loan or not

D.

The output might be a risk class, such as "good", "acceptable", "average", or "unacceptable".

Questions 12

You are working on a clustering solution for customer datasets. Almost 40 variables are available for each customer, and data is available for almost 1.00,0000 customers. You want to reduce the number of variables used for clustering. What would you do?

Options:
A.

You will randomly reduce the number of variables

B.

You will find the correlation among the variables and discard the variables that are not correlated.

C.

You will find the correlation among the variables and, from each group of highly correlated variables, keep only one or two.

D.

You cannot discard any variable for creating clusters.

E.

You can combine several variables in one variable
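
As an illustration of correlation-based variable reduction, here is a minimal sketch in plain Python. The column names, the sample values, and the 0.9 threshold are assumptions made up for this example; it also assumes no column is constant, since Pearson correlation is undefined in that case.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def drop_correlated(columns, threshold=0.9):
    """Keep a column only if |r| against every already-kept column is below threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical customer data: monthly_spend is almost proportional to income.
customers = {
    "income":        [30, 45, 60, 80, 100],
    "monthly_spend": [3.1, 4.4, 6.2, 7.9, 10.1],
    "age":           [52, 23, 41, 35, 29],
}

selected = drop_correlated(customers)
print(selected)  # ['income', 'age']
```

Here monthly_spend is dropped because it carries nearly the same information as income, while age is kept because it is only weakly correlated with income.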

Questions 13

Which of the following is not a correct application of classification?

Options:
A.

credit scoring

B.

tumor detection

C.

image recognition

D.

drug discovery

Questions 14

Select the correct statement which applies to K-Nearest Neighbors

Options:
A.

No Assumption about the data

B.

Computationally expensive

C.

Requires less memory

D.

Works with Numeric Values
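
For reference, the mechanics of K-Nearest Neighbors can be sketched in a few lines of plain Python. The points, labels, and the choice of k=3 with Euclidean distance are assumptions for this example, not part of the question.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k closest training points."""
    # No model is fit in advance: every prediction scans all stored points.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional numeric training data.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(points, labels, (1.1, 1.0))
print(prediction)  # low
```

The sketch keeps the whole training set in memory and computes a distance to every stored point for each query, which is how the basic form of the algorithm behaves.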

Questions 15

You are using k-means clustering to classify heart patients for a hospital. You have chosen Patient Sex, Height, Weight, Age and Income as measures and have used 3 clusters. When you create a pair-wise plot of the clusters, you notice that there is significant overlap between the clusters. What should you do?

Options:
A.

Identify additional measures to add to the analysis

B.

Remove one of the measures

C.

Decrease the number of clusters

D.

Increase the number of clusters

Questions 16

Select the correct option which applies to L2 regularization

Options:
A.

Computationally efficient due to having analytical solutions

B.

Non-sparse outputs

C.

No feature selection
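
To make the L2 penalty concrete, consider the simplest case: one feature, no intercept. Minimizing sum((y - w*x)^2) + lam*w^2 has the closed-form solution w = sum(x*y) / (sum(x^2) + lam). The sketch below evaluates it for a few penalty strengths; the data and the lam values are assumptions chosen for this example.

```python
def ridge_weight(x, y, lam):
    """Closed-form ridge solution for one feature without an intercept."""
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    return sum_xy / (sum_xx + lam)

# Hypothetical data generated by y = 2 * x.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

weights = [ridge_weight(x, y, lam) for lam in (0.0, 14.0, 126.0)]
print(weights)  # approximately [2.0, 1.0, 0.2]
```

In this example the weight shrinks toward zero as the penalty grows, but it never becomes exactly zero for any finite lam.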

Questions 17

The feature hashing approach is described as: "SGD-based classifiers avoid the need to predetermine vector size by simply picking a reasonable size and shoehorning the training data into vectors of that size." With large vectors, or with multiple locations per feature, which of the following applies to feature hashing?

Options:
A.

Is a problem with accuracy

B.

It is hard to understand what the classifier is doing

C.

It is easy to understand what the classifier is doing

D.

Is a problem with accuracy as well as hard to understand what the classifier is doing
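
The idea in the quoted passage can be sketched as follows: each token is hashed into a slot of a fixed-size vector, optionally into several slots ("multiple locations per feature"). The function name, the vector size of 8, the use of MD5 as the hash, and the sample tokens are all assumptions for this example rather than any particular library's implementation.

```python
import hashlib

def hashed_vector(tokens, size=8, probes=1):
    """Encode tokens into a fixed-size vector; each token lands in `probes` slots."""
    vec = [0.0] * size
    for tok in tokens:
        for p in range(probes):
            # A stable hash (unlike Python's built-in hash() for strings,
            # which is randomized per process).
            digest = hashlib.md5(f"{p}:{tok}".encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % size
            vec[index] += 1.0
    return vec

doc = ["income", "debt", "education", "income"]
one_location = hashed_vector(doc, size=8, probes=1)
two_locations = hashed_vector(doc, size=8, probes=2)
print(one_location)
print(two_locations)
```

The vector length is fixed no matter how many distinct tokens appear. Because different tokens can share a slot, the value in a slot cannot always be attributed to a single original feature.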

Questions 18

Which of the following are point estimation methods?

Options:
A.

MAP

B.

MLE

C.

MMSE
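
As a worked example of the three estimators, consider estimating a coin's probability of heads from h heads in n flips with a Beta(a, b) prior. The standard results are MLE = h/n, MAP = (h + a - 1)/(n + a + b - 2) (the posterior mode, valid when both posterior parameters exceed 1), and MMSE = (h + a)/(n + a + b) (the posterior mean). The counts and prior parameters below are assumptions for this example.

```python
from fractions import Fraction

def mle(heads, flips):
    """Maximum likelihood estimate of the heads probability."""
    return Fraction(heads, flips)

def map_estimate(heads, flips, a, b):
    """Posterior mode under a Beta(a, b) prior (needs heads + a > 1 and tails + b > 1)."""
    return Fraction(heads + a - 1, flips + a + b - 2)

def mmse(heads, flips, a, b):
    """Posterior mean under a Beta(a, b) prior."""
    return Fraction(heads + a, flips + a + b)

# Hypothetical data: 7 heads in 10 flips, with a Beta(2, 2) prior.
estimates = (mle(7, 10), map_estimate(7, 10, 2, 2), mmse(7, 10, 2, 2))
print(estimates)  # (Fraction(7, 10), Fraction(2, 3), Fraction(9, 14))
```

Each method returns a single number for the parameter rather than an interval or a full distribution.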

Questions 19

Suppose there are three events. Which formula must always be equal to P(E1|E2,E3)?

Options:
A.

P(E1,E2,E3)P(E1)/P(E2,E3)

B.

P(E1,E2,E3)/P(E2,E3)

C.

P(E1,E2|E3)P(E2|E3)P(E3)

D.

P(E1,E2|E3)P(E3)

E.

P(E1,E2,E3)P(E2)P(E3)
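
The definition of conditional probability, P(A|B) = P(A,B)/P(B) for P(B) > 0, can be checked numerically by enumeration. The sketch below uses two fair dice; the three events are assumptions chosen for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair dice, 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate over outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def e1(o): return o[0] + o[1] >= 8   # the sum is at least 8
def e2(o): return o[0] % 2 == 0      # the first die is even
def e3(o): return o[1] > 3           # the second die shows more than 3

# Joint probability divided by the probability of the conditioning events.
ratio = prob(lambda o: e1(o) and e2(o) and e3(o)) / prob(lambda o: e2(o) and e3(o))

# Direct computation: restrict the sample space to outcomes where E2 and E3 hold.
restricted = [o for o in outcomes if e2(o) and e3(o)]
direct = Fraction(sum(1 for o in restricted if e1(o)), len(restricted))

print(ratio, direct)  # 7/9 7/9
```

Both computations give the same value because conditioning on E2 and E3 amounts to restricting the sample space to the outcomes where both occur.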

Questions 20

Digit recognition is an example of:

Options:
A.

Classification

B.

Clustering

C.

Unsupervised learning

D.

None of the above