Vce Databricks-Machine-Learning-Associate Questions Latest

Page: 3 / 5
Question 12

A machine learning engineer is trying to scale a machine learning pipeline by distributing its feature engineering process.

Which of the following feature engineering tasks will be the least efficient to distribute?

Options:

A. One-hot encoding categorical features

B. Target encoding categorical features

C. Imputing missing feature values with the mean

D. Imputing missing feature values with the true median

E. Creating binary indicator features for missing values
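
One way to compare these tasks is to ask whether each statistic can be rebuilt from small per-partition summaries. The sketch below is a minimal illustration in plain Python, not Spark code: the `partitions` list and the three function names are hypothetical stand-ins for worker-local data. It shows that a mean can be combined from (sum, count) pairs, while the median of per-partition medians generally differs from the exact median, which needs a global ordering of the values.

```python
from statistics import median

# Hypothetical partitions of one numeric feature (stand-ins for worker-local data).
partitions = [[1.0, 2.0, 3.0], [4.0, 100.0], [5.0, 6.0, 7.0, 8.0]]


def distributed_mean(parts):
    """Combine per-partition (sum, count) pairs; no raw rows need to move."""
    partials = [(sum(p), len(p)) for p in parts]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


def median_of_partition_medians(parts):
    """A tempting shortcut that is not the exact median in general."""
    return median(median(p) for p in parts)


def true_median(parts):
    """The exact median requires ordering all values together."""
    return median(v for p in parts for v in p)


print(distributed_mean(partitions))
print(median_of_partition_medians(partitions))
print(true_median(partitions))
```

On this made-up data the shortcut and the exact median disagree, which is why exact medians tend to require sorting or shuffling the full column rather than merging small partial results.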

Question 13

A data scientist has developed a linear regression model using Spark ML and computed the predictions in a Spark DataFrame preds_df with the following schema:

prediction DOUBLE

actual DOUBLE

Which of the following code blocks can be used to compute the root mean squared error (RMSE) of the model according to the data in preds_df and assign it to the rmse variable?

[Code blocks A) through D) appeared as images and are not reproduced here.]

Options:

A. Option A

B. Option B

C. Option C

D. Option D
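
For reference, RMSE is the square root of the mean of the squared differences between prediction and actual. The following is a minimal sketch of that quantity in plain Python, not a reconstruction of any of the four options: `compute_rmse` and `sample` are hypothetical names and data. The Spark ML route through `RegressionEvaluator` is shown only in comments because it assumes an active Spark session and the preds_df described in the question.

```python
import math


def compute_rmse(rows):
    """rows: iterable of (prediction, actual) pairs, mirroring the preds_df columns."""
    rows = list(rows)
    squared_errors = [(pred - actual) ** 2 for pred, actual in rows]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sample data, not taken from the exam.
sample = [(3.0, 1.0), (2.0, 2.0), (5.0, 7.0), (4.0, 4.0)]
print(compute_rmse(sample))

# Spark ML equivalent (assumes an active SparkSession and the preds_df above):
# from pyspark.ml.evaluation import RegressionEvaluator
# evaluator = RegressionEvaluator(
#     predictionCol="prediction", labelCol="actual", metricName="rmse"
# )
# rmse = evaluator.evaluate(preds_df)
```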

Question 14

A data scientist is developing a single-node machine learning model. They have a large number of model configurations to test as a part of their experiment. As a result, the model tuning process takes too long to complete.

Which of the following approaches can be used to speed up the model tuning process?

Options:

A. Implement MLflow Experiment Tracking

B. Scale up with Spark ML

C. Enable autoscaling clusters

D. Parallelize with Hyperopt
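
The general idea of running independent single-node trials concurrently can be sketched as follows. This is an illustration only: it swaps in Python's standard `ThreadPoolExecutor` as a stand-in for a cluster, and `configs` and `evaluate` are hypothetical (the loss formula is made up). With Hyperopt on Spark, the analogous setup passes a `SparkTrials` object to `fmin`, as shown in the trailing comments; the `objective` and `space` names there are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical configurations to evaluate.
configs = [
    {"max_depth": d, "learning_rate": lr}
    for d in (2, 4, 6)
    for lr in (0.01, 0.1)
]


def evaluate(config):
    """Stand-in for training one single-node model and returning its loss."""
    return abs(config["max_depth"] - 4) + abs(config["learning_rate"] - 0.1)


# Independent trials run concurrently instead of one after another.
with ThreadPoolExecutor(max_workers=4) as pool:
    losses = list(pool.map(evaluate, configs))

best_loss, best_config = min(zip(losses, configs), key=lambda pair: pair[0])
print(best_config, best_loss)

# Hyperopt equivalent (assumes hyperopt and a Spark cluster are available,
# and that `objective` and `space` are defined elsewhere):
# from hyperopt import fmin, tpe, SparkTrials
# best = fmin(fn=objective, space=space, algo=tpe.suggest,
#             max_evals=32, trials=SparkTrials(parallelism=4))
```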

Question 15

A data scientist is performing hyperparameter tuning using an iterative optimization algorithm. Each evaluation of a unique set of hyperparameter values trains a model on a single compute node. They are performing eight total evaluations across eight total compute nodes. While the accuracy of the model does vary over the eight evaluations, they notice there is no trend of improvement in the accuracy. The data scientist believes this is due to the parallelization of the tuning process.

Which change could the data scientist make to improve their model accuracy over the course of their tuning process?

Options:

A. Change the number of compute nodes to be half or less than half of the number of evaluations.

B. Change the number of compute nodes and the number of evaluations to be much larger but equal.

C. Change the iterative optimization algorithm used to facilitate the tuning process.

D. Change the number of compute nodes to be double or more than double the number of evaluations.
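
The interaction between parallelism and an iterative optimizer can be illustrated with a deliberately simplified model. The sketch below assumes trials run in synchronous batches, one trial per node, and that only trials proposed after the first batch can use earlier results. The function name `informed_trials` is hypothetical, and real tuning libraries schedule trials in more nuanced ways.

```python
def informed_trials(num_evals, parallelism):
    """Count trials proposed after at least one earlier result was available.

    Simplified model (an assumption): trials run in synchronous batches of
    size `parallelism`, and the first batch is proposed with no feedback.
    """
    first_batch = min(parallelism, num_evals)
    return num_evals - first_batch


# Eight evaluations, with a varying number of compute nodes.
for nodes in (8, 4, 2, 1):
    print(nodes, informed_trials(8, nodes))
```

Under this model, eight evaluations on eight nodes leave no trial that benefits from earlier results, while fewer nodes leave more room for the optimizer to adapt at the cost of longer wall-clock time.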

Exam Name: Databricks Certified Machine Learning Associate Exam
Last Update: Oct 17, 2024
Questions: 74