Free Databricks Databricks-Machine-Learning-Associate Practice Exam with Questions & Answers | Set: 2

Questions 11

A data scientist is using MLflow to track their machine learning experiment. As a part of each of their MLflow runs, they are performing hyperparameter tuning. The data scientist would like to have one parent run for the tuning process with a child run for each unique combination of hyperparameter values. All parent and child runs are being manually started with mlflow.start_run.

Which of the following approaches can the data scientist use to accomplish this MLflow run organization?

Options:

They can turn on Databricks Autologging

They can specify nested=True when starting the child run for each unique combination of hyperparameter values

They can start each child run inside the parent run's indented code block using mlflow.start_run()

They can start each child run with the same experiment ID as the parent run

They can specify nested=True when starting the parent run for the tuning process

Questions 12

A data scientist wants to explore summary statistics for Spark DataFrame spark_df. The data scientist wants to see the count, mean, standard deviation, minimum, maximum, and interquartile range (IQR) for each numerical feature.

Which of the following lines of code can the data scientist run to accomplish the task?

Options:

spark_df.summary()

spark_df.stats()

spark_df.describe().head()

spark_df.printSchema()

spark_df.toPandas()

Questions 13

A data scientist has developed a machine learning pipeline with a static input data set using Spark ML, but the pipeline is taking too long to process. They increase the number of workers in the cluster to get the pipeline to run more efficiently. They notice that the number of rows in the training set after reconfiguring the cluster is different from the number of rows in the training set prior to reconfiguring the cluster.

Which of the following approaches will guarantee a reproducible training and test set for each model?

Options:

Manually configure the cluster

Write out the split data sets to persistent storage

Set a seed in the data splitting operation

Manually partition the input data

Questions 14

A data scientist has created a linear regression model that uses log(price) as a label variable. Using this model, they have performed inference and the predictions and actual label values are in Spark DataFrame preds_df.

They are using the following code block to evaluate the model:

regression_evaluator.setMetricName("rmse").evaluate(preds_df)

Which of the following changes should the data scientist make to evaluate the RMSE in a way that is comparable with price?

Options:

They should exponentiate the computed RMSE value

They should take the log of the predictions before computing the RMSE

They should evaluate the MSE of the log predictions to compute the RMSE

They should exponentiate the predictions before computing the RMSE
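
As a rough illustration of the scale issue raised in this question, the sketch below uses plain Python instead of Spark. The prices, the predictions, and the helper name `rmse` are all made up for the example and are not part of the exam material. It computes the RMSE once in log units and once after converting the predictions back to price units.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical prices; the model is assumed to predict log(price).
prices = [100.0, 250.0, 400.0]
log_labels = [math.log(p) for p in prices]
log_preds = [math.log(110.0), math.log(240.0), math.log(380.0)]

# RMSE in log units: not directly comparable with price.
rmse_log = rmse(log_labels, log_preds)

# Convert predictions back to price units first, then compute the RMSE.
price_preds = [math.exp(v) for v in log_preds]
rmse_price = rmse(prices, price_preds)

print(f"RMSE on log scale:      {rmse_log:.4f}")
print(f"RMSE on price scale:    {rmse_price:.4f}")
print(f"exp(RMSE on log scale): {math.exp(rmse_log):.4f}")
```

With these made-up numbers, exponentiating the log-scale RMSE gives a value close to 1, while the RMSE computed on price-scale values is in the same units as price.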

Questions 15

A machine learning engineer is trying to scale a machine learning pipeline by distributing its feature engineering process.

Which of the following feature engineering tasks will be the least efficient to distribute?

Options:

One-hot encoding categorical features

Target encoding categorical features

Imputing missing feature values with the mean

Imputing missing feature values with the true median

Creating binary indicator features for missing values
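
To make the distribution trade-off in this question concrete, here is a small sketch in plain Python, with nested lists standing in for data partitions. The values and variable names are hypothetical. It shows that a mean can be assembled exactly from small per-partition summaries, whereas per-partition medians do not combine into the true median.

```python
from statistics import median

# Hypothetical feature values spread across three "partitions".
partitions = [[1.0, 2.0, 3.0], [4.0, 100.0], [5.0, 6.0, 7.0, 8.0]]

# Mean: each partition reports (sum, count), and the partials combine exactly.
partials = [(sum(part), len(part)) for part in partitions]
total, count = map(sum, zip(*partials))
distributed_mean = total / count

# True median: the values have to be brought together and ordered.
all_values = [v for part in partitions for v in part]
true_median = median(all_values)

# Combining per-partition medians is not equivalent.
median_of_medians = median(median(part) for part in partitions)

print("mean from partials:", distributed_mean)
print("true median:", true_median)
print("median of partition medians:", median_of_medians)
```

In a real cluster the gathering and ordering step implies moving data between workers, which is why exact medians tend to be costly and approximate quantiles are often used instead.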

Questions 16

A data scientist is using the following code block to tune hyperparameters for a machine learning model:

[Image: Databricks-Machine-Learning-Associate Question 16]

Which change can they make to the above code block to improve the likelihood of a more accurate model?

Options:

Increase num_evals to 100

Change fmin() to fmax()

Change sparkTrials() to Trials()

Change tpe.suggest to random.suggest

Questions 17

A data scientist is utilizing MLflow Autologging to automatically track their machine learning experiments. After completing a series of runs for the experiment experiment_id, the data scientist wants to identify the run_id of the run with the best root-mean-square error (RMSE).

Which of the following lines of code can be used to identify the run_id of the run with the best RMSE in experiment_id?

[Image: Databricks-Machine-Learning-Associate Question 17]

Options:

Option A

Option B

Option C

Option D

Questions 18

A machine learning engineer wants to parallelize the training of group-specific models using the Pandas Function API. They have developed the train_model function, and they want to apply it to each group of DataFrame df.

They have written the following incomplete code block:

[Image: Databricks-Machine-Learning-Associate Question 18]

Which of the following pieces of code can be used to fill in the above blank to complete the task?

Options:

applyInPandas

mapInPandas

predict

train_model

groupedApplyIn

Questions 19

A data scientist has replaced missing values in their feature set with each respective feature variable’s median value. A colleague suggests that the data scientist is throwing away valuable information by doing this.

Which of the following approaches can they take to include as much information as possible in the feature set?

Options:

Impute the missing values using each respective feature variable's mean value instead of the median value

Refrain from imputing the missing values in favor of letting the machine learning algorithm determine how to handle them

Remove all feature variables that originally contained missing values from the feature set

Create a binary feature variable for each feature that contained missing values indicating whether each row's value has been imputed

Create a constant feature variable for each feature that contained missing values indicating the percentage of rows from the feature that was originally missing
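
The idea of recording which values were imputed can be sketched in a few lines of plain Python. The column values and names below are hypothetical, and `None` is assumed to mark a missing entry. A real pipeline would more likely do this with Spark or pandas.

```python
from statistics import median

# Hypothetical feature column; None marks a missing value.
ages = [34.0, None, 29.0, 41.0, None]

# Median of the observed (non-missing) values only.
observed = [v for v in ages if v is not None]
fill_value = median(observed)

# Impute with the median, and keep a 0/1 flag recording which rows were filled.
ages_imputed = [fill_value if v is None else v for v in ages]
ages_was_missing = [1 if v is None else 0 for v in ages]

print(ages_imputed)
print(ages_was_missing)
```

The flag column preserves the fact that a value was originally missing, which the imputed column alone no longer shows.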

Questions 20

A data scientist has been given an incomplete notebook from the data engineering team. The notebook uses a Spark DataFrame spark_df on which the data scientist needs to perform further feature engineering. Unfortunately, the data scientist has not yet learned the PySpark DataFrame API.

Which of the following blocks of code can the data scientist run to be able to use the pandas API on Spark?

Options:

import pyspark.pandas as ps
df = ps.DataFrame(spark_df)

import pyspark.pandas as ps
df = ps.to_pandas(spark_df)

spark_df.to_pandas()

import pandas as pd
df = pd.DataFrame(spark_df)

Exam Code: Databricks-Machine-Learning-Associate

Certification Provider: Databricks

Exam Name: Databricks Certified Machine Learning Associate Exam

Last Update: Oct 30, 2025

Questions: 74
