Free Databricks Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Practice Exam with Questions & Answers | Set: 2

Question 11

You have two DataFrames to join:

DataFrame A: 128 GB of transactions

DataFrame B: 1 GB user lookup table

Which strategy is correct for broadcasting?

Options:
A.

DataFrame B should be broadcasted because it is smaller and will eliminate the need for shuffling itself

B.

DataFrame B should be broadcasted because it is smaller and will eliminate the need for shuffling DataFrame A

C.

DataFrame A should be broadcasted because it is larger and will eliminate the need for shuffling DataFrame B

D.

DataFrame A should be broadcasted because it is smaller and will eliminate the need for shuffling itself

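For reference, a minimal PySpark sketch of this scenario (the file paths and the user_id join key are illustrative assumptions, not part of the question):

from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("broadcast-join-sketch").getOrCreate()

# Stand-ins for the 128 GB transactions DataFrame (A) and the 1 GB user
# lookup table (B); the paths are illustrative only.
transactions_df = spark.read.parquet("/data/transactions")
users_df = spark.read.parquet("/data/users")

# Broadcasting the small lookup table copies it to every executor, so the
# large transactions DataFrame can be joined in place instead of being
# shuffled across the network.
joined = transactions_df.join(broadcast(users_df), on="user_id", how="inner")
joined.explain()  # the physical plan should show a BroadcastHashJoin
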
Question 12

What is the risk associated with converting a large Pandas API on Spark DataFrame back to a pandas DataFrame?

Options:
A.

The conversion will automatically distribute the data across worker nodes

B.

The operation will fail if the Pandas DataFrame exceeds 1000 rows

C.

Data will be lost during conversion

D.

The operation will load all data into the driver's memory, potentially causing memory overflow

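A short sketch of the conversion in question, assuming an illustrative pandas-on-Spark DataFrame (the path and column names are made up):

import pyspark.pandas as ps

# Illustrative pandas-on-Spark DataFrame.
psdf = ps.read_parquet("/data/large_table")

# to_pandas() collects every row onto the driver as a plain pandas DataFrame,
# so a sufficiently large dataset can overflow driver memory.
pdf = psdf.to_pandas()

# Safer pattern: shrink the data (aggregate, filter, or sample) before converting.
summary_pdf = psdf.groupby("country").sum().to_pandas()
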
Question 13

A developer is running Spark SQL queries and notices underutilization of resources. Executors are idle, and the number of tasks per stage is low.

What should the developer do to improve cluster utilization?

Options:
A.

Increase the value of spark.sql.shuffle.partitions

B.

Reduce the value of spark.sql.shuffle.partitions

C.

Increase the size of the dataset to create more partitions

D.

Enable dynamic resource allocation to scale resources as needed

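A sketch of the two tuning knobs the options refer to (the partition value 800 is only an example, not a recommendation):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# spark.sql.shuffle.partitions controls how many tasks each shuffle stage
# produces (default 200). Raising it creates more, smaller tasks so idle
# executor cores receive work.
spark.conf.set("spark.sql.shuffle.partitions", "800")

# Dynamic allocation is the other lever: it lets Spark add or remove executors
# to match the workload, and is normally enabled at submit time, e.g.
#   --conf spark.dynamicAllocation.enabled=true
#   --conf spark.shuffle.service.enabled=true
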
Question 14

Given a DataFrame df that has 10 partitions, after running the code:

result = df.coalesce(20)

How many partitions will the result DataFrame have?

Options:
A.

10

B.

Same number as the cluster executors

C.

1

D.

20

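A quick sketch that can be run to confirm the coalesce() behaviour (the DataFrame here is synthetic):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.range(1_000_000).repartition(10)
print(df.rdd.getNumPartitions())        # 10

# coalesce() can only merge partitions, never split them, so requesting 20
# from a 10-partition DataFrame leaves the count unchanged.
result = df.coalesce(20)
print(result.rdd.getNumPartitions())    # still 10

# Increasing the partition count requires a full shuffle via repartition().
print(df.repartition(20).rdd.getNumPartitions())   # 20
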
Question 15

Which UDF implementation calculates the length of strings in a Spark DataFrame?

Options:
A.

df.withColumn("length", spark.udf("len", StringType()))

B.

df.select(length(col("stringColumn")).alias("length"))

C.

spark.udf.register("stringLength", lambda s: len(s))

D.

df.withColumn("length", udf(lambda s: len(s), StringType()))

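For comparison, a small sketch showing both the built-in length() function and an equivalent Python UDF (the sample data is made up):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, length, udf
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("spark",), ("databricks",)], ["stringColumn"])

# Built-in column function: no Python UDF overhead, usually the better choice.
df.select(length(col("stringColumn")).alias("length")).show()

# Equivalent Python UDF: len() returns an int, so IntegerType is the matching
# return type, and the UDF object is applied to the column like a function.
string_length = udf(lambda s: len(s) if s is not None else None, IntegerType())
df.withColumn("length", string_length(col("stringColumn"))).show()
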
Question 16

A developer is building a Spark application to monitor task performance across a cluster.

One requirement is to track the maximum processing time for tasks on each worker node and consolidate this information on the driver for further analysis.

Which technique should the developer use?

Options:
A.

Broadcast a variable to share the maximum time among workers.

B.

Configure the Spark UI to automatically collect maximum times.

C.

Use an RDD action like reduce() to compute the maximum time.

D.

Use an accumulator to record the maximum time on the driver.

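A minimal sketch of computing maxima on the executors and consolidating them on the driver (the worker names and timings are invented):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# Hypothetical (worker, task_time_ms) records.
timings = sc.parallelize([("w1", 120), ("w2", 340), ("w1", 95), ("w3", 210)])

# reduce() is an action: each executor computes a partial maximum and the
# final maximum is returned to the driver.
overall_max = timings.map(lambda t: t[1]).reduce(max)

# Per-worker maxima, also consolidated on the driver by an action.
per_worker_max = timings.reduceByKey(max).collect()

print(overall_max, per_worker_max)
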
Question 17

An engineer has two DataFrames — df1 (small) and df2 (large). To optimize the join, the engineer uses a broadcast join:

from pyspark.sql.functions import broadcast

df_result = df2.join(broadcast(df1), on="id", how="inner")

What is the purpose of using broadcast() in this scenario?

Options:
A.

It increases the partition size for df1 and df2.

B.

It ensures that the join happens only when the id values are identical.

C.

It reduces the number of shuffle operations by replicating the smaller DataFrame to all nodes.

D.

It filters the id values before performing the join.

Question 18

A data scientist is working on a project that requires processing large amounts of structured data, performing SQL queries, and applying machine learning algorithms. The data scientist is considering using Apache Spark for this task.

Which combination of Apache Spark modules should the data scientist use in this scenario?

Options:
A.

Spark DataFrames, Structured Streaming, and GraphX

B.

Spark SQL, Pandas API on Spark, and Structured Streaming

C.

Spark Streaming, GraphX, and Pandas API on Spark

D.

Spark DataFrames, Spark SQL, and MLlib

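A compact sketch combining DataFrames, Spark SQL, and MLlib in one pipeline (the dataset path and column names are assumptions):

from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = SparkSession.builder.getOrCreate()

# DataFrames + Spark SQL for the structured data and the queries.
sales = spark.read.parquet("/data/sales")
sales.createOrReplaceTempView("sales")
training = spark.sql("SELECT price, quantity, revenue FROM sales WHERE revenue IS NOT NULL")

# MLlib for the machine learning step.
assembler = VectorAssembler(inputCols=["price", "quantity"], outputCol="features")
model = LinearRegression(featuresCol="features", labelCol="revenue").fit(assembler.transform(training))
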
Question 19

The following code fragment results in an error:

@F.udf(T.IntegerType())
def simple_udf(t: str) -> str:
    return answer * 3.14159

Which code fragment should be used instead?

Options:
A.

@F.udf(T.IntegerType())
def simple_udf(t: int) -> int:
    return t * 3.14159

B.

@F.udf(T.DoubleType())
def simple_udf(t: float) -> float:
    return t * 3.14159

C.

@F.udf(T.DoubleType())
def simple_udf(t: int) -> int:
    return t * 3.14159

D.

@F.udf(T.IntegerType())
def simple_udf(t: float) -> float:
    return t * 3.14159

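A runnable sketch of the DoubleType/float variant, which avoids the type mismatch in the original fragment (the sample DataFrame is illustrative):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql import types as T

spark = SparkSession.builder.getOrCreate()

# Multiplying by 3.14159 always yields a float, so the declared Spark return
# type has to be DoubleType; an IntegerType UDF would come back as null.
@F.udf(T.DoubleType())
def simple_udf(t: float) -> float:
    return t * 3.14159

df = spark.createDataFrame([(1.0,), (2.0,)], ["t"])
df.withColumn("scaled", simple_udf(F.col("t"))).show()
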
Question 20

A Spark application is experiencing performance issues in client mode due to the driver being resource-constrained.

How should this issue be resolved?

Options:
A.

Switch the deployment mode to cluster mode.

B.

Add more executor instances to the cluster.

C.

Increase the driver memory on the client machine.

D.

Switch the deployment mode to local mode.
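
A brief note on deploy modes; the spark-submit flags below are shown only as illustration:

from pyspark.sql import SparkSession

# In client mode the driver runs on the submitting machine; if that machine is
# resource-constrained, running the driver inside the cluster removes the limit.
# Illustrative spark-submit invocations (flags for comparison only):
#   spark-submit --deploy-mode client  --driver-memory 4g app.py
#   spark-submit --deploy-mode cluster --driver-memory 8g app.py

spark = SparkSession.builder.getOrCreate()
print(spark.conf.get("spark.submit.deployMode", "client"))  # check the active mode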