Free Databricks Databricks-Certified-Data-Engineer-Associate Practice Exam with Questions & Answers | Set: 3

Question 21

In which of the following scenarios should a data engineer select a Task in the Depends On field of a new Databricks Job Task?

Options:
A.

When another task needs to be replaced by the new task

B.

When another task needs to fail before the new task begins

C.

When another task has the same dependency libraries as the new task

D.

When another task needs to use as little compute resources as possible

E.

When another task needs to successfully complete before the new task begins
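As background for this question, the following is a toy sketch of how a dependency between tasks constrains execution order. It is not the Databricks Jobs scheduler: `run_job`, the task names, and the status labels are hypothetical, and the model simply assumes that a task starts only after every task it depends on has succeeded.

```python
from graphlib import TopologicalSorter


def run_job(tasks, depends_on):
    """Run tasks in dependency order (toy model, not the Databricks scheduler).

    tasks: dict mapping task name -> zero-argument callable.
    depends_on: dict mapping task name -> set of upstream task names.
    A task runs only if all of its upstream tasks succeeded.
    """
    status = {}
    for name in TopologicalSorter(depends_on).static_order():
        upstream = depends_on.get(name, set())
        if all(status.get(u) == "succeeded" for u in upstream):
            try:
                tasks[name]()
                status[name] = "succeeded"
            except Exception:
                status[name] = "failed"
        else:
            # Hypothetical label for a task whose upstream did not succeed.
            status[name] = "skipped"
    return status


def ok():
    pass


def boom():
    raise RuntimeError("task failed")


# Hypothetical two-task job: "transform" depends on "ingest".
depends_on = {"ingest": set(), "transform": {"ingest"}}
print(run_job({"ingest": ok, "transform": ok}, depends_on))
print(run_job({"ingest": boom, "transform": ok}, depends_on))
```

In the second call the upstream task raises, so the downstream task never starts.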

Question 22

A data engineer runs a statement every day to copy the previous day’s sales into the table transactions. Each day’s sales are in their own file in the location "/transactions/raw".

Today, the data engineer runs the following command to complete this task:

[Image: command for Question 22, not reproduced here]

After running the command today, the data engineer notices that the number of records in table transactions has not changed.

Which of the following describes why the statement might not have copied any new records into the table?

Options:
A.

The format of the files to be copied was not included with the FORMAT_OPTIONS keyword.

B.

The names of the files to be copied were not included with the FILES keyword.

C.

The previous day’s file has already been copied into the table.

D.

The PARQUET file format does not support COPY INTO.

E.

The COPY INTO statement requires the table to be refreshed to view the copied rows.
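As background, COPY INTO is documented as an idempotent operation: files that have already been loaded from the source location are skipped on later runs. The sketch below is a toy model of that behavior only, not how Databricks implements it; `copy_into`, the in-memory table, and the file name are hypothetical.

```python
def copy_into(table, loaded_files, source_files):
    """Append rows only from files that have not been loaded before.

    table: list of rows (stands in for the target table).
    loaded_files: set of file names already ingested.
    source_files: dict mapping file name -> list of rows.
    Returns the number of rows added by this call.
    """
    added = 0
    for name in sorted(source_files):
        if name in loaded_files:
            continue  # already ingested, so a re-run adds nothing
        table.extend(source_files[name])
        loaded_files.add(name)
        added += len(source_files[name])
    return added


# Hypothetical state: one daily file in the source location.
transactions, loaded = [], set()
files = {"2024-01-01.parquet": [("a", 10), ("b", 20)]}

first = copy_into(transactions, loaded, files)   # loads the file
second = copy_into(transactions, loaded, files)  # same file again
print(first, second, len(transactions))
```

Running the same load twice against an unchanged source leaves the row count where it was after the first run.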

Question 23

A data engineer needs to determine whether to use the built-in Databricks Notebooks versioning or version their project using Databricks Repos.

Which of the following is an advantage of using Databricks Repos over the Databricks Notebooks versioning?

Options:
A.

Databricks Repos automatically saves development progress

B.

Databricks Repos supports the use of multiple branches

C.

Databricks Repos allows users to revert to previous versions of a notebook

D.

Databricks Repos provides the ability to comment on specific changes

E.

Databricks Repos is wholly housed within the Databricks Lakehouse Platform

Question 24

A data engineer needs to apply custom logic to string column city in table stores for a specific use case. In order to apply this custom logic at scale, the data engineer wants to create a SQL user-defined function (UDF).

Which of the following code blocks creates this SQL UDF?

Options:
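A.

[Image: code block for option A, not reproduced here]

B.

[Image: code block for option B, not reproduced here]

C.

[Image: code block for option C, not reproduced here]

D.

[Image: code block for option D, not reproduced here]

E.

[Image: code block for option E, not reproduced here]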

Question 25

A data engineer who is new to Python needs to create a Python function that adds two integers together and returns the sum.

Which of the following code blocks can the data engineer use to complete this task?

A)

[Image: code block A, not reproduced here]

B)

[Image: code block B, not reproduced here]

C)

[Image: code block C, not reproduced here]

D)

[Image: code block D, not reproduced here]

E)

[Image: code block E, not reproduced here]

Options:
A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E
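For reference, a minimal sketch of a Python function of the kind the question describes. It is not taken from the answer images; the name `add_integers` and the parameter names are hypothetical. The parts that matter are the `def` keyword, the parameter list, and a `return` statement (rather than `print`) so the caller receives the sum.

```python
def add_integers(x: int, y: int) -> int:
    """Return the sum of two integers."""
    return x + y


total = add_integers(2, 3)
print(total)
```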

Question 26

Which of the following can be used to simplify and unify siloed data architectures that are specialized for specific use cases?

Options:
A.

None of these

B.

Data lake

C.

Data warehouse

D.

All of these

E.

Data lakehouse

Question 27

In which of the following file formats is data from Delta Lake tables primarily stored?

Options:
A.

Delta

B.

CSV

C.

Parquet

D.

JSON

E.

A proprietary, optimized format specific to Databricks

Question 28

Which of the following data workloads will utilize a Gold table as its source?

Options:
A.

A job that enriches data by parsing its timestamps into a human-readable format

B.

A job that aggregates uncleaned data to create standard summary statistics

C.

A job that cleans data by removing malformatted records

D.

A job that queries aggregated data designed to feed into a dashboard

E.

A job that ingests raw data from a streaming source into the Lakehouse

Question 29

A data engineer needs to create a table in Databricks using data from their organization’s existing SQLite database.

They run the following command:

[Image: command for Question 29, with a blank to fill in, not reproduced here]

Which of the following lines of code fills in the above blank to successfully complete the task?

Options:
A.

org.apache.spark.sql.jdbc

B.

autoloader

C.

DELTA

D.

sqlite

E.

org.apache.spark.sql.sqlite

Question 30

A data engineer needs to apply custom logic to identify employees with more than 5 years of experience in array column employees in table stores. The custom logic should create a new column exp_employees that is an array of all of the employees with more than 5 years of experience for each row. In order to apply this custom logic at scale, the data engineer wants to use the FILTER higher-order function.

Which of the following code blocks successfully completes this task?

[Image: code blocks for options A through E, not reproduced here]

Options:
A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E
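As background, a higher-order function such as FILTER takes an array and a lambda predicate and returns the elements for which the predicate is true; in Spark SQL the general shape is `FILTER(array, x -> condition)`. The sketch below is a plain-Python analogue of that per-row behavior, not Spark code and not one of the answer options. The field name `years_exp` and the helper `with_exp_employees` are assumptions, since the source does not show the structure of the elements in `employees`.

```python
def with_exp_employees(rows, min_years=5):
    """Return copies of rows with an added exp_employees list.

    Each row is a dict with an "employees" list; each employee is a dict
    with a hypothetical "years_exp" field. Only employees with more than
    min_years of experience are kept, mirroring a per-row array filter.
    """
    return [
        {
            **row,
            "exp_employees": [
                e for e in row["employees"] if e["years_exp"] > min_years
            ],
        }
        for row in rows
    ]


# Hypothetical sample data standing in for the stores table.
stores = [
    {"store": "s1", "employees": [{"name": "Ana", "years_exp": 7},
                                  {"name": "Bo", "years_exp": 2}]},
    {"store": "s2", "employees": [{"name": "Cy", "years_exp": 5}]},
]

result = with_exp_employees(stores)
print([[e["name"] for e in r["exp_employees"]] for r in result])
```

The original `employees` array is left intact in each row, and the filtered array is added alongside it as a new column.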