Pre-Summer Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: 70track

Free Databricks Databricks-Certified-Data-Engineer-Associate Practice Exam with Questions & Answers | Set: 5

Questions 41

A data engineer needs to apply custom logic to string column city in table stores for a specific use case. In order to apply this custom logic at scale, the data engineer wants to create a SQL user-defined function (UDF).

Which of the following code blocks creates this SQL UDF?

Options:
A.

B.

C.

D.

E.

Databricks Databricks-Certified-Data-Engineer-Associate Premium Access
Questions 42

A data engineer is developing an ETL process based on Spark SQL. The execution fails. The data engineer checks the Spark Ul and can see the ERRORS as follows:

Databricks-Certified-Data-Engineer-Associate Question 42

Which two corrective actions should the data engineer perform to resolve this issue?

Choose 2 answers - (Q) Narrow the filters in order to collect less data in the query

Options:
A.

Upsize the worker nodes and activate autoshuffle partitions

B.

Upsize the driver node and deactivate autoshuffle partitions

C.

Cache the dataset in order to boost the query performance

D.

Fix the shuffle partitions to 50 to ensure the allocation

Questions 43

Which of the following describes the type of workloads that are always compatible with Auto Loader?

Options:
A.

Dashboard workloads

B.

Streaming workloads

C.

Machine learning workloads

D.

Serverless workloads

E.

Batch workloads

Questions 44

A data engineer manages multiple external tables linked to various data sources. The data engineer wants to manage these external tables efficiently and ensure that only the necessary permissions are granted to users for accessing specific external tables.

How should the data engineer manage access to these external tables?

Options:
A.

Create a single user role with full access to all external tables and assign it to all users.

B.

Use Unity Catalog to manage access controls and permissions for each external table individually.

C.

Set up Azure Blob Storage permissions at the container level, allowing access to all external tables.

D.

Grant permissions on the Databricks workspace level, which will automatically apply to all external tables.

Questions 45

A data engineer is managing a data pipeline in Databricks, where multiple Delta tables are used for various transformations. The team wants to track how data flows through the pipeline, including identifying dependencies between Delta tables, notebooks, jobs, and dashboards. The data engineer is utilizing the Unity Catalog lineage feature to monitor this process.

How does Unity Catalog’s data lineage feature support the visualization of relationships between Delta tables, notebooks, jobs, and dashboards?

Options:
A.

Unity Catalog lineage visualizes dependencies between Delta tables, notebooks, and jobs, but does not provide column-level tracing or relationships with dashboards.

B.

Unity Catalog lineage only supports visualizing relationships at the table level and does not extend to notebooks, jobs, or dashboards.

C.

Unity Catalog lineage provides an interactive graph that tracks dependencies between tables and notebooks but excludes any job-related dependencies or dashboard visualizations.

D.

Unity Catalog provides an interactive graph that visualizes the dependencies between Delta tables, notebooks, jobs, and dashboards, while also supporting column-level tracking of data transformations.

Questions 46

Which SQL code snippet will correctly demonstrate a Data Definition Language (DDL) operation used to create a table?

Options:
A.

DROP TABLE employees;

B.

INSERT INTO employees (id, name) VALUES (1, ' Alice ' );

C.

CRFATF tabif employees ( id INT, name suing

D.

ALTFR TABIF employees add column salary DECTMA(10,2);

Questions 47

A data engineer needs to determine whether to use the built-in Databricks Notebooks versioning or version their project using Databricks Repos.

Which of the following is an advantage of using Databricks Repos over the Databricks Notebooks versioning?

Options:
A.

Databricks Repos automatically saves development progress

B.

Databricks Repos supports the use of multiple branches

C.

Databricks Repos allows users to revert to previous versions of a notebook

D.

Databricks Repos provides the ability to comment on specific changes

E.

Databricks Repos is wholly housed within the Databricks Lakehouse Platform

Questions 48

A Python file is ready to go into production and the client wants to use the cheapest but most efficient type of cluster possible. The workload is quite small, only processing 10GBs of data with only simple joins and no complex aggregations or wide transformations.

Which cluster meets the requirement?

Options:
A.

Job cluster with Photon enabled

B.

Interactive cluster

C.

Job cluster with spot instances disabled

D.

Job cluster with spot instances enabled

Questions 49

Which of the following commands can be used to write data into a Delta table while avoiding the writing of duplicate records?

Options:
A.

DROP

B.

IGNORE

C.

MERGE

D.

APPEND

E.

INSERT

Questions 50

Which two components function in the DB platform architecture’s control plane? (Choose two.)

Options:
A.

Virtual Machines

B.

Compute Orchestration

C.

Serverless Compute

D.

Compute

E.

Unity Catalog