Free NVIDIA NCP-AII Practice Exam with Questions & Answers | Set: 2

Question 11

After upgrading to HPL-AI 2.0 on a DGX A100 cluster, a 2x performance gain is observed. Which optimization is primarily responsible for this improvement?

Options:
A. Reduction of problem size (N) to accelerate computation.
B. MPI-aware GPU communication that reduces CPU bottlenecks and GPU idle time.
C. Doubling of GPU clock speeds through firmware updates and relevant configuration.
D. Automatic NVLink bandwidth doubling via driver updates.

Question 12

A system administrator has upgraded the firmware of the DPU. What will be the state of the firmware after the upgrade?

Options:
A. The firmware is installed on the DPU.
B. The firmware is deleted from the DPU.
C. The firmware is copied to the DPU but not installed.
D. The firmware is waiting on reboot to become active.

Question 13

A systems engineer is updating firmware across a large DGX cluster using automation. What is the best practice for minimizing risk and ensuring cluster health during and after the process?

Options:
A. Drain nodes from the scheduler, run pre-update diagnostics, update firmware in batches, and verify health post-update before scaling to the next batch.
B. To save time, simultaneously update all nodes in the cluster without draining or diagnostics.
C. Update only nodes that have reported faults, leaving others on older firmware.
D. Drain nodes from the scheduler, update firmware in batches, skip diagnostics, and verify health post-update before scaling to the next batch.
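
The batch-wise flow described in the options (drain, pre-update diagnostics, update, post-update health check, then the next batch) can be sketched in Python as below. This is a minimal illustration, not an NVIDIA tool: `rolling_update` and the five callables passed to it (`drain`, `precheck`, `update`, `healthy`, `resume`) are hypothetical placeholders for whatever scheduler and firmware tooling a site actually uses.

```python
from typing import Callable, Iterator, List, Sequence


def batches(nodes: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` nodes."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(nodes), size):
        yield list(nodes[start:start + size])


def rolling_update(
    nodes: Sequence[str],
    batch_size: int,
    drain: Callable[[str], None],     # hypothetical: remove node from the scheduler
    precheck: Callable[[str], bool],  # hypothetical: pre-update diagnostics
    update: Callable[[str], None],    # hypothetical: apply the firmware update
    healthy: Callable[[str], bool],   # hypothetical: post-update health check
    resume: Callable[[str], None],    # hypothetical: return node to the scheduler
) -> List[str]:
    """Update nodes batch by batch, stopping at the first failed check."""
    updated: List[str] = []
    for batch in batches(nodes, batch_size):
        for node in batch:
            drain(node)
        failed = [node for node in batch if not precheck(node)]
        if failed:
            raise RuntimeError(f"pre-update diagnostics failed: {failed}")
        for node in batch:
            update(node)
        unhealthy = [node for node in batch if not healthy(node)]
        if unhealthy:
            # Halt here so later batches stay on known-good firmware.
            raise RuntimeError(f"post-update health check failed: {unhealthy}")
        for node in batch:
            resume(node)
        updated.extend(batch)
    return updated
```

Because the loop raises as soon as a batch fails either check, a bad firmware image affects at most one batch while the remaining nodes are left untouched.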

Question 14

After a firmware upgrade on a DGX H100, the administrator notices that one GPU is not detected by the system. Which troubleshooting step should be performed first to identify the root cause?

Options:
A. Review firmware update logs and run nvsm show health to check for hardware or firmware errors on the affected GPU.
B. Remove the GPU from the system and replace it with a new one before any diagnostics.
C. Ignore the issue and proceed with production workloads if the other GPUs are operational.
D. Immediately re-run the firmware upgrade on all system components.

Question 15

A system administrator needs to validate a GPU-based server and ensure that no errors occur under load. What command should be used?

Options:
A. nvsm dump health
B. stress-test --usage
C. nvsm show health
D. nvsm stress-test

Question 16

A customer has just completed the first boot of their DGX system and is prompted to create an administrative user. What is the correct approach for setting up this user to ensure secure BMC and GRUB access?

Options:
A. Create a unique, strong, lower-case username and password that will be used for both BMC and GRUB access, avoiding default or weak credentials.
B. Create separate usernames for BMC and GRUB to maximize flexibility.
C. Skip the creation of a new user and retain the default admin account for BMC and GRUB access.
D. Use “sysadmin” as the username and a simple password for ease of management.

Question 17

An engineer must ensure that a BlueField-3 NIC firmware download matches the cluster’s PSID. Which step is critical before installation?

Options:
A. Check that the DPU’s BMC IP is reachable by ping.
B. Confirm that the firmware file size matches the DPU’s flash capacity.
C. Use mstflint -d <PCI_ID> query to validate the device PSID before selecting the firmware image.
D. Verify that the SHA256 hash of the firmware matches NVIDIA’s public ledger.
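
As an illustration of the PSID check mentioned in the options, the Python sketch below extracts a PSID from the text printed by a query command and compares it with the PSID expected for a firmware image. It assumes the output contains a line beginning with `PSID:`; the helper names `parse_psid` and `psid_matches` and the sample values are hypothetical placeholders, not real device data.

```python
import re
from typing import Optional


def parse_psid(query_output: str) -> Optional[str]:
    """Return the value of the first 'PSID:' line, or None if absent.

    Assumes the query output has one field per line in 'Label: value' form.
    """
    match = re.search(r"^PSID:[ \t]*(\S+)", query_output, flags=re.MULTILINE)
    return match.group(1) if match else None


def psid_matches(query_output: str, image_psid: str) -> bool:
    """True only when the device reports a PSID equal to the image's PSID."""
    device_psid = parse_psid(query_output)
    return device_psid is not None and device_psid == image_psid


# Placeholder text standing in for captured query output (not real data).
SAMPLE_OUTPUT = """\
Image type:            EXAMPLE
PSID:                  EXAMPLE_PSID_0001
"""

if __name__ == "__main__":
    print(parse_psid(SAMPLE_OUTPUT))
    print(psid_matches(SAMPLE_OUTPUT, "EXAMPLE_PSID_0001"))
```

In practice the text would come from running the query against the device; the comparison is kept separate from the command invocation so it can be checked without hardware.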

Question 18

Two ports must be connected, but one is SFP and the other is QSFP: for example, a 25 GbE host channel adapter must connect to a QSFP port that supports both 100 GbE and 25 GbE. Which of the following solutions best meets this requirement?

Options:
A. SFP Connectors
B. SFP to 1G BASE-T (RJ45) adapter
C. QSA Adapter

Question 19

During East-West fabric validation on a 64-GPU cluster, an engineer runs all_reduce_perf and observes an algorithm bandwidth of 350 GB/s and bus bandwidth of 656 GB/s. What does this indicate about the fabric performance?

Options:
A. Inconclusive; rerun with point-to-point tests.
B. Optimal performance; bus bandwidth near theoretical peak for NDR InfiniBand.
C. Critical failure; bus bandwidth exceeds hardware capabilities.
D. Suboptimal performance; algorithm bandwidth should match bus bandwidth.
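
For background on the two figures in this question: the NCCL tests documentation relates all-reduce bus bandwidth to algorithm bandwidth by the factor 2(n-1)/n, where n is the number of ranks. The Python sketch below applies that factor; the function name `allreduce_bus_bandwidth` is hypothetical and the rank counts are only illustrative.

```python
def allreduce_bus_bandwidth(alg_bw: float, n_ranks: int) -> float:
    """Convert all-reduce algorithm bandwidth to bus bandwidth.

    Uses busbw = algbw * 2 * (n - 1) / n, the all-reduce correction
    factor described in the NCCL tests documentation.
    """
    if n_ranks < 2:
        raise ValueError("all-reduce needs at least 2 ranks")
    return alg_bw * 2 * (n_ranks - 1) / n_ranks


if __name__ == "__main__":
    # Same algorithm bandwidth (GB/s), two illustrative rank counts.
    print(allreduce_bus_bandwidth(350.0, 64))  # 689.0625
    print(allreduce_bus_bandwidth(350.0, 16))  # 656.25
```

The factor approaches 2 as the rank count grows, so the bus bandwidth reported for an all-reduce is expected to be higher than the algorithm bandwidth rather than equal to it.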

Question 20

A system administrator needs to change the scheduling behavior of a single GPU to use a fixed share scheduler. What command achieves this?

Options:
A. esxcli system module parameters set -m nvidia -p
B. esxcli -i 0 -mig 18
C. nvidia-smi -i 0 -mig 1
D. mlxconfig -d /dev/mst/mt4123_pciconf0 set LINK_TYPE_P1=2

Exam Code: NCP-AII
Certification Provider: NVIDIA
Exam Name: NVIDIA AI Infrastructure
Last Update: Jun 6, 2026
Questions: 123