
Free NVIDIA NCP-AII Practice Exam with Questions & Answers

Question 1

What information does the 'ibnodes' command display?

Options:
A. All hosts & switches
B. All host & server names
C. All server names
D. All channel adapters

Question 2

During a 48-hour NeMo question-answering model burn-in test, GPU memory errors occur when processing large datasets. Which configuration strategy prevents Out-of-Memory (OOM) errors while maintaining processing efficiency?

Options:
A. Set blocksize="1GB" for data loading and enable RMM asynchronous allocation.
B. Switch from FP16 to FP32 precision for numerical stability.
C. Disable add_filename for Parquet files to reduce metadata.
D. Increase files_per_partition to 1000 for larger batch processing.

Question 3

After configuring HA, the administrator runs cmsh status and notices the secondary head node reports mysql [FAIL]. What is the most likely cause?

Options:
A. The BCM license expired after HA configuration.
B. Network connectivity issues between the primary and secondary head nodes.
C. The secondary head node lacks NVIDIA GPU drivers.
D. The cluster nodes are powered on during the HA configuration.

Question 4

A system administrator is installing a GPU into a server and needs to avoid damaging the device. What item should be used?

Options:
A. Anti-ESD strap
B. Gloves
C. Protective film
D. Electric screwdriver

Question 5

After ClusterKit reports "GPU-Host latency exceeds threshold," which NVIDIA diagnostic tool should be used to isolate hardware faults?

Options:
A. Re-run ClusterKit with --stress=gpu -Y 60 to extend test duration
B. nvidia-smi topo -m to inspect GPU topology connections
C. DCGM diagnostics (dcgmi diag -r 2)
D. ib_write_bw to measure InfiniBand bandwidth between nodes

Question 6

After updating BlueField-3 DPU BMC firmware via Redfish, the engineer observes “TaskState: Running” but no progress after 15 minutes. How should they track the update’s completion status?

Options:
A. Check /var/log/messages on the DPU operating system for update logs.
B. Query the DPU BMC with the Task ID of the installation process.
C. Power cycle the DPU immediately to force a rollback.
D. Run bfrec --status on the DPU to view flash progress.
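
For readers unfamiliar with the "TaskState" field quoted in this question, the following is a minimal sketch of how a Redfish Task resource might be read. It assumes the standard DMTF Redfish task path (/redfish/v1/TaskService/Tasks/<id>); the sample payload, its values, and the helper names task_uri, is_finished, and summarize are hypothetical and not taken from any BlueField documentation. No network call is made; a real check would fetch the payload from the BMC over HTTPS.

```python
import json

# Hypothetical Task payload, shaped like a DMTF Redfish Task resource.
# Real BMC responses carry more fields; these values are made up.
SAMPLE_TASK = json.loads("""
{
  "Id": "0",
  "TaskState": "Running",
  "TaskStatus": "OK",
  "PercentComplete": 40
}
""")

# Task states after which a Redfish task makes no further progress.
TERMINAL_STATES = {"Completed", "Exception", "Killed", "Cancelled"}


def task_uri(task_id):
    """Build the standard Redfish path for a task with the given ID."""
    return f"/redfish/v1/TaskService/Tasks/{task_id}"


def is_finished(task):
    """Return True once the task has reached a terminal state."""
    return task.get("TaskState") in TERMINAL_STATES


def summarize(task):
    """Format state and progress; PercentComplete may be absent."""
    state = task.get("TaskState", "Unknown")
    percent = task.get("PercentComplete")
    progress = "unknown" if percent is None else f"{percent}%"
    return f"task {task.get('Id')}: {state}, progress {progress}"


if __name__ == "__main__":
    print(task_uri(SAMPLE_TASK["Id"]))
    print(summarize(SAMPLE_TASK))
    print("finished:", is_finished(SAMPLE_TASK))
```

A task that stays in "Running" is not necessarily stalled; PercentComplete is optional in the Redfish schema, so the sketch treats a missing value as unknown rather than zero.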

Question 7

Two ports must be connected, but one is SFP and the other is QSFP: for example, a 25 GbE Host Channel Adapter must connect to a QSFP port capable of both 100 GbE and 25 GbE. Which solution best meets this requirement?

Options:
A. QSA adapter.
B. SFP connectors.
C. SFP-to-1GBASE-T RJ45 adapter.
D. Standard QSFP-to-QSFP DAC cable.

Question 8

An enterprise IT team has completed the physical installation of an AI Factory with a Spectrum-X Ethernet network connected to all GPU servers. They now need to ensure the environment is ready for scalable AI workload deployment. What is the recommended sequence of validation steps?

Options:
A. Set up Active Directory and LDAP, configure role-based access controls and security settings first, install users, and skip network or hardware performance validation.
B. Perform application benchmarking first, use performance logs to identify bottlenecks, update switch and server firmware afterward, and then tune the network using performance tests.
C. Validate the software stack, test link connectivity and port health, run network benchmarks, run OSPF, ensure neighbors are exchanging route information, then stage AI workload tests.
D. Confirm switch and server firmware configuration, test link connectivity and port health, run network benchmarks, validate the software stack, then stage AI workload tests.

Question 9

Which of the following steps are essential components of a recommended DGX cluster installation procedure?

Pick the 2 correct responses below.

Options:
A. Group nodes by function during initial setup and assign them to relevant categories in the cluster management tool.
B. Configure networking by validating all interfaces on each node, ensuring proper InfiniBand and Ethernet connectivity prior to installing cluster software.
C. Install Slurm on the head node and then configure the compute nodes’ default OS images.
D. Complete application containerization, run distributed jobs, and skip validation of node health or storage availability.

Question 10

A DGX H100 system shows intermittent “Link Down” errors on a 200G DAC cable. CVT reports “No Signal” despite physical connection. What is the first hardware check?

Options:
A. Replace the switch’s optical transceiver with a higher-wattage model.
B. Reconfigure the port for 100G speeds via NVIDIA MST.
C. Upgrade all leaf switches to support RS-FEC.
D. Verify cable compatibility via the ConnectX-7 firmware validated adapters list and inspect connectors for damage.

Exam Code: NCP-AII
Certification Provider: NVIDIA
Exam Name: NVIDIA AI Infrastructure
Last Update: Jun 6, 2026
Questions: 123