Free NVIDIA NCP-AAI Practice Exam with Questions & Answers | Set: 3

Questions 21

Which two orchestration methods are MOST suitable for implementing complex agentic workflows that require both external data access and specialized task delegation? (Choose two.)

Options:

A. Agentic orchestration with specialized expert system delegation

B. Prompt chaining to accomplish state management

C. Manual workflow coordination without automation

D. Retrieval-based orchestration for external data

E. Static rule-based routing with predefined pathways
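To make two of the listed ideas concrete (retrieving external data and delegating to a specialist), here is a minimal sketch in plain Python. It does not use any NVIDIA API. The corpus `DOCS`, the expert functions, and `orchestrate` are hypothetical names invented for illustration, and keyword matching stands in for a real retriever and router.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an external knowledge source.
DOCS: Dict[str, str] = {
    "billing": "Invoices are issued on the first day of each month.",
    "network": "Restart the router before escalating a connectivity ticket.",
}


def retrieve(query: str) -> List[str]:
    """Toy retrieval step: return documents whose key appears in the query."""
    lowered = query.lower()
    return [text for key, text in DOCS.items() if key in lowered]


def billing_expert(query: str, context: List[str]) -> str:
    """Hypothetical specialist agent for billing questions."""
    return "[billing] " + " ".join(context)


def network_expert(query: str, context: List[str]) -> str:
    """Hypothetical specialist agent for network questions."""
    return "[network] " + " ".join(context)


EXPERTS: Dict[str, Callable[[str, List[str]], str]] = {
    "billing": billing_expert,
    "network": network_expert,
}


def orchestrate(query: str) -> str:
    """Fetch external context first, then delegate to the first matching expert."""
    context = retrieve(query)
    lowered = query.lower()
    for topic, expert in EXPERTS.items():
        if topic in lowered:
            return expert(query, context)
    return "[general] no specialist matched"


if __name__ == "__main__":
    print(orchestrate("Why is my billing statement late?"))
```

In a real system the keyword checks would typically be replaced by a model-driven planner and a vector or search index; the control flow (retrieve, then delegate) is the part this sketch is meant to show.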

Questions 22

When analyzing throughput bottlenecks in a multi-modal agent processing text, images, and audio, which Triton configuration evaluations identify optimization opportunities? (Choose two.)

Options:

A. Analyze model ensemble pipelines for sequential dependencies, identify parallelization opportunities, and optimize inter-model data transfer using Triton’s scheduler.

B. Profile GPU memory allocation patterns across modalities, implement model instance batching strategies, and tune concurrency limits to maximize utilization.

C. Deploy each modality on separate Triton instances, allowing Triton to automatically manage ensemble coordination, shared memory usage, and pipeline integration.

D. Use a single model instance per GPU, allowing Triton to automatically optimize concurrency, batching, and multi-instance settings for throughput scaling.

Questions 23

You’re deploying a healthcare-focused agentic AI system that helps doctors make treatment recommendations based on patient records. The agent’s reasoning is not exposed to users, and its decisions sometimes differ from clinical guidelines.

What safety and compliance mechanisms should be in place? (Choose two.)

Options:

A. Allow overrides by human doctors to maintain accountability

B. Require model explainability or traceability for all outputs

C. Prioritize autonomous speed of decision over explainability

D. Exempt the model from compliance if it improves outcomes

E. Obfuscate decision logic to protect proprietary methods

Questions 24

You are developing an agent that needs to perform a complex set of tasks repeatedly.

Why is periodic fine-tuning an important aspect of long-term knowledge retention for this type of agent?

Options:

A. It prevents the agent from becoming overly specialized to a single task.

B. It eliminates the need for external storage like RAG.

C. It prevents the agent from forgetting past successes and failures.

D. It guarantees the agent will produce the same output for the same input.

Questions 25

An AI engineer at an oil and gas company is designing a multi-agent AI system to support drilling operations. Different agents are responsible for subsurface modeling, risk analysis, and resource allocation. These agents must share operational context, reason through interdependent planning steps, and justify their collaborative decisions using structured, transparent logic. The architecture must support memory persistence, sequential decision-making and chain-of-thought prompting across agents.

Which implementation best supports this design?

Options:

A. Orchestrate NeMo agents via Triton, use vector memory for shared context, ReAct planning, and NeMo Guardrails for reasoning.

B. Use stateless LLM endpoints behind an API gateway and pass shared prompts across agents to simulate context and reasoning.

C. Use LangChain to coordinate third-party agent APIs and store shared information in external memory, with logic encoded in static prompt chains.

D. Fine-tune separate NeMo models for each agent role using LoRA, with pre-scripted action flows deployed via TensorRT for latency reduction.

Questions 26

When designing an AI workflow, which of the following best describes a comprehensive approach to improving the performance of AI agents?

Options:

A. Implementing benchmarking pipelines, deploying physical agents, and monitoring user engagement metrics

B. Implementing benchmarking pipelines, collecting user feedback, and tuning model parameters iteratively

C. Implementing benchmarking pipelines and incorporating a dynamic dataset for a real-time fall-back

D. Monitoring agents’ throughput and time-to-first-token from the scoring engine

Questions 27

When implementing inter-agent communication for a distributed agentic system running across multiple NVIDIA GPU nodes, which message routing pattern provides the best balance of reliability and performance?

Options:

A. Database-based message queuing with polling

B. Direct TCP connections between all agent pairs

C. Event-driven message routing with distributed broker clusters

D. Centralized message broker with topic-based routing
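For readers unfamiliar with the term, topic-based routing through a broker can be sketched as follows. This is a single-process toy, not a distributed broker cluster, and `TopicBroker` and the topic name `risk.alerts` are hypothetical names chosen for illustration. A production system would normally rely on an existing message broker rather than code like this.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class TopicBroker:
    """In-process stand-in for a message broker that routes by topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message to each subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


if __name__ == "__main__":
    broker = TopicBroker()
    inbox: List[str] = []
    broker.subscribe("risk.alerts", inbox.append)
    broker.publish("risk.alerts", "pressure anomaly detected")
    print(inbox)
```

Publishers and subscribers only share a topic name, so agents can be added or removed without rewiring point-to-point connections.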

Questions 28

In a production agentic system handling thousands of concurrent conversations, which state management strategy provides optimal performance while ensuring context preservation?

Options:

A. Global shared state with locks for concurrent access

B. Session-isolated state with serialization and lazy loading

C. Stateless design with context reconstruction from message history
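As an illustration of what "session-isolated state with serialization and lazy loading" can mean in practice, here is a minimal sketch. `SessionStore` is a hypothetical class, the `_persisted` dictionary stands in for an external store such as a database or cache service, and JSON is assumed as the serialization format.

```python
import json
from typing import Dict


class SessionStore:
    """Keeps each conversation's state separate and loads it only on first use."""

    def __init__(self) -> None:
        self._persisted: Dict[str, str] = {}  # stand-in for an external store
        self._cache: Dict[str, dict] = {}     # sessions currently held in memory
        self.loads = 0                        # counts lazy loads, for inspection

    def get(self, session_id: str) -> dict:
        """Return the session state, deserializing it on first access."""
        if session_id not in self._cache:
            raw = self._persisted.get(session_id)
            self._cache[session_id] = json.loads(raw) if raw else {"history": []}
            self.loads += 1
        return self._cache[session_id]

    def save(self, session_id: str) -> None:
        """Serialize the in-memory state of one session to the store."""
        if session_id in self._cache:
            self._persisted[session_id] = json.dumps(self._cache[session_id])

    def evict(self, session_id: str) -> None:
        """Persist the session, then drop it from memory."""
        self.save(session_id)
        self._cache.pop(session_id, None)


if __name__ == "__main__":
    store = SessionStore()
    store.get("user-1")["history"].append("hello")
    store.evict("user-1")
    print(store.get("user-1"))
```

Because no state is shared between sessions, concurrent conversations do not contend for a common lock, and idle sessions can be evicted from memory without losing context.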

Questions 29

Your deployed legal assistant shows great performance but occasionally repeats incorrect legal terms.

Which tuning method best improves factual reliability?

Options:

A. Replace retrieval with static hard-coded text snippets

B. Use more verbose prompts to reinforce correct definitions

C. Increase output randomness to improve exploration

D. Add fact-checking steps using external tools during generation
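A fact-checking step of the kind named in the last option might look roughly like the sketch below. The `TRUSTED_DEFINITIONS` table and `lookup_definition` function are hypothetical stand-ins for an external tool such as a legal dictionary service, and the definitions are simplified for illustration only.

```python
from typing import Dict, Optional

# Hypothetical trusted source; a real system would call an external tool.
TRUSTED_DEFINITIONS: Dict[str, str] = {
    "tort": "a civil wrong that causes harm or loss",
    "plaintiff": "the party who brings a civil lawsuit",
}


def lookup_definition(term: str) -> Optional[str]:
    """Stand-in for an external lookup; returns None when the term is unknown."""
    return TRUSTED_DEFINITIONS.get(term.lower())


def fact_check(claims: Dict[str, str]) -> Dict[str, str]:
    """Compare each drafted definition against the trusted source.

    Mismatches are replaced with the trusted text; terms the source does not
    know are kept but flagged as unverified.
    """
    checked: Dict[str, str] = {}
    for term, definition in claims.items():
        trusted = lookup_definition(term)
        if trusted is None:
            checked[term] = definition + " [unverified]"
        elif trusted != definition:
            checked[term] = trusted
        else:
            checked[term] = definition
    return checked


if __name__ == "__main__":
    draft = {"Tort": "a type of criminal offence", "estoppel": "a legal bar"}
    print(fact_check(draft))
```

Exact string comparison is used here only to keep the example short; a practical checker would need a more tolerant comparison and a policy for what to do with unverified terms.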

Questions 30

What NVIDIA framework can be used to train a better agent?

Options:

A. NeMo-RL

B. NeMo Guardrails

C. TensorRT-LLM

Exam Code: NCP-AAI
Certification Provider: NVIDIA
Exam Name: NVIDIA Agentic AI
Last Update: May 8, 2026
Questions: 121