Anvia Docs

NVIDIA NIM for large language models exposes OpenAI-compatible inference endpoints. For a local NIM container, the base URL is the address of the running NIM server, typically http://localhost:8000/v1.
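
To make "OpenAI-compatible" concrete, here is a minimal sketch of the request a client would send to the chat completions route, built without any SDK. The helper name buildChatRequest is hypothetical and not part of Anvia; the /chat/completions path and body shape follow the OpenAI chat completions convention, and the model name is the one used elsewhere on this page.

```typescript
// Hypothetical helper: builds a raw OpenAI-style chat completions request.
// It is not part of Anvia; it only shows what travels over the wire.
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  prompt: string,
): ChatRequest {
  return {
    // Strip trailing slashes so "/v1/" and "/v1" produce the same URL.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

const request = buildChatRequest(
  "http://localhost:8000/v1",
  "local",
  "meta/llama-3.1-8b-instruct",
  "Hello!",
);

console.log(request.url);
```

In a runtime with a global fetch (Node 18+ or a browser), the request could be sent with fetch(request.url, request.init). The Anvia client shown in the next section handles this for you.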

Create the Client

import { AgentBuilder } from "@anvia/core";
import { OpenAIClient } from "@anvia/openai";

// Point the OpenAI-compatible client at the NIM server.
// The "local" fallback is a placeholder key for local containers.
const client = new OpenAIClient({
  baseUrl: "http://localhost:8000/v1",
  apiKey: process.env.NVIDIA_NIM_API_KEY ?? "local",
});

// Use the model ID as reported by your NIM deployment.
const model = client.completionModel("meta/llama-3.1-8b-instruct");

const agent = new AgentBuilder("support", model)
  .instructions("Answer support questions clearly.")
  .build();

// Send a single prompt and print the model's reply.
const response = await agent.prompt("Hello!").send();

console.log(response.output);

For NVIDIA-hosted endpoints, use the base URL and credentials from your NVIDIA API catalog or deployment configuration.
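
One way to switch between a local container and a hosted endpoint is to resolve the base URL and key from environment variables in a single place. The sketch below is an assumption-laden illustration: resolveNimConfig is a hypothetical helper, not part of Anvia, and the rule "non-local hosts must supply a key" is a design choice made for this example. The environment variable names match those used on this page, and https://nim.example.com/v1 is a placeholder host.

```typescript
// Hypothetical config helper; not part of Anvia.
type Env = Record<string, string | undefined>;

interface NimConfig {
  baseUrl: string;
  apiKey: string;
}

function resolveNimConfig(env: Env): NimConfig {
  // Default to a local NIM container when no base URL is configured.
  const baseUrl = env.NVIDIA_NIM_BASE_URL ?? "http://localhost:8000/v1";
  const hostname = new URL(baseUrl).hostname;
  const isLocal = hostname === "localhost" || hostname === "127.0.0.1";

  // Local containers get a placeholder key; other hosts must provide one.
  const apiKey = env.NVIDIA_NIM_API_KEY ?? (isLocal ? "local" : undefined);
  if (apiKey === undefined) {
    throw new Error(`NVIDIA_NIM_API_KEY is required for ${baseUrl}`);
  }
  return { baseUrl, apiKey };
}

// No variables set: falls back to the local defaults.
console.log(resolveNimConfig({}));

// Hosted endpoint (placeholder host) with an explicit key.
console.log(
  resolveNimConfig({
    NVIDIA_NIM_BASE_URL: "https://nim.example.com/v1",
    NVIDIA_NIM_API_KEY: "example-key",
  }),
);
```

In an application you would pass process.env to the helper and hand the result to the OpenAIClient constructor.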

Get the Model List

NIM exposes GET /v1/models, which lists the inference models loaded and available on the server. Configure OpenAIClient with your NIM base URL, then call listModels().

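import { OpenAIClient } from "@anvia/openai";

// Read the base URL from configuration; fall back to a local NIM container.
const baseUrl = process.env.NVIDIA_NIM_BASE_URL ?? "http://localhost:8000/v1";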

const client = new OpenAIClient({
  baseUrl,
  apiKey: process.env.NVIDIA_NIM_API_KEY ?? "local",
});

const models = await client.listModels();

console.table(models.data.map((model) => ({ id: model.id, owner: model.ownedBy })));
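
For reference, the same listing can be sketched against the raw endpoint. This assumes the standard OpenAI-compatible response shape, in which the owner field is spelled owned_by on the wire, whereas the client example above reads ownedBy. The helper names and the sample payload are hypothetical and exist only for illustration.

```typescript
// Raw shape of an OpenAI-compatible GET /v1/models response.
interface RawModel {
  id: string;
  object: string;
  owned_by?: string;
}

interface RawModelList {
  object: string;
  data: RawModel[];
}

// Join the base URL and the models route, tolerating a trailing slash.
function modelsUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/models`;
}

// Reduce the raw payload to the two fields shown in the table.
function summarizeModels(list: RawModelList): { id: string; owner: string }[] {
  return list.data.map((m) => ({ id: m.id, owner: m.owned_by ?? "unknown" }));
}

// Requires a runtime with a global fetch (Node 18+ or a browser).
async function fetchModels(baseUrl: string, apiKey: string) {
  const res = await fetch(modelsUrl(baseUrl), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`GET /models failed with status ${res.status}`);
  return summarizeModels((await res.json()) as RawModelList);
}

// Made-up sample payload, used here instead of a live server.
const sample: RawModelList = {
  object: "list",
  data: [{ id: "meta/llama-3.1-8b-instruct", object: "model", owned_by: "example-owner" }],
};

console.table(summarizeModels(sample));
```

Against a running server, await fetchModels(baseUrl, apiKey) would return the same summary for the models that server reports.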

Notes

NIM deployments can be local, private, or hosted. Keep baseUrl in your app configuration instead of hardcoding deployment-specific hosts.
NIM exposes additional management endpoints that are outside Anvia's normalized completion model surface.
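
As an example of calling a management endpoint directly, the sketch below probes a readiness route with plain fetch. The /health/ready path under the base URL is an assumption based on NVIDIA's published NIM examples and may differ between NIM versions, so confirm it against the API reference for your deployment. The helper names are hypothetical and not part of Anvia.

```typescript
// Assumed readiness route, relative to the /v1 base URL.
// Confirm the path against the NIM API reference for your version.
const READY_PATH = "/health/ready";

// Build the readiness URL from the configured base URL.
function readinessUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}${READY_PATH}`;
}

// Resolves to true on a 2xx response, false on any other status or a network error.
// Requires a runtime with a global fetch (Node 18+ or a browser).
async function isReady(baseUrl: string): Promise<boolean> {
  try {
    const res = await fetch(readinessUrl(baseUrl));
    return res.ok;
  } catch (e) {
    return false; // server unreachable
  }
}

console.log(readinessUrl("http://localhost:8000/v1"));
```

A startup script could poll isReady(baseUrl) before creating the client, so that the first prompt is not sent while the model is still loading.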

For current NVIDIA NIM API details, see the NIM LLM API reference.