NVIDIA DGX Cloud Lepton

Use NVIDIA DGX Cloud Lepton's OpenAI-compatible API with Anvia.

NVIDIA DGX Cloud Lepton is a multi-cloud GPU compute platform that exposes OpenAI-compatible endpoints for chat completions, embeddings, and other inference workloads. In Anvia, configure OpenAIClient with the DGX Cloud Lepton baseUrl, then pass a DGX Cloud Lepton model id to completionModel(...).

Create the Client

import { AgentBuilder } from "@anvia/core";
import { OpenAIClient } from "@anvia/openai";

// Point the OpenAI client at the DGX Cloud Lepton gateway.
const client = new OpenAIClient({
  baseUrl: "https://api.dgxcloud.nvidia.com/v1",
  apiKey: process.env.NVIDIA_API_KEY,
});

// Pass the DGX Cloud Lepton model id as-is.
const model = client.completionModel("meta/llama-3.3-70b-instruct");

// Build an agent on top of that model.
const agent = new AgentBuilder("support", model)
  .instructions("Answer support questions clearly.")
  .build();

// Send a single prompt and print the reply.
const response = await agent.prompt("Hello!").send();

console.log(response.output);

Setting baseUrl makes Anvia use the OpenAI-compatible chat completion adapter. The model id is the DGX Cloud Lepton id, not an Anvia-specific alias. The host shown here is an example; check the DGX Cloud console for the current baseUrl host.
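Because the baseUrl host may need to be changed, it can help to read it from the environment instead of hard-coding it. The following is a minimal sketch, not part of Anvia: the helper name resolveGatewayConfig and the DGX_LEPTON_BASE_URL variable are assumptions made for illustration, and the default simply repeats the host used above.

```typescript
// Hypothetical helper, not an Anvia API. It collects the two settings
// that OpenAIClient is given above and validates them up front.
interface GatewayConfig {
  baseUrl: string;
  apiKey: string;
}

// Same host as in the example above; confirm it in the DGX Cloud console.
const DEFAULT_BASE_URL = "https://api.dgxcloud.nvidia.com/v1";

function resolveGatewayConfig(
  env: Record<string, string | undefined>,
): GatewayConfig {
  const apiKey = env.NVIDIA_API_KEY;
  if (!apiKey) {
    throw new Error("NVIDIA_API_KEY is not set");
  }

  // DGX_LEPTON_BASE_URL is an assumed variable name for the override.
  // Trailing slashes are stripped so request paths join cleanly.
  const baseUrl = (env.DGX_LEPTON_BASE_URL || DEFAULT_BASE_URL).replace(/\/+$/, "");
  if (!/^https:\/\/\S+$/.test(baseUrl)) {
    throw new Error(`baseUrl must be an https URL, got "${baseUrl}"`);
  }

  return { baseUrl, apiKey };
}
```

Under these assumptions, the client could be created with new OpenAIClient(resolveGatewayConfig(process.env)), so a missing key fails at startup instead of on the first request.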

Get the Model List

DGX Cloud Lepton exposes a /v1/models endpoint that returns the model ids available to your key. Because the client was created with a baseUrl that already ends in /v1, listModels() requests /models relative to that baseUrl and reaches this endpoint.

const models = await client.listModels();

console.table(
  models.data.map((model) => ({
    id: model.id,
    name: model.name,
    contextLength: model.contextLength,
  })),
);

Use the id field directly with completionModel(...).
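When the list is long, it may be easier to narrow it down in code before choosing an id. The sketch below assumes each entry has the same fields read in the listing above (id, name, contextLength), with the last two treated as optional; idsWithContext is a hypothetical helper, and the sample entries are made-up values, not real listModels() output.

```typescript
// Assumed entry shape, mirroring the fields read in the listing above.
interface ModelEntry {
  id: string;
  name?: string;
  contextLength?: number;
}

// Hypothetical helper: returns the ids whose reported context window is
// at least minContext tokens, sorted alphabetically. Entries that do not
// report a context length are treated as 0 and filtered out.
function idsWithContext(models: ModelEntry[], minContext: number): string[] {
  return models
    .filter((m) => (m.contextLength ?? 0) >= minContext)
    .map((m) => m.id)
    .sort();
}

// Made-up sample data for illustration only.
const sampleModels: ModelEntry[] = [
  { id: "meta/llama-3.3-70b-instruct", contextLength: 131072 },
  { id: "example/small-model", contextLength: 8192 },
  { id: "example/no-metadata" },
];

const candidates = idsWithContext(sampleModels, 32000);
console.log(candidates); // [ 'meta/llama-3.3-70b-instruct' ]
```

With a live client, the same helper could be applied to models.data, provided the entries really carry those fields.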

Notes

  • DGX Cloud Lepton API keys are passed as bearer tokens in the Authorization header. Generate them in the NVIDIA Build console.
  • DGX Cloud Lepton unifies compute across NVIDIA Cloud Partners (NCPs), GPU marketplaces, and cloud providers. The OpenAI-compatible surface abstracts the underlying region and provider.
  • The platform is also the home of NVIDIA NIM microservices and accelerated endpoints such as those surfaced through build.nvidia.com. Use the same baseUrl shape for those routes.
  • Model capabilities still depend on the hosted upstream model. Test the specific model id for tool calling, structured output, streaming, and multimodal support before enabling those features.
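
One way to test a model id for tool calling is to send a single request in the OpenAI chat completions format with a tool attached and check whether the reply contains a tool call. The sketch below is an assumption-heavy illustration that bypasses Anvia and builds the raw HTTP request: buildToolProbe, returnedToolCall, and the get_weather tool are hypothetical names, and the request path assumes the gateway serves /chat/completions under the baseUrl, as OpenAI-compatible APIs normally do.

```typescript
interface ProbeRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds (but does not send) a chat completion request that offers the
// model one hypothetical tool. The API key goes in the Authorization
// header as a bearer token, as described in the notes above.
function buildToolProbe(baseUrl: string, apiKey: string, modelId: string): ProbeRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: modelId,
        messages: [{ role: "user", content: "What is the weather in Paris?" }],
        tools: [
          {
            type: "function",
            function: {
              name: "get_weather", // hypothetical tool, never executed
              description: "Look up the current weather for a city.",
              parameters: {
                type: "object",
                properties: { city: { type: "string" } },
                required: ["city"],
              },
            },
          },
        ],
      }),
    },
  };
}

// Returns true when the first choice of an OpenAI-style response body
// carries at least one tool call.
function returnedToolCall(body: unknown): boolean {
  const choices = (
    body as { choices?: Array<{ message?: { tool_calls?: unknown } }> } | null
  )?.choices;
  const calls = choices?.[0]?.message?.tool_calls;
  return Array.isArray(calls) && calls.length > 0;
}
```

To run the probe, pass the result to fetch (const res = await fetch(probe.url, probe.init)) and give the parsed JSON to returnedToolCall. A false result is not conclusive, since a model that supports tools may still choose to answer in plain text, so it is worth repeating the check with a prompt that clearly requires the tool.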

For current details, see the NVIDIA DGX Cloud Lepton product page and the NVIDIA NIM documentation.