Anvia Docs

Cerebras Inference exposes an OpenAI-compatible chat completions surface at https://api.cerebras.ai/v1. In Anvia, configure OpenAIClient with that baseUrl, then pass Cerebras model ids to completionModel(...). Cerebras serves models on its own wafer-scale chips and markets the service for very high token throughput.

Create the Client

import { AgentBuilder } from "@anvia/core";
import { OpenAIClient } from "@anvia/openai";

// Point the OpenAI-compatible client at Cerebras instead of OpenAI.
const client = new OpenAIClient({
  baseUrl: "https://api.cerebras.ai/v1",
  apiKey: process.env.CEREBRAS_API_KEY,
});

// Pass the Cerebras model id as-is.
const model = client.completionModel("gpt-oss-120b");

const agent = new AgentBuilder("support", model)
  .instructions("Answer support questions clearly.")
  .build();

// Send a single prompt and print the model's reply.
const response = await agent.prompt("Why is fast inference important?").send();

console.log(response.output);

baseUrl makes Anvia use the OpenAI-compatible chat completion adapter. The model id is the Cerebras id, not an Anvia-specific alias.
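The snippet above reads the key from process.env, which is undefined when the variable is not set. A minimal sketch of a fail-fast guard follows. requireEnv is a hypothetical helper, not part of Anvia, and it takes the environment as a plain map so it can be exercised without a real process.

```typescript
// Hypothetical helper (not an Anvia API): read a required variable from an
// environment-like map and fail fast when it is missing or blank.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Assumed usage with the client above:
//   apiKey: requireEnv(process.env, "CEREBRAS_API_KEY")
```

Failing at startup gives a clearer error than an authentication failure on the first request.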

Get the Model List

Cerebras exposes a /v1/models endpoint that returns the model ids available to your key. Because the client was created with baseUrl, listModels() calls Cerebras's /models endpoint.

const models = await client.listModels();

console.table(
  models.data.map((model) => ({
    id: model.id,
    name: model.name,
    contextLength: model.contextLength,
  })),
);

Use the id field directly with completionModel(...).
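Because the ids available to a key can change, it may help to choose from the list rather than hard-code one id. The sketch below assumes only the response shape used in the listing example above (a data array whose entries carry id, name, and contextLength). pickModelId and the sample data are hypothetical and do not come from a real API response.

```typescript
// Shape assumed from the listModels() example above.
interface ModelInfo {
  id: string;
  name?: string;
  contextLength?: number;
}

interface ModelList {
  data: ModelInfo[];
}

// Hypothetical helper: return the first preferred id that the key can use,
// or undefined when none of the preferred ids is available.
function pickModelId(list: ModelList, preferred: string[]): string | undefined {
  const available = new Set(list.data.map((m) => m.id));
  return preferred.find((id) => available.has(id));
}

// Made-up sample data for illustration only.
const sample: ModelList = {
  data: [{ id: "llama-3.1-8b" }, { id: "gpt-oss-120b" }],
};

const chosen = pickModelId(sample, ["qwen-3-32b", "gpt-oss-120b", "llama-3.1-8b"]);
console.log(chosen); // gpt-oss-120b
```

In an Anvia program, the returned id would then be passed to completionModel(...), with the list coming from client.listModels().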

Available Models

Cerebras Inference hosts open-weight models on wafer-scale hardware. Sample ids include:

Model	Notes
`gpt-oss-120b`	OpenAI gpt-oss 120B MoE on Cerebras
`llama-3.3-70b`	Meta Llama 3.3 70B
`llama-3.1-8b`	Meta Llama 3.1 8B
`qwen-3-32b`	Alibaba Qwen 3 32B
`qwen-3-235b`	Alibaba Qwen 3 235B

Notes

Cerebras API keys are passed as bearer tokens in the Authorization header.
The OpenAI-compatible surface supports chat completions, streaming, tool calling, structured outputs, and reasoning controls on supported models.
Cerebras also publishes first-party Python and Node SDKs (cerebras_cloud_sdk and @cerebras/cerebras_cloud_sdk). Anvia talks to the OpenAI-compatible API instead, so you can keep using OpenAIClient within the Anvia runtime.
Model capabilities still depend on the hosted upstream model. Test the specific model id for tool calling, structured output, streaming, and multimodal support before enabling those features.
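To make the bearer-token note concrete, here is a minimal sketch of the HTTP request that an OpenAI-compatible client sends to the chat completions route. buildChatRequest and its types are hypothetical names used for illustration, not Anvia or Cerebras APIs. The function only builds the request, so it runs without a network call or a real key.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assemble an OpenAI-style chat completions request.
// The API key travels as a bearer token in the Authorization header.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  baseUrl = "https://api.cerebras.ai/v1",
): PreparedRequest {
  return {
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// "YOUR_API_KEY" is a placeholder, not a working key.
const request = buildChatRequest("YOUR_API_KEY", "llama-3.1-8b", [
  { role: "user", content: "Reply with the word ok." },
]);

// To send it: const res = await fetch(request.url, request.init);
```

A small request like this is also a simple way to check one model id by hand before enabling features such as tool calling or structured outputs for it.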

For current Cerebras API details, see the Cerebras Inference quickstart and the Cerebras chat completions API reference.