Cerebras
Use Cerebras Inference through its OpenAI-compatible API with Anvia.
Cerebras Inference exposes an OpenAI-compatible chat completions surface at `https://api.cerebras.ai/v1`. In Anvia, configure `OpenAIClient` with that `baseUrl`, then pass Cerebras model ids to `completionModel(...)`. Cerebras positions the service around wafer-scale chips and very high token throughput.
Create the Client
```typescript
import { AgentBuilder } from "@anvia/core";
import { OpenAIClient } from "@anvia/openai";

const client = new OpenAIClient({
  baseUrl: "https://api.cerebras.ai/v1",
  apiKey: process.env.CEREBRAS_API_KEY,
});

const model = client.completionModel("gpt-oss-120b");

const agent = new AgentBuilder("support", model)
  .instructions("Answer support questions clearly.")
  .build();

const response = await agent.prompt("Why is fast inference important?").send();
console.log(response.output);
```

`baseUrl` makes Anvia use the OpenAI-compatible chat completion adapter. The model id is the Cerebras id, not an Anvia-specific alias.
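To make the OpenAI-compatible shape concrete, the sketch below builds (without sending) the kind of request such an adapter would issue. It assumes the standard OpenAI `/chat/completions` path under the base URL and bearer-token authentication. `buildChatRequest` and its types are hypothetical names used for illustration; they are not part of Anvia or the Cerebras SDK, and this is not a description of Anvia's internals.

```typescript
// Sketch only: illustrates an OpenAI-style request, not Anvia's implementation.
const CEREBRAS_BASE_URL = "https://api.cerebras.ai/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// Hypothetical helper: builds, but does not send, a chat completions request.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  if (!apiKey) {
    throw new Error("Missing Cerebras API key");
  }
  return {
    // Assumption: the chat endpoint follows the OpenAI path convention.
    url: `${CEREBRAS_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// "sk-example" is a placeholder, not a real key.
const request = buildChatRequest("sk-example", "gpt-oss-120b", [
  { role: "user", content: "Why is fast inference important?" },
]);
// To send it: const res = await fetch(request.url, request.init);
```

Failing fast on an empty key surfaces a missing `CEREBRAS_API_KEY` before any network call is made.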
Get the Model List
Cerebras exposes a `/v1/models` endpoint that returns the model ids available to your key. Because the client was created with `baseUrl`, `listModels()` calls Cerebras's `/models` endpoint.
```typescript
const models = await client.listModels();

console.table(
  models.data.map((model) => ({
    id: model.id,
    name: model.name,
    contextLength: model.contextLength,
  })),
);
```

Use the `id` field directly with `completionModel(...)`.
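Because the available ids depend on your key, it can help to choose a model from an ordered preference list rather than hard-coding one. The following is a minimal sketch under stated assumptions: `ModelEntry` mirrors only the fields used in the listing above (the real Anvia type may differ), and `pickModelId` is a hypothetical helper, not an Anvia API.

```typescript
// Assumed shape, mirroring the fields read from `models.data` above.
interface ModelEntry {
  id: string;
  name?: string;
  contextLength?: number;
}

// Hypothetical helper: returns the first preferred id that is actually available,
// or undefined when none of the preferred ids are listed.
function pickModelId(
  available: ModelEntry[],
  preferred: string[],
): string | undefined {
  const ids = new Set(available.map((entry) => entry.id));
  return preferred.find((id) => ids.has(id));
}

// Sample data standing in for `models.data`; not a real API response.
const sampleModels: ModelEntry[] = [
  { id: "llama-3.1-8b" },
  { id: "gpt-oss-120b" },
];

const chosenId = pickModelId(sampleModels, [
  "qwen-3-32b",
  "gpt-oss-120b",
  "llama-3.1-8b",
]);
```

With real data, pass `models.data` as the first argument and hand the result to `completionModel(...)` after checking that it is defined.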
Available Models
Cerebras Inference hosts open-source models on wafer-scale hardware. Sample ids include:
| Model | Notes |
|---|---|
| `gpt-oss-120b` | OpenAI gpt-oss 120B MoE on Cerebras |
| `llama-3.3-70b` | Meta Llama 3.3 70B |
| `llama-3.1-8b` | Meta Llama 3.1 8B |
| `qwen-3-32b` | Alibaba Qwen 3 32B |
| `qwen-3-235b` | Alibaba Qwen 3 235B |
Notes
- Cerebras API keys are passed as bearer tokens in the `Authorization` header.
- The OpenAI-compatible surface supports chat completions, streaming, tool calling, structured outputs, and reasoning controls on supported models.
- Cerebras also publishes a first-party Python and Node SDK (`cerebras_cloud_sdk` / `@cerebras/cerebras_cloud_sdk`). Anvia uses the OpenAI shape so you can stay on `OpenAIClient` for the Anvia runtime.
- Model capabilities still depend on the hosted upstream model. Test the specific model id for tool calling, structured output, streaming, and multimodal support before enabling those features.
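One way to test a model id for tool calling is to send a small probe request and check whether the reply contains a tool call. The sketch below assumes the OpenAI chat completions conventions (a `tools` array in the request, `choices[0].message.tool_calls` in the response). `buildToolCallProbe`, `hasToolCall`, and the `get_weather` tool are hypothetical names for illustration, and whether a given Cerebras model returns this shape is exactly what the probe is meant to check.

```typescript
interface ToolCallProbe {
  model: string;
  messages: { role: "user"; content: string }[];
  tools: {
    type: "function";
    function: {
      name: string;
      description: string;
      parameters: Record<string, unknown>;
    };
  }[];
}

// Hypothetical helper: builds an OpenAI-style request body that invites a tool call.
function buildToolCallProbe(model: string): ToolCallProbe {
  return {
    model,
    messages: [
      { role: "user", content: "What is the weather in Paris? Use the tool." },
    ],
    tools: [
      {
        type: "function",
        function: {
          name: "get_weather", // illustrative tool, not a real service
          description: "Look up the current weather for a city.",
          parameters: {
            type: "object",
            properties: { city: { type: "string" } },
            required: ["city"],
          },
        },
      },
    ],
  };
}

// Returns true when an OpenAI-style response contains at least one tool call.
function hasToolCall(response: unknown): boolean {
  const choices = (
    response as { choices?: { message?: { tool_calls?: unknown[] } }[] } | null
  )?.choices;
  const calls = choices?.[0]?.message?.tool_calls;
  return Array.isArray(calls) && calls.length > 0;
}

const probe = buildToolCallProbe("gpt-oss-120b");
```

A single probe is only a smoke test: a model may support tool calling and still answer in plain text, so it is worth repeating the check with a few prompts before enabling the feature.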
For current Cerebras API details, see the Cerebras Inference quickstart and the Cerebras chat completions API reference.
