
Groq

Use Groq's OpenAI-compatible low-latency inference with Anvia.

Groq exposes an OpenAI-compatible chat completions surface at https://api.groq.com/openai/v1. In Anvia, configure OpenAIClient with that baseUrl, then pass Groq model ids to completionModel(...). Groq also exposes a Responses API for stateful conversations.

Create the Client

import { AgentBuilder } from "@anvia/core";
import { OpenAIClient } from "@anvia/openai";

// Point the OpenAI client at Groq's OpenAI-compatible endpoint.
const client = new OpenAIClient({
  baseUrl: "https://api.groq.com/openai/v1",
  apiKey: process.env.GROQ_API_KEY,
});

// Pass the Groq model id as-is.
const model = client.completionModel("llama-3.3-70b-versatile");

const agent = new AgentBuilder("support", model)
  .instructions("Answer support questions clearly.")
  .build();

const response = await agent.prompt("Hello!").send();

console.log(response.output);

baseUrl makes Anvia use the OpenAI-compatible chat completion adapter. The model id is the Groq id, not an Anvia-specific alias.
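To make "OpenAI-compatible" concrete, the sketch below builds the raw HTTP request that a chat completion against this base URL amounts to. It does not use Anvia and says nothing about how Anvia sends requests internally. `buildChatRequest` is a hypothetical helper, the API key is a placeholder, and the `/chat/completions` path and body shape are assumed from the OpenAI chat completions convention.

```typescript
// Hypothetical helper, not part of Anvia: it only shows the request shape.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): BuiltRequest {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The API key travels as a bearer token.
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const groqRequest = buildChatRequest(
  "https://api.groq.com/openai/v1",
  "YOUR_GROQ_API_KEY", // placeholder, read the real key from the environment
  "llama-3.3-70b-versatile",
  [{ role: "user", content: "Hello!" }],
);
```

The result can be passed to `fetch(groqRequest.url, groqRequest.init)` in any runtime that provides `fetch`.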

Get the Model List

Groq exposes a models endpoint at https://api.groq.com/openai/v1/models that returns the model ids available to your key. Because the client was created with baseUrl, listModels() calls that endpoint (baseUrl plus /models) instead of OpenAI's.

const models = await client.listModels();

console.table(
  models.data.map((model) => ({
    id: model.id,
    name: model.name,
    contextLength: model.contextLength,
  })),
);

Use the id field directly with completionModel(...).
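Because the list depends on your key, it can be useful to pick a model id at runtime rather than hard-coding one. The following is a minimal sketch that assumes only that each entry carries an `id` string, as in the listing above. `pickModelId` and the preference order are hypothetical, and the inline array stands in for `models.data`.

```typescript
// Hypothetical helper: return the first preferred id that the key can access.
function pickModelId(
  available: ReadonlyArray<{ id: string }>,
  preferred: readonly string[],
): string | undefined {
  const ids = new Set(available.map((entry) => entry.id));
  return preferred.find((id) => ids.has(id));
}

// Stand-in for models.data returned by client.listModels().
const sampleModels = [
  { id: "llama-3.1-8b-instant" },
  { id: "whisper-large-v3" },
];

const chosenId = pickModelId(sampleModels, [
  "llama-3.3-70b-versatile",
  "llama-3.1-8b-instant",
]);

console.log(chosenId); // prints llama-3.1-8b-instant
```

The helper returns `undefined` when none of the preferred ids is available, so the caller decides whether to fall back or fail.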

Available Models

Groq focuses on open-source models with very low latency. Sample ids:

| Model | Type | Notes |
| --- | --- | --- |
| llama-3.3-70b-versatile | Chat | Flagship general-purpose chat model |
| llama-3.1-8b-instant | Chat | Low-latency, low-cost chat model |
| mixtral-8x7b-32768 | Chat | 32K context MoE chat model |
| gemma2-9b-it | Chat | Instruction-tuned Google Gemma 2 |
| whisper-large-v3 | Audio | Speech-to-text |
| whisper-large-v3-turbo | Audio | Faster speech-to-text |
| distil-whisper-large-v3-en | Audio | Distilled English speech-to-text |
| playai-tts | Audio | Text-to-speech |
| playai-tts-arabic | Audio | Arabic text-to-speech |

Notes

  • Groq API keys are passed as bearer tokens in the Authorization header.
  • Groq converts temperature: 0 to 1e-8 internally. If that causes issues, set a float32 value greater than 0 and at most 2.
  • Requests that use unsupported OpenAI parameters fail with HTTP 400. Confirmed unsupported on chat completions: logprobs, logit_bias, top_logprobs, messages[].name, and n > 1.
  • Audio transcription does not support vtt or srt response formats.
  • Groq also exposes a Responses API at the same base URL for stateful multi-turn conversations with function calling.
  • Built-in tools such as web search, code execution, Wolfram Alpha, and browser search (for GPT OSS models) are available on supported models.
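The unsupported-parameter note above can be turned into a pre-flight check, so that a request written for OpenAI does not fail with a 400 on Groq. This is a sketch under stated assumptions, not an Anvia feature: `toGroqParams` and `ChatParams` are hypothetical names, and the field names are assumed from the OpenAI chat completions request body.

```typescript
// Hypothetical request shape, limited to the fields discussed in the notes.
interface ChatParams {
  model: string;
  messages: Array<{ role: string; content: string; name?: string }>;
  temperature?: number;
  n?: number;
  logprobs?: boolean;
  top_logprobs?: number;
  logit_bias?: Record<string, number>;
}

// Hypothetical helper: strip the fields listed as unsupported.
function toGroqParams(params: ChatParams): ChatParams {
  // Dropping n > 1 silently would change the caller's intent, so fail early.
  if (params.n !== undefined && params.n !== 1) {
    throw new Error(`n must be 1 for Groq chat completions, got ${params.n}`);
  }
  const cleaned: ChatParams = {
    ...params,
    // Rebuild each message without messages[].name.
    messages: params.messages.map((m) => ({ role: m.role, content: m.content })),
  };
  delete cleaned.logprobs;
  delete cleaned.top_logprobs;
  delete cleaned.logit_bias;
  return cleaned;
}

const safeParams = toGroqParams({
  model: "llama-3.3-70b-versatile",
  messages: [{ role: "user", content: "Hello!", name: "alice" }],
  temperature: 0.2,
  logprobs: true,
  top_logprobs: 3,
});
```

Throwing on `n > 1` rather than rewriting it to 1 is a design choice: the caller asked for several completions, and returning one without notice would hide that.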

For current Groq API details, see the Groq OpenAI compatibility docs and the Groq model catalog.