Together AI

Use Together AI's OpenAI-compatible open-model inference with Anvia.

Together AI exposes an OpenAI-compatible surface at https://api.together.ai/v1 covering chat completions, vision, function calling, structured outputs, embeddings, image generation, text-to-speech, and speech-to-text. In Anvia, configure OpenAIClient with that baseUrl, then pass Together model ids to completionModel(...).

Create the Client

import { AgentBuilder } from "@anvia/core";
import { OpenAIClient } from "@anvia/openai";

// Point the OpenAI client at Together's OpenAI-compatible endpoint.
const client = new OpenAIClient({
  baseUrl: "https://api.together.ai/v1",
  apiKey: process.env.TOGETHER_API_KEY,
});

// Pass a Together model id, not an OpenAI model string.
const model = client.completionModel("openai/gpt-oss-20b");

const agent = new AgentBuilder("support", model)
  .instructions("Answer support questions clearly.")
  .build();

// Send a single prompt and print the reply.
const response = await agent.prompt("Hello!").send();

console.log(response.output);

Setting baseUrl makes Anvia use its OpenAI-compatible chat completion adapter. Together model ids are namespaced as <provider>/<model_name> and are not interchangeable with OpenAI model strings.
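
Because an un-namespaced id only fails once the request reaches Together, it can help to check the format up front. The sketch below is one way to do that. Both helpers are hypothetical and not part of Anvia or any Together SDK, and they assume only that a valid id has a non-empty provider and a non-empty model name separated by the first slash.

```typescript
// Hypothetical helpers for illustration; not part of Anvia or Together's SDK.
// Assumption: a Together id looks like "<provider>/<model_name>" with both parts non-empty.
function parseTogetherModelId(
  id: string,
): { provider: string; modelName: string } | null {
  const slash = id.indexOf("/");
  // Reject ids with no slash, an empty provider, or an empty model name.
  if (slash <= 0 || slash === id.length - 1) return null;
  return { provider: id.slice(0, slash), modelName: id.slice(slash + 1) };
}

// Returns the id unchanged when it is namespaced, otherwise throws early.
function assertTogetherModelId(id: string): string {
  if (parseTogetherModelId(id) === null) {
    throw new Error(
      `"${id}" does not look like a Together model id (<provider>/<model_name>)`,
    );
  }
  return id;
}
```

With the client from above, this could be used as client.completionModel(assertTogetherModelId("openai/gpt-oss-20b")). The check only validates the shape of the id; it does not confirm that the model exists on Together.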

Get the Model List

Together exposes a /v1/models endpoint. Because the client's baseUrl already points at Together's /v1 root, listModels() calls that endpoint.

const models = await client.listModels();

console.table(
  models.data.map((model) => ({
    id: model.id,
    name: model.name,
    contextLength: model.contextLength,
  })),
);

Use the id field directly with completionModel(...).
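
If the list is long, you may want to narrow it before choosing an id. The following is a minimal sketch that filters by context window. The ListedModel shape is assumed from the three fields printed above (the real Anvia type may differ), pickByContextLength is a hypothetical helper, and the sample entries are made up for illustration.

```typescript
// Shape assumed from the fields used in the listing above; the real Anvia type may differ.
interface ListedModel {
  id: string;
  name?: string;
  contextLength?: number;
}

// Hypothetical helper: keep models whose context window is at least `minContext`,
// ordered largest window first. Models without a reported contextLength are skipped.
function pickByContextLength(models: ListedModel[], minContext: number): string[] {
  return models
    .filter((m) => typeof m.contextLength === "number" && m.contextLength >= minContext)
    .sort((a, b) => (b.contextLength ?? 0) - (a.contextLength ?? 0))
    .map((m) => m.id);
}

// Made-up sample data; these ids and context lengths are not real Together models.
const sampleModels: ListedModel[] = [
  { id: "example-org/small", name: "Small", contextLength: 8192 },
  { id: "example-org/large", name: "Large", contextLength: 131072 },
  { id: "example-org/unknown", name: "Unknown" },
];

const longContextIds = pickByContextLength(sampleModels, 32000);
```

In real use you would pass models.data instead of sampleModels, then hand one of the returned ids to completionModel(...).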

Endpoint Coverage

Capability | OpenAI SDK call | Together status
Chat completions (with streaming) | chat.completions.create | Supported
Vision (image inputs) | chat.completions.create with image content | Supported
Function calling | chat.completions.create with tools and tool_choice | Supported
Structured outputs | chat.completions.create with response_format | Supported
Embeddings | embeddings.create | Supported
Image generation | images.generate | Supported
Text-to-speech | audio.speech.create | Supported
Speech-to-text and translation | audio.transcriptions.create, audio.translations.create | Supported
Responses API | responses.create | Not supported; use chat.completions.create
Assistants, Threads, Runs | assistants.*, threads.*, runs.* | Not supported; build agent loops with function calling
OpenAI-shaped Batch and Files | batches.*, files.* | Not supported; use Together's native Batch and Files APIs
Moderations | moderations.create | Not supported; use Llama Guard via chat completions
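
When porting code written against OpenAI, the unsupported rows can be turned into a small lookup that flags a call before it is made. This is only a sketch: the entries are copied from the table above, while the UNSUPPORTED_CALLS list and the unsupportedAlternative helper are hypothetical names introduced here. A null result means the call is not listed as unsupported, not that Together is guaranteed to support it.

```typescript
// Entries copied from the coverage table above; the lookup itself is hypothetical.
const UNSUPPORTED_CALLS: ReadonlyArray<{ prefix: string; alternative: string }> = [
  { prefix: "responses.", alternative: "chat.completions.create" },
  { prefix: "assistants.", alternative: "agent loops built with function calling" },
  { prefix: "threads.", alternative: "agent loops built with function calling" },
  { prefix: "runs.", alternative: "agent loops built with function calling" },
  { prefix: "batches.", alternative: "Together's native Batch and Files APIs" },
  { prefix: "files.", alternative: "Together's native Batch and Files APIs" },
  { prefix: "moderations.", alternative: "Llama Guard via chat completions" },
];

// Returns the suggested alternative for an unsupported OpenAI SDK call,
// or null when the call is not in the unsupported list.
function unsupportedAlternative(sdkCall: string): string | null {
  const hit = UNSUPPORTED_CALLS.find((entry) => sdkCall.startsWith(entry.prefix));
  return hit ? hit.alternative : null;
}
```

For example, unsupportedAlternative("responses.create") points you to chat.completions.create, while unsupportedAlternative("embeddings.create") returns null.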

Notes

  • Together API keys are passed as bearer tokens in the Authorization header.
  • Model identifiers are namespaced. Use Together ids like openai/gpt-oss-20b or meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8. OpenAI ids such as gpt-4o or text-embedding-3-large return 404.
  • Parameter quirks: seed is best-effort and not guaranteed deterministic. n is supported on most chat models but not all. logit_bias is unsupported on most models. service_tier, store, metadata, and prediction are accepted but ignored. reasoning_effort works on GPT-OSS models (low, medium, high).
  • Response shape: usage includes prompt_tokens, completion_tokens, and total_tokens. Some models add Together-only fields like cached_tokens and reasoning_tokens. Reasoning models return traces in a top-level reasoning field on the assistant message rather than OpenAI's nested object.
  • Errors follow the OpenAI shape { "error": { "message", "type", "code" } }, but type and code values are Together's. Match on HTTP status (400, 401, 404, 429, 500, 503) for portable handling.
  • Together-native endpoints not exposed through the OpenAI SDK include video generation, image edits, reasoning controls beyond reasoning_effort, and Together's richer logprobs surface.
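
Two of the notes above lend themselves to a short sketch: matching errors on HTTP status, and reading the top-level reasoning field. Everything named here is hypothetical. The ErrorAction labels and the choice of which statuses to retry are assumptions rather than Together's guidance, and AssistantMessage is a minimal stand-in that assumes reasoning arrives as a plain string.

```typescript
type ErrorAction =
  | "fix-request"
  | "check-credentials"
  | "check-model-id"
  | "retry"
  | "unknown";

// Hypothetical mapping from the HTTP statuses listed above to a handling strategy.
// Treating 429, 500, and 503 as retryable is an assumption, not documented behavior.
function classifyStatus(status: number): ErrorAction {
  switch (status) {
    case 400:
      return "fix-request";
    case 401:
      return "check-credentials";
    case 404:
      // Often an OpenAI-style model id such as gpt-4o.
      return "check-model-id";
    case 429:
    case 500:
    case 503:
      return "retry";
    default:
      return "unknown";
  }
}

// Minimal stand-in for an assistant message; assumes `reasoning` is a string when present.
interface AssistantMessage {
  content: string | null;
  reasoning?: string;
}

// Returns the reasoning trace if the model supplied a non-empty one, otherwise null.
function readReasoning(message: AssistantMessage): string | null {
  return typeof message.reasoning === "string" && message.reasoning.length > 0
    ? message.reasoning
    : null;
}
```

Branching on the status code rather than on error.type or error.code keeps the same handler usable against other OpenAI-compatible providers.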

For current Together API details, see the Together OpenAI compatibility docs and the Together model catalog.