@anvia/gemini

Gemini completion, embedding, image, and transcription models.

@anvia/gemini adapts the Google GenAI SDK into Anvia's provider-neutral interfaces. It supports completions, embeddings, image generation (both native and Imagen), and transcription.

Install

pnpm add @anvia/gemini

The Google GenAI SDK (@google/genai) is installed automatically as a dependency of this package, so you do not need to add it yourself.

Quick Start

import { AgentBuilder } from "@anvia/core";
import { GeminiClient } from "@anvia/gemini";

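// Supply the key explicitly; provider clients do not read environment variables.
const apiKey = "YOUR_API_KEY";
const client = new GeminiClient({ apiKey });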
const model = client.completionModel("gemini-2.5-flash");

const agent = new AgentBuilder("support", model)
  .instructions("Answer clearly.")
  .build();

const response = await agent.prompt("What is Anvia?").send();
console.log(response.output);

Client Configuration

// Google AI API
const client = new GeminiClient({ apiKey });

// Vertex AI
const client = new GeminiClient({
  vertexai: true,
  project: "my-gcp-project",
  location: "us-central1",
});

// Pre-configured SDK instance
const client = new GeminiClient({ client: existingGoogleGenAIInstance });

Option     Purpose
apiKey     Google AI API key (required for Google AI)
vertexai   Set to true for Vertex AI
project    GCP project ID (required for Vertex AI)
location   GCP region (required for Vertex AI)
client     Pre-configured GoogleGenAI instance

Provider clients only use explicit constructor values. They do not read environment variables.
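
Because the client never reads environment variables, the application has to resolve credentials itself and pass them in. A minimal sketch of that step follows. The variable names (GEMINI_API_KEY, GCP_PROJECT, GCP_LOCATION), the helper resolveGeminiOptions, and the locally declared options type are all assumptions made for illustration; the type only mirrors the options listed above and is not the package's exported type.

```typescript
// Local mirror of the documented options; the package's real exported type may differ.
type GeminiOptions =
  | { apiKey: string }
  | { vertexai: true; project: string; location: string };

// Hypothetical helper: builds constructor options from an explicit record of settings.
// The variable names are illustrative assumptions, not names the package knows about.
function resolveGeminiOptions(
  env: Record<string, string | undefined>,
): GeminiOptions {
  const apiKey = env["GEMINI_API_KEY"];
  if (apiKey) {
    // Google AI API: only the key is needed.
    return { apiKey };
  }
  const project = env["GCP_PROJECT"];
  const location = env["GCP_LOCATION"];
  if (project && location) {
    // Vertex AI: project and location are both required.
    return { vertexai: true, project, location };
  }
  throw new Error("Set GEMINI_API_KEY, or both GCP_PROJECT and GCP_LOCATION");
}
```

The result could then be passed along explicitly, for example as new GeminiClient(resolveGeminiOptions(process.env)), which keeps the decision about where credentials come from in application code.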

Completion Models

// Default: gemini-2.5-flash
const model = client.completionModel();

// Specific model
const model = client.completionModel("gemini-3-pro");

Completion models support both streaming and non-streaming requests through the standard StreamingCompletionModel interface.

Embedding Models

const embeddingModel = client.embeddingModel("gemini-embedding-001");

const embeddings = await embeddingModel.embedTexts(["hello", "world"]);

Option         Default         Purpose
taskType       --              Embedding task type (e.g., "RETRIEVAL_QUERY", "RETRIEVAL_DOCUMENT")
maxBatchSize   model default   Texts per API call
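
The maxBatchSize option caps how many texts go into one API call. The splitting this implies can be sketched as follows; chunkTexts is a hypothetical helper written for illustration, not the package's internal implementation.

```typescript
// Hypothetical helper: splits texts into consecutive batches of at most maxBatchSize items.
function chunkTexts(texts: readonly string[], maxBatchSize: number): string[][] {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new RangeError("maxBatchSize must be a positive integer");
  }
  const batches: string[][] = [];
  for (let start = 0; start < texts.length; start += maxBatchSize) {
    batches.push(texts.slice(start, start + maxBatchSize));
  }
  return batches;
}

// Five texts with a batch size of 2 would need three API calls.
const batches = chunkTexts(["a", "b", "c", "d", "e"], 2);
console.log(batches.length); // 3
```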

Task Types

type GeminiEmbeddingTaskType =
  | "RETRIEVAL_QUERY"
  | "RETRIEVAL_DOCUMENT"
  | "SEMANTIC_SIMILARITY"
  | "CLASSIFICATION"
  | "CLUSTERING"
  | "QUESTION_ANSWERING"
  | "FACT_VERIFICATION";
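
Task types matter most when embeddings are compared. A typical retrieval setup embeds stored passages with "RETRIEVAL_DOCUMENT" and the user's question with "RETRIEVAL_QUERY", then ranks the passages by similarity. Here is a sketch of the ranking step, assuming each embedding is available as a plain number[] (the exact return shape of embedTexts is not shown in this document) and using made-up vectors in place of real model output.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction, so treat its similarity as 0.
  return denominator === 0 ? 0 : dot / denominator;
}

// Returns the index of the document vector most similar to the query, or -1 if there are none.
function mostSimilarIndex(
  query: readonly number[],
  documents: readonly (readonly number[])[],
): number {
  let bestIndex = -1;
  let bestScore = -Infinity;
  for (let i = 0; i < documents.length; i++) {
    const doc = documents[i];
    if (!doc) continue;
    const score = cosineSimilarity(query, doc);
    if (score > bestScore) {
      bestScore = score;
      bestIndex = i;
    }
  }
  return bestIndex;
}

// Made-up 3-dimensional vectors standing in for real embeddings.
const queryVector = [1, 0, 0];
const documentVectors = [
  [0, 1, 0],
  [0.9, 0.1, 0],
  [0, 0, 1],
];
console.log(mostSimilarIndex(queryVector, documentVectors)); // 1
```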

Image Generation

Native Gemini Image Generation

const imageModel = client.imageGenerationModel("gemini-2.5-flash-image");

const result = await imageModel.generate({
  prompt: "A diagram of an agent runtime",
});

// result.images[0] is a Uint8Array containing PNG data

Model constants: GEMINI_2_5_FLASH_IMAGE, GEMINI_3_PRO_IMAGE_PREVIEW.
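
Since the generated image arrives as a Uint8Array of PNG data, a quick sanity check before writing the bytes anywhere is to look for the fixed eight-byte PNG signature. A small sketch; isPng is a hypothetical helper, not part of the package.

```typescript
// The fixed eight-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Hypothetical helper: true when the bytes begin with the PNG signature.
function isPng(bytes: Uint8Array): boolean {
  if (bytes.length < PNG_SIGNATURE.length) {
    return false;
  }
  return PNG_SIGNATURE.every((value, index) => bytes[index] === value);
}
```

In Node.js, bytes that pass the check could then be saved with writeFile from node:fs/promises, which accepts a Uint8Array directly.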

Imagen Generation

const imagenModel = client.imagenGenerationModel("imagen-4-generate");

const result = await imagenModel.generate({
  prompt: "A watercolor painting of a mountain",
});

Model constant: IMAGEN_4_GENERATE.

Transcription

const transcriptionModel = client.transcriptionModel("gemini-2.5-flash");

const result = await transcriptionModel.transcribe({
  audio: audioBuffer,
});

console.log(result.text);

Transcription uses Gemini's multimodal input to process the audio.

Model Listing

const models = await client.listModels();
console.log(models.data.map((m) => m.id));

listModels() fetches the available Gemini models and returns them as a normalized ModelList.
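
Because the normalized list exposes its entries under data, each with an id field, picking models by name is a plain array operation. Below is a sketch with a locally declared shape (only data and id are taken from the snippet above; any other fields are unknown here) and hand-written sample data instead of a live response. modelIdsWithPrefix is a hypothetical helper.

```typescript
// Minimal local shape based on the usage above; the real ModelList may carry more fields.
interface ModelListLike {
  data: { id: string }[];
}

// Hypothetical helper: returns the sorted ids that start with the given prefix.
function modelIdsWithPrefix(list: ModelListLike, prefix: string): string[] {
  return list.data
    .map((model) => model.id)
    .filter((id) => id.startsWith(prefix))
    .sort();
}

// Sample data for illustration; not an actual API response.
const sample: ModelListLike = {
  data: [
    { id: "gemini-embedding-001" },
    { id: "imagen-4-generate" },
    { id: "gemini-2.5-flash" },
  ],
};
console.log(modelIdsWithPrefix(sample, "gemini-"));
// ["gemini-2.5-flash", "gemini-embedding-001"]
```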

Error Handling

  • The GeminiClient constructor throws when required credentials are missing.
  • listModels() throws ModelListingError when the provider request fails.
  • Model operations throw Google GenAI SDK errors directly, without wrapping them.
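
The cases above suggest that calling code may want to tell a listing failure apart from everything else. A rough sketch of such a check follows. The ModelListingError class here is a local stand-in so the example runs on its own (real code would import the class Anvia exports, whose import path is not shown in this document), and classifyFailure is a hypothetical helper.

```typescript
// Stand-in for illustration only; replace with the ModelListingError exported by Anvia.
class ModelListingError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ModelListingError";
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type FailureKind = "model-listing" | "other-error" | "non-error";

// Hypothetical helper: coarse classification of a caught value.
function classifyFailure(error: unknown): FailureKind {
  if (error instanceof ModelListingError) {
    return "model-listing";
  }
  if (error instanceof Error) {
    // Includes SDK errors thrown by model operations and constructor errors.
    return "other-error";
  }
  return "non-error";
}
```

Inside a catch block around client.listModels(), a "model-listing" result would indicate that the provider request itself failed, while "other-error" covers anything the SDK or the constructor threw.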