Anvia Docs

@anvia/fastembed runs embedding models locally on ONNX Runtime via the FastEmbed library. No API calls, no network latency, no per-token costs. Use it when you want embeddings computed entirely on your own machine or CI server.

Install

pnpm add @anvia/fastembed

FastEmbed (fastembed) is pulled in as a transitive dependency, so you do not need to install it separately. Model files are downloaded and cached on first use.

Quick Start

import { createFastEmbedEmbeddingModel } from "@anvia/fastembed";

const model = await createFastEmbedEmbeddingModel();

const embeddings = await model.embedTexts([
  "Anvia is a TypeScript AI agent framework.",
  "Agents can use tools and maintain conversation history.",
]);

console.log(embeddings[0].vector.length); // 384 for the default model (BAAI/bge-small-en-v1.5)

createFastEmbedEmbeddingModel() downloads and initializes the model on first run. Later runs load the model from the local cache instead of downloading it again.
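
A common next step is comparing two of the returned vectors. The sketch below computes cosine similarity; it assumes each `vector` is an array-like of numbers (as the `.length` access in the Quick Start suggests), and `cosineSimilarity` is a hypothetical helper written for illustration, not something exported by @anvia/fastembed.

```typescript
// Hypothetical helper, not part of @anvia/fastembed.
// Assumes both vectors are array-likes of numbers with the same length.
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; treat its similarity as 0.
  return denom === 0 ? 0 : dot / denom;
}
```

With the Quick Start result, `cosineSimilarity(embeddings[0].vector, embeddings[1].vector)` would return a score between -1 and 1, where higher values indicate texts that are closer in meaning.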

Configuration

type FastEmbedEmbeddingModelOptions = {
  model?: FastEmbedEmbeddingModelName;  // model name
  maxBatchSize?: number;                // texts per batch (default: 256)
  initOptions?: {
    executionProviders?: ExecutionProvider[];  // ONNX runtime providers
    maxLength?: number;                        // max token length
    cacheDir?: string;                         // model cache directory
    showDownloadProgress?: boolean;            // download progress bar
    modelName?: string;                        // custom model name
  };
};

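// Custom model with a smaller batch size
const baseModel = await createFastEmbedEmbeddingModel({
  model: "BAAI/bge-base-en-v1.5",
  maxBatchSize: 128,
});

// Custom cache directory with download progress shown
// (a distinct variable name avoids redeclaring `const` in the same scope)
const cachedModel = await createFastEmbedEmbeddingModel({
  initOptions: { cacheDir: "./models", showDownloadProgress: true },
});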
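
To make the effect of maxBatchSize concrete, the sketch below splits an input list into batches of at most that size. `splitIntoBatches` is a hypothetical helper for illustration only; how @anvia/fastembed batches internally is not documented here and may differ.

```typescript
// Hypothetical illustration of batching; the package's internals may differ.
function splitIntoBatches<T>(items: readonly T[], maxBatchSize: number): T[][] {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new RangeError("maxBatchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += maxBatchSize) {
    batches.push(items.slice(start, start + maxBatchSize));
  }
  return batches;
}

// 600 texts with the default batch size of 256 -> batches of 256, 256, and 88.
const texts = Array.from({ length: 600 }, (_, i) => `text ${i}`);
const batchSizes = splitIntoBatches(texts, 256).map((batch) => batch.length);
console.log(batchSizes); // [ 256, 256, 88 ]
```

In general, a smaller batch size lowers peak memory per call at the cost of more calls into the runtime, so it is worth tuning on constrained CI machines.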

Available Models

The default model is BAAI/bge-small-en-v1.5 (384 dimensions, fast).

Other options depend on the FastEmbed version. Check the FastEmbed documentation for the full list of supported models.

When to Use Local Embeddings

Scenario	Recommendation
Development and testing	FastEmbed (no API key needed)
CI/CD pipelines	FastEmbed (no network dependency)
Privacy-sensitive data	FastEmbed (data stays local)
High throughput production	Provider embeddings (dedicated infrastructure)
Large model quality	Provider embeddings (bigger models available)

Using with Vector Stores

import { createFastEmbedEmbeddingModel } from "@anvia/fastembed";
import { ChromaVectorStore } from "@anvia/chroma";

const model = await createFastEmbedEmbeddingModel();
const store = await ChromaVectorStore.connect({ collectionName: "docs" });

// Create an index for searching
const index = store.index(model);
const results = await index.search({ query: "How do agents work?", topK: 5 });

Error Handling

Throws if the model download fails (network issues on first run)
Throws if the embedding count does not match the input text count
Throws on invalid batch format from the FastEmbed runtime
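
Because a failed download on first run is usually transient, one option is to retry the create call a few times. The sketch below is a generic retry wrapper; `withRetry`, its parameters, and the linear backoff are assumptions made for illustration and are not provided by @anvia/fastembed.

```typescript
// Hypothetical retry helper; not part of @anvia/fastembed.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts = 3,
  delayMs = 1000,
): Promise<T> {
  if (!Number.isInteger(attempts) || attempts < 1) {
    throw new RangeError("attempts must be a positive integer");
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Wait a little longer after each failure (linear backoff).
        await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

Usage would look like `const model = await withRetry(() => createFastEmbedEmbeddingModel());`. The count-mismatch and batch-format errors point to a runtime problem rather than a transient one, so retrying them is unlikely to help.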

Related

Embeddings Guide for embedding concepts
FastEmbed Reference for full API types
@anvia/transformers for an alternative local embedding package