Anvia Docs

@anvia/transformers runs Hugging Face embedding models locally using Transformers.js. After the one-time model download, embedding requires no API calls, adds no network latency, and incurs no per-token costs. It supports a wide range of sentence transformer models from the Hugging Face Hub.

Install

pnpm add @anvia/transformers

@huggingface/transformers is installed as a transitive dependency, so you do not need to add it yourself. Models are downloaded and cached on first use.

Quick Start

import { createTransformersEmbeddingModel } from "@anvia/transformers";

const model = await createTransformersEmbeddingModel();

const embeddings = await model.embedTexts([
  "Anvia is a TypeScript AI agent framework.",
  "Agents can use tools and maintain conversation history.",
]);

console.log(embeddings[0].vector.length); // 384 (for all-MiniLM-L6-v2)

createTransformersEmbeddingModel() downloads and initializes the model on first run. Subsequent calls load the model from the cache instead of downloading it again.
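A common next step is comparing two of the returned vectors. The sketch below is plain TypeScript with no dependency on the package; `dot` and `cosineSimilarity` are hypothetical helpers, and the code assumes each `vector` is an array-like of numbers (the exact type is not stated here). When vectors are normalized, which is the default, the dot product alone already equals the cosine similarity.

```typescript
// Hypothetical helpers for comparing embeddings; not part of @anvia/transformers.
type Vec = ArrayLike<number>;

// Dot product of two vectors of equal length.
function dot(a: Vec, b: Vec): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine similarity for vectors that may not be normalized.
// Returns 0 when either vector has zero length.
function cosineSimilarity(a: Vec, b: Vec): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}
```

With the Quick Start result, this could be called as `cosineSimilarity(embeddings[0].vector, embeddings[1].vector)`.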

Configuration

type TransformersEmbeddingModelOptions = {
  model?: string;           // Hugging Face model name (default: "Xenova/all-MiniLM-L6-v2")
  pooling?: "mean" | "cls"; // pooling strategy (default: "mean")
  normalize?: boolean;      // normalize vectors (default: true)
  maxBatchSize?: number;    // texts per batch (default: 16)
};

// Custom model with a larger batch size
const mpnetModel = await createTransformersEmbeddingModel({
  model: "Xenova/all-mpnet-base-v2",
  maxBatchSize: 32,
});

// CLS pooling instead of mean, without normalization
const clsModel = await createTransformersEmbeddingModel({
  pooling: "cls",
  normalize: false,
});
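To make `maxBatchSize` concrete, here is a minimal sketch of how a list of texts could be split into batches of at most that size. `toBatches` is a hypothetical helper written for illustration; the package's internal batching may work differently.

```typescript
// Hypothetical helper illustrating batching; the package's internals may differ.
function toBatches<T>(items: readonly T[], maxBatchSize = 16): T[][] {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new RangeError("maxBatchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += maxBatchSize) {
    batches.push(items.slice(i, i + maxBatchSize));
  }
  return batches;
}

// 40 texts with a batch size of 16 produce two full batches and one partial batch.
const texts = Array.from({ length: 40 }, (_, i) => `text ${i}`);
const batchSizes = toBatches(texts, 16).map((batch) => batch.length);
console.log(batchSizes); // [16, 16, 8]
```

A larger batch size means fewer model invocations but more memory per invocation, which is the trade-off to keep in mind when changing the default.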

Pooling Strategies

| Strategy | Description |
| --- | --- |
| `mean` (default) | Average all token embeddings. Best for general-purpose similarity. |
| `cls` | Use the [CLS] token embedding. Some models are trained specifically for this. |
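The difference between the two strategies can be sketched on a small matrix of token embeddings. The functions below are illustrative only and are not the package's implementation: they assume the first row is the [CLS] token, as in BERT-style models, and they skip the attention-mask weighting that real mean pooling typically applies to ignore padding tokens.

```typescript
// Illustrative only: pooling over a [tokens x dims] matrix of token embeddings.
type TokenEmbeddings = number[][];

// "mean": average each dimension across all tokens.
function meanPool(tokens: TokenEmbeddings): number[] {
  if (tokens.length === 0) throw new Error("No tokens to pool");
  const dims = tokens[0].length;
  const sums = new Array<number>(dims).fill(0);
  for (const token of tokens) {
    for (let d = 0; d < dims; d++) sums[d] += token[d];
  }
  return sums.map((sum) => sum / tokens.length);
}

// "cls": take the first token's embedding (assumed to be [CLS]).
function clsPool(tokens: TokenEmbeddings): number[] {
  if (tokens.length === 0) throw new Error("No tokens to pool");
  return tokens[0].slice();
}

const tokens: TokenEmbeddings = [
  [1, 0], // [CLS]
  [3, 2],
  [2, 4],
];
console.log(meanPool(tokens)); // [2, 2]
console.log(clsPool(tokens)); // [1, 0]
```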

Model Selection

The default Xenova/all-MiniLM-L6-v2 is a good general-purpose model (384 dimensions, fast). For higher quality or more specialized needs:

Xenova/all-mpnet-base-v2 -- 768 dimensions, better accuracy
Xenova/multi-qa-MiniLM-L6-cos-v1 -- optimized for semantic search
Xenova/paraphrase-multilingual-MiniLM-L12-v2 -- multilingual support

Any model compatible with the feature-extraction pipeline from @huggingface/transformers will work.

FastEmbed vs Transformers.js

| Aspect | `@anvia/fastembed` | `@anvia/transformers` |
| --- | --- | --- |
| Runtime | ONNX Runtime | Transformers.js (ONNX in browser/Node) |
| Model library | FastEmbed curated list | Any HF `feature-extraction` model |
| Default model | BGE-small-en-v1.5 | all-MiniLM-L6-v2 |
| Browser support | No | Yes |
| Batch size default | 256 | 16 |

Choose FastEmbed for a curated, optimized experience. Choose Transformers.js when you need specific Hugging Face models or browser compatibility.

Using with Vector Stores

import { createTransformersEmbeddingModel } from "@anvia/transformers";
import { QdrantVectorStore } from "@anvia/qdrant";

const model = await createTransformersEmbeddingModel();
const store = await QdrantVectorStore.connect({
  collectionName: "docs",
  vectorSize: 384, // must match the model's output dimension (384 for the default model)
});

const index = store.index(model);
const results = await index.search({ query: "How do agents work?", topK: 5 });
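Because `vectorSize` has to agree with the model's output dimension (384 for the default model, 768 for Xenova/all-mpnet-base-v2), it can help to check a sample embedding before writing to a collection. `assertVectorSize` below is a hypothetical guard, not part of either package, and it assumes embeddings have the `{ vector }` shape shown in the Quick Start.

```typescript
// Hypothetical guard; not part of @anvia/transformers or @anvia/qdrant.
// Assumes each embedding exposes a `vector` that is an array-like of numbers.
function assertVectorSize(
  embeddings: ReadonlyArray<{ vector: ArrayLike<number> }>,
  vectorSize: number,
): void {
  embeddings.forEach((embedding, i) => {
    if (embedding.vector.length !== vectorSize) {
      throw new Error(
        `Embedding ${i} has ${embedding.vector.length} dimensions, expected ${vectorSize}`,
      );
    }
  });
}
```

One way to use it would be `assertVectorSize(await model.embedTexts(["probe"]), 384)` before connecting to the store, so a model swap that changes the dimension fails early with a clear message.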

Error Handling

Throws if the model download fails (network issues on first run)
Throws if the embedding count does not match the input text count
Throws on invalid vector format from the Transformers.js runtime
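Since the first-run download is the failure most likely to be transient, one option is to wrap model creation in a retry. `withRetry` is a hypothetical helper, not something the package provides. The error types thrown are not specified here, so this sketch retries on any error, including ones that a retry cannot fix.

```typescript
// Hypothetical retry helper; the package does not document one.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts = 3,
  delayMs = 1000,
): Promise<T> {
  if (attempts < 1) throw new RangeError("attempts must be at least 1");
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        // Linear backoff: wait longer after each failed attempt.
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  // All attempts failed: surface the most recent error to the caller.
  throw lastError;
}
```

Usage might look like `const model = await withRetry(() => createTransformersEmbeddingModel());`. The count-mismatch and invalid-vector errors indicate a runtime problem rather than a network one, so they are better surfaced directly than retried.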

See Also

Embeddings Guide for embedding concepts
Transformers Reference for full API types
@anvia/fastembed for an alternative local embedding package