@anvia/transformers
Local Transformers.js embedding models for Anvia.
@anvia/transformers runs Hugging Face embedding models locally using Transformers.js. After the initial model download, embedding requires no API calls, adds no network latency, and incurs no per-token costs. It supports a wide range of sentence-transformer models from the Hugging Face Hub.
Install
pnpm add @anvia/transformers
@huggingface/transformers is a transitive dependency. Models are downloaded and cached on first use.
Quick Start
import { createTransformersEmbeddingModel } from "@anvia/transformers";
const model = await createTransformersEmbeddingModel();
const embeddings = await model.embedTexts([
"Anvia is a TypeScript AI agent framework.",
"Agents can use tools and maintain conversation history.",
]);
console.log(embeddings[0].vector.length); // 384 (for all-MiniLM-L6-v2)

The create call downloads and initializes the model on first run. Subsequent calls use the cached model.
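To compare the two embeddings from the Quick Start, a cosine similarity function is usually enough. The sketch below assumes each embedding's vector is a plain array of numbers, as the `.vector.length` access above suggests. `cosineSimilarity` is a hypothetical helper for illustration, not part of the package.

```typescript
// Hypothetical helper (not part of @anvia/transformers):
// cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

With the values from the Quick Start this would be called as cosineSimilarity(embeddings[0].vector, embeddings[1].vector). When vectors are already unit length, the dot product alone gives the same result.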
Configuration
type TransformersEmbeddingModelOptions = {
model?: string; // Hugging Face model name (default: "Xenova/all-MiniLM-L6-v2")
pooling?: "mean" | "cls"; // pooling strategy (default: "mean")
normalize?: boolean; // normalize vectors (default: true)
maxBatchSize?: number; // texts per batch (default: 16)
};

// Custom model
const model = await createTransformersEmbeddingModel({
model: "Xenova/all-mpnet-base-v2",
maxBatchSize: 32,
});
// CLS pooling instead of mean, with normalization turned off
const clsModel = await createTransformersEmbeddingModel({
pooling: "cls",
normalize: false,
});

Pooling Strategies
| Strategy | Description |
|---|---|
| mean (default) | Average all token embeddings. Best for general-purpose similarity. |
| cls | Use the [CLS] token embedding. Some models are trained specifically for this. |
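To make the difference concrete, here is a minimal sketch of both strategies applied to a matrix of token embeddings, followed by the L2 normalization that a normalize option typically implies. It assumes one row per token with [CLS] at index 0. The function names are illustrative and are not the package's internals; real implementations usually also mask out padding tokens before averaging.

```typescript
// One row per token; the [CLS] token is assumed to be the first row.
type TokenEmbeddings = number[][];

// Mean pooling: average each dimension across all tokens.
function meanPool(tokens: TokenEmbeddings): number[] {
  if (tokens.length === 0) throw new Error("No token embeddings to pool");
  const dim = tokens[0].length;
  const sums = new Array<number>(dim).fill(0);
  for (const token of tokens) {
    for (let d = 0; d < dim; d++) sums[d] += token[d];
  }
  return sums.map((s) => s / tokens.length);
}

// CLS pooling: use the first token's embedding as the sentence vector.
function clsPool(tokens: TokenEmbeddings): number[] {
  if (tokens.length === 0) throw new Error("No token embeddings to pool");
  return [...tokens[0]];
}

// L2 normalization: scale a vector to unit length (zero vectors are returned as-is).
function l2Normalize(vector: number[]): number[] {
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? vector : vector.map((x) => x / norm);
}
```

For example, l2Normalize(meanPool(tokens)) corresponds to the default combination of mean pooling with normalization, while clsPool(tokens) alone corresponds to pooling: "cls" with normalize: false.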
Model Selection
The default Xenova/all-MiniLM-L6-v2 is a good general-purpose model (384 dimensions, fast). Alternatives for higher accuracy or more specific needs:
- Xenova/all-mpnet-base-v2: 768 dimensions, better accuracy
- Xenova/multi-qa-MiniLM-L6-cos-v1: optimized for semantic search
- Xenova/paraphrase-multilingual-MiniLM-L12-v2: multilingual support
Any model compatible with the feature-extraction pipeline from @huggingface/transformers will work.
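Switching models can change the vector dimension (384 for all-MiniLM-L6-v2, 768 for all-mpnet-base-v2), so anything downstream that stores or compares vectors has to agree on the size. A small guard like the one below can catch a mismatch early. `assertVectorSize` is a hypothetical helper for illustration, not something the package provides.

```typescript
// Hypothetical guard (not part of @anvia/transformers):
// verify that every vector has the dimension the rest of the pipeline expects.
function assertVectorSize(vectors: number[][], expected: number): void {
  vectors.forEach((vector, i) => {
    if (vector.length !== expected) {
      throw new Error(
        `Vector ${i} has ${vector.length} dimensions, expected ${expected}`,
      );
    }
  });
}
```

Calling it once right after the first embedding call, with the size the vector store was created with, is usually enough.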
FastEmbed vs Transformers.js
| Aspect | @anvia/fastembed | @anvia/transformers |
|---|---|---|
| Runtime | ONNX Runtime | Transformers.js (ONNX in browser/Node) |
| Model library | FastEmbed curated list | Any HF feature-extraction model |
| Default model | BGE-small-en-v1.5 | all-MiniLM-L6-v2 |
| Browser support | No | Yes |
| Batch size default | 256 | 16 |
Choose FastEmbed for a curated, optimized experience. Choose Transformers.js when you need specific Hugging Face models or browser compatibility.
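The batch size row refers to how many texts are embedded together in one pass. The package's own batching code is not shown in this document; as a rough sketch of the idea, splitting inputs into groups of at most maxBatchSize could look like the following, where `chunk` is a hypothetical helper.

```typescript
// Hypothetical helper: split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("Batch size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

With a batch size of 16, 40 texts are split into batches of 16, 16, and 8. Smaller batches lower peak memory use at the cost of more passes through the model.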
Using with Vector Stores
import { createTransformersEmbeddingModel } from "@anvia/transformers";
import { QdrantVectorStore } from "@anvia/qdrant";
const model = await createTransformersEmbeddingModel();
const store = await QdrantVectorStore.connect({
collectionName: "docs",
vectorSize: 384, // must match the model's output dimension (384 for all-MiniLM-L6-v2)
});
const index = store.index(model);
const results = await index.search({ query: "How do agents work?", topK: 5 });

Error Handling
- Throws if the model download fails (network issues on first run)
- Throws if the embedding count does not match the input text count
- Throws on invalid vector format from the Transformers.js runtime
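The last two checks can be pictured with a short sketch. This is an illustration of the kind of validation described above, assuming "invalid vector format" means anything other than an array of finite numbers. `validateEmbeddings` is a hypothetical function, and the package's actual checks and error messages may differ.

```typescript
// Hypothetical validation mirroring the count and format checks listed above.
function validateEmbeddings(texts: string[], vectors: unknown[]): number[][] {
  // One embedding is expected per input text.
  if (vectors.length !== texts.length) {
    throw new Error(
      `Expected ${texts.length} embeddings, received ${vectors.length}`,
    );
  }
  return vectors.map((vector, i) => {
    // Each embedding must be an array containing only finite numbers.
    const isNumericVector =
      Array.isArray(vector) &&
      vector.every((x) => typeof x === "number" && Number.isFinite(x));
    if (!isNumericVector) {
      throw new Error(`Embedding ${i} is not a numeric vector`);
    }
    return vector as number[];
  });
}
```

Because the first failure mode (a failed download) only occurs on first run, wrapping the create call in a try/catch and retrying or reporting a clear message is typically sufficient on the caller's side.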
Related
- Embeddings Guide for embedding concepts
- Transformers Reference for full API types
- @anvia/fastembed for an alternative local embedding package
