@anvia/fastembed
Local FastEmbed embedding models for Anvia.
@anvia/fastembed runs embedding models locally on ONNX Runtime via the FastEmbed library. After the one-time model download there are no API calls, no network latency, and no per-token costs. Use it when you want embeddings computed entirely on your machine or CI server.
Install
pnpm add @anvia/fastembed
FastEmbed (fastembed) is a transitive dependency. Models are downloaded and cached on first use.
Quick Start
import { createFastEmbedEmbeddingModel } from "@anvia/fastembed";
const model = await createFastEmbedEmbeddingModel();
const embeddings = await model.embedTexts([
"Anvia is a TypeScript AI agent framework.",
"Agents can use tools and maintain conversation history.",
]);
console.log(embeddings[0].vector.length); // 384 (for BGESmallENV15)
The create call downloads and initializes the model on first run. Subsequent calls use the cached model.
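A common next step after embedding is to compare two vectors. The sketch below is a plain cosine similarity helper and is not part of @anvia/fastembed; it assumes each `vector` from the Quick Start can be treated as an array of numbers, and the name `cosineSimilarity` is only illustrative.

```typescript
// Cosine similarity between two vectors of equal length.
// Returns a value in [-1, 1]; returns 0 if either vector has zero magnitude.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / Math.sqrt(normA * normB);
}
```

With the Quick Start result, a call would look like `cosineSimilarity(embeddings[0].vector, embeddings[1].vector)`, assuming `vector` is a `number[]` (convert with `Array.from` if it turns out to be a typed array).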
Configuration
type FastEmbedEmbeddingModelOptions = {
model?: FastEmbedEmbeddingModelName; // model name
maxBatchSize?: number; // texts per batch (default: 256)
initOptions?: {
executionProviders?: ExecutionProvider[]; // ONNX runtime providers
maxLength?: number; // max token length
cacheDir?: string; // model cache directory
showDownloadProgress?: boolean; // download progress bar
modelName?: string; // custom model name
};
};
// Custom model
const model = await createFastEmbedEmbeddingModel({
model: "BAAI/bge-base-en-v1.5",
maxBatchSize: 128,
});
// Custom cache directory
const cachedModel = await createFastEmbedEmbeddingModel({
initOptions: { cacheDir: "./models", showDownloadProgress: true },
});
Available Models
The default model is BAAI/bge-small-en-v1.5 (384 dimensions, fast).
Other options depend on the FastEmbed version. Check the FastEmbed documentation for the full list of supported models.
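Different models generally produce vectors of different sizes, and a vector index usually expects one fixed size, so it can help to check the dimension before storing anything. The helper below is a hypothetical sketch, not an API of this package; `assertDimensions` and its parameters are names introduced here for illustration.

```typescript
// Throws if any vector does not have the expected number of dimensions.
// `expected` would be 384 for the default model described above.
function assertDimensions(
  vectors: readonly (readonly number[])[],
  expected: number,
): void {
  vectors.forEach((vector, index) => {
    if (vector.length !== expected) {
      throw new Error(
        `Vector ${index} has ${vector.length} dimensions, expected ${expected}`,
      );
    }
  });
}
```

Running a check like this once after switching models catches a mismatch early, before it shows up as a failed insert or a confusing search result.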
When to Use Local Embeddings
| Scenario | Recommendation |
|---|---|
| Development and testing | FastEmbed (no API key needed) |
| CI/CD pipelines | FastEmbed (no network dependency) |
| Privacy-sensitive data | FastEmbed (data stays local) |
| High throughput production | Provider embeddings (dedicated infrastructure) |
| Large model quality | Provider embeddings (bigger models available) |
Using with Vector Stores
import { createFastEmbedEmbeddingModel } from "@anvia/fastembed";
import { ChromaVectorStore } from "@anvia/chroma";
const model = await createFastEmbedEmbeddingModel();
const store = await ChromaVectorStore.connect({ collectionName: "docs" });
// Create an index for searching
const index = store.index(model);
const results = await index.search({ query: "How do agents work?", topK: 5 });
Error Handling
- Throws if the model download fails (network issues on first run)
- Throws if the embedding count does not match the input text count
- Throws on invalid batch format from the FastEmbed runtime
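To make the count check and the `maxBatchSize` option more concrete, here is a minimal sketch of batched embedding with a count guard. It is an illustration under assumptions, not the package's actual implementation: `embedBatch` is a hypothetical stand-in for whatever runs the model, and `chunk` and `embedInBatches` are names made up for this example.

```typescript
// Hypothetical stand-in for the function that actually runs the model.
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

// Split a list into consecutive slices of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError(`Batch size must be a positive integer, got ${size}`);
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Embed texts batch by batch and verify one vector comes back per text.
async function embedInBatches(
  texts: readonly string[],
  embedBatch: EmbedBatch,
  maxBatchSize = 256,
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of chunk(texts, maxBatchSize)) {
    const result = await embedBatch(batch);
    if (result.length !== batch.length) {
      throw new Error(
        `Expected ${batch.length} embeddings, received ${result.length}`,
      );
    }
    vectors.push(...result);
  }
  return vectors;
}
```

Checking the count per batch means a mismatch is reported as soon as it happens, and the output order always matches the input order.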
Related
- Embeddings Guide for embedding concepts
- FastEmbed Reference for full API types
- @anvia/transformers for an alternative local embedding package
