Version: V1.1.0

Sentence Transformer

Sentence Transformer converts sentences, phrases, or short texts into high-dimensional vectors (embeddings). Semantically similar text maps to nearby vectors, so you can compare them using cosine similarity or Euclidean distance. The framework is built on pre-trained Transformer models (such as BERT and RoBERTa) and uses pooling to produce fixed-size sentence vectors. Typical use cases include semantic search, text clustering, sentence classification, information retrieval, and RAG.
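As a quick illustration, the helper below (a minimal sketch, not part of the library) computes cosine similarity between two embedding vectors; values closer to 1 indicate more semantically similar text.

// Minimal sketch: cosine similarity between two embedding vectors.
// Assumes both vectors have the same length and are non-zero.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}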

Dependencies and authentication

Install the @seekdb/sentence-transformer package. The library handles model loading and encoding. Models run locally; the first time you use a given model, its weights are downloaded (for example, from Hugging Face Hub) and cached. No API key is required.
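For example, assuming you use npm as your package manager:

npm install @seekdb/sentence-transformer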

Example: create a Sentence Transformer embedding function

Call the SentenceTransformerEmbeddingFunction constructor with a config object. The Xenova/all-MiniLM-L6-v2 model is a lightweight general-purpose option.

import { SentenceTransformerEmbeddingFunction } from "@seekdb/sentence-transformer";

const ef = new SentenceTransformerEmbeddingFunction({
  modelName: "Xenova/all-MiniLM-L6-v2",
  device: "cpu",
  normalizeEmbeddings: false,
});

Configuration options:

  • modelName: Sentence Transformer model name (default: Xenova/all-MiniLM-L6-v2).
  • device: Device to run the model on (default: "cpu").
  • normalizeEmbeddings: Whether to normalize output vectors (default: false).
  • kwargs: Optional pipeline options; must be JSON-serializable.
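
Once the embedding function is constructed, you can pass it a batch of texts to obtain embeddings. The sketch below assumes an async generate method that takes an array of strings and resolves to an array of number arrays; check the package's API reference for the exact method name.

const texts = ["A fast brown fox", "A quick brown fox jumps"];

// Hypothetical call: assumes the embedding function exposes an async
// generate(texts: string[]) method returning number[][].
const embeddings = await ef.generate(texts);

console.log(embeddings.length);    // 2
console.log(embeddings[0].length); // embedding dimension (384 for all-MiniLM-L6-v2)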