Sentence Transformer
Sentence Transformer converts sentences, phrases, or short texts into high-dimensional vectors (embeddings). Semantically similar text maps to nearby vectors, so you can compare them using cosine similarity or Euclidean distance. The framework is built on pre-trained Transformer models (such as BERT and RoBERTa) and uses pooling to produce fixed-size sentence vectors. Typical use cases include semantic search, text clustering, sentence classification, information retrieval, and RAG.
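To illustrate the comparison step, here is a minimal cosine-similarity helper in plain TypeScript (independent of the library); a score near 1 means the two embeddings point in nearly the same direction, i.e. the texts are semantically similar:

```typescript
// Cosine similarity between two embedding vectors: 1 means identical
// direction (most similar), 0 means orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```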
Dependencies and authentication
Install the @seekdb/sentence-transformer package. The library handles model loading and encoding. Models run locally; the first time you use a given model, its weights are downloaded (for example, from Hugging Face Hub) and cached. No API key is required.
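A typical install command, assuming the package is published to the npm registry (adjust for your package manager):

```shell
npm install @seekdb/sentence-transformer
```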
Example: create a Sentence Transformer embedding function
Call the SentenceTransformerEmbeddingFunction constructor with a config object. The Xenova/all-MiniLM-L6-v2 model is a lightweight general-purpose option.
import { SentenceTransformerEmbeddingFunction } from "@seekdb/sentence-transformer";
const ef = new SentenceTransformerEmbeddingFunction({
  modelName: "Xenova/all-MiniLM-L6-v2",
  device: "cpu",
  normalizeEmbeddings: false,
});
Configurations:
- modelName: Sentence Transformer model name (default: Xenova/all-MiniLM-L6-v2).
- device: Device to run the model on (default: "cpu").
- normalizeEmbeddings: Whether to normalize output vectors (default: false).
- kwargs: Optional pipeline options; must be JSON-serializable.
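Setting normalizeEmbeddings to true scales each output vector to unit length, so cosine similarity reduces to a plain dot product. A sketch of that operation on a bare number[] vector (illustrative only, not the library's internal code):

```typescript
// Scale a vector to unit length (L2 norm = 1). With unit-length vectors,
// cosine similarity between two embeddings is just their dot product.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v : v.map((x) => x / norm);
}
```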