Voyage AI
Voyage AI provides embedding models for semantic search and related tasks. seekdb's VoyageaiEmbeddingFunction lets you call Voyage AI from pyseekdb so your collections can use Voyage embeddings for RAG, agent memory, and other applications that depend on semantic understanding.
Use of the Voyage AI service is subject to Voyage AI's pricing and may incur fees. Before proceeding, check their official website or documentation to confirm and accept their pricing terms. If you do not agree, do not proceed.
Dependencies and authentication
- Install the pyseekdb package.
- A valid Voyage AI API key for authentication.
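Because a missing key typically only surfaces when the first embedding request is made, it can help to check for it at startup. The following is a minimal sketch using only the Python standard library; the helper name require_api_key is hypothetical and not part of pyseekdb.

```python
import os


def require_api_key(env_var: str = "VOYAGE_API_KEY") -> str:
    """Return the API key stored in env_var, or raise a clear error if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to a valid Voyage AI API key."
        )
    return key
```

Calling require_api_key() before creating the embedding function turns a missing key into an immediate, readable error.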
Example: create a Voyage AI embedding function
Import and initialize VoyageaiEmbeddingFunction. Configuration is often provided via environment variables for easier deployment.
- Basic usage

  Use the default environment variable VOYAGE_API_KEY and specify the model. The voyage-4-large model supports long context and high-accuracy retrieval.

      from pyseekdb.utils.embedding_functions import VoyageaiEmbeddingFunction

      ef = VoyageaiEmbeddingFunction(
          model_name="voyage-4-large"
      )

- RAG-oriented usage

  For RAG, set input_type to distinguish document indexing from query encoding.

      from pyseekdb.utils.embedding_functions import VoyageaiEmbeddingFunction

      ef = VoyageaiEmbeddingFunction(
          model_name="voyage-4-large",
          input_type="document"  # "document" or "query"; use "document" for collection documents
      )

- Custom configuration

  With models such as voyage-4-large, you can reduce the output dimension (for example, from 1024 to 512) to lower storage and speed up search while preserving accuracy.

      from pyseekdb.utils.embedding_functions import VoyageaiEmbeddingFunction

      ef = VoyageaiEmbeddingFunction(
          model_name="voyage-4-large",
          output_dimension=512  # Reduce dimension for storage and compute efficiency
      )
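To put the storage effect of reducing the output dimension from 1024 to 512 in numbers, here is a back-of-the-envelope calculation. It assumes each vector component is stored as a 32-bit float and counts only raw vector data, not index overhead; the collection size of one million vectors is purely illustrative.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each component is stored as a 32-bit float


def vector_storage_bytes(num_vectors: int, dimension: int,
                         bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Raw size of the stored vectors, ignoring index and metadata overhead."""
    return num_vectors * dimension * bytes_per_value


full = vector_storage_bytes(1_000_000, 1024)    # 4_096_000_000 bytes (about 4.1 GB)
reduced = vector_storage_bytes(1_000_000, 512)  # 2_048_000_000 bytes (about 2.0 GB)
print(f"1024 dims: {full / 1e9:.3f} GB, 512 dims: {reduced / 1e9:.3f} GB")
```

Under these assumptions, halving the dimension halves the raw vector storage. Retrieval quality at the reduced dimension is worth checking on your own data.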
Parameters:
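- model_name: Voyage AI model name (for example, voyage-3-large, voyage-4-large).
- api_key_env: Environment variable name for the API key (default: VOYAGE_API_KEY).
- input_type: Optional hint for retrieval: use "document" for documents being indexed and "query" for query texts.
- output_dimension: Optional output embedding dimension (for example, 512), for models that support reduced dimensions, as shown in the custom configuration example.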
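The parameters above can be combined. The sketch below builds one embedding function for indexing and one for queries from the same settings. The helper names are hypothetical, and it is an assumption that api_key_env, input_type, and output_dimension can all be passed in the same call; verify this against your installed pyseekdb version.

```python
from typing import Any, Dict, Optional


def build_voyage_kwargs(
    input_type: str,
    model_name: str = "voyage-4-large",
    api_key_env: str = "VOYAGE_API_KEY",
    output_dimension: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for VoyageaiEmbeddingFunction (hypothetical helper)."""
    if input_type not in ("document", "query"):
        raise ValueError('input_type must be "document" or "query"')
    kwargs: Dict[str, Any] = {
        "model_name": model_name,
        "api_key_env": api_key_env,
        "input_type": input_type,
    }
    # Only pass output_dimension when a reduced dimension was requested.
    if output_dimension is not None:
        kwargs["output_dimension"] = output_dimension
    return kwargs


def make_document_and_query_functions(api_key_env: str = "VOYAGE_API_KEY"):
    """Create one embedding function for indexing and one for queries."""
    # Imported here so build_voyage_kwargs can be used without pyseekdb installed.
    from pyseekdb.utils.embedding_functions import VoyageaiEmbeddingFunction

    doc_ef = VoyageaiEmbeddingFunction(
        **build_voyage_kwargs("document", api_key_env=api_key_env)
    )
    query_ef = VoyageaiEmbeddingFunction(
        **build_voyage_kwargs("query", api_key_env=api_key_env)
    )
    return doc_ef, query_ef
```

Keeping the shared settings in one place makes it harder for the document and query sides to drift apart, for example by using different models or dimensions.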