Version: V1.1.0

Create a custom embedding function

You can create a custom embedding function by implementing the EmbeddingFunction interface. To do so:

  • Implement the name property, generate method, getConfig method, and buildFromConfig static method.
  • Optionally implement other interface methods.
  • Use registerEmbeddingFunction to register the custom embedding function for serialization/restoration.

Prerequisites

A custom embedding function must satisfy the following requirements:

  • Implement the generate method:

    • Accepts: string[] (a list of documents).
    • Returns: Promise<number[][]> (a list of vectors).
    • Each vector must have the same dimension.
  • Set the name property:

    • The unique name of the embeddingFunction.
    • Type: string.
  • Implement the getConfig method:

    • Returns the configuration of the current embeddingFunction instance, which is used to restore the instance later. This is an internal SDK method and does not need to be called directly.
    • Returns: Record<string, any>
  • Implement the buildFromConfig static method:

    • Creates a new instance of the current embeddingFunction from the given configuration. This is an internal SDK method and does not need to be called directly.
    • Accepts: Record<string, any> (the configuration required to instantiate the embeddingFunction).
    • Returns: EmbeddingFunction
  • Handle edge cases:

    • Empty input should return an empty list.
    • All vectors in the output must have the same dimension.
  • dimension(): Returns the dimension of the current instance. This is an internal SDK method and does not need to be called directly.
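
The requirements above can be sketched with a toy implementation that needs no external model. This is only an illustration: EmbeddingFunctionLike is a local stand-in for the SDK's EmbeddingFunction interface (declared here so the sketch compiles on its own), and ToyHashEmbeddingFunction, ToyHashConfig, and embedText are hypothetical names. The character-bucketing logic is a placeholder, not a meaningful semantic embedding.

```typescript
// Local stand-in for the SDK's EmbeddingFunction interface (an assumption,
// declared only so this sketch compiles without the SDK installed).
interface EmbeddingFunctionLike {
  readonly name: string;
  generate(texts: string[]): Promise<number[][]>;
  getConfig(): Record<string, any>;
}

// Hypothetical configuration type for this toy function.
type ToyHashConfig = { dimension: number };

// Map one text to a fixed-length vector by bucketing character codes.
// A toy placeholder, not a real embedding model.
function embedText(text: string, dimension: number): number[] {
  const vector = new Array<number>(dimension).fill(0);
  for (let i = 0; i < text.length; i++) {
    vector[text.charCodeAt(i) % dimension] += 1;
  }
  // L2-normalize; an all-zero vector (empty text) is returned unchanged.
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vector : vector.map((v) => v / norm);
}

class ToyHashEmbeddingFunction implements EmbeddingFunctionLike {
  readonly name = "toy_hash_embedding";
  readonly dimension: number;

  constructor(config: ToyHashConfig) {
    if (!Number.isInteger(config.dimension) || config.dimension <= 0) {
      throw new Error("dimension must be a positive integer");
    }
    this.dimension = config.dimension;
  }

  // Every output vector has length `dimension`; empty input gives [].
  async generate(texts: string[]): Promise<number[][]> {
    return texts.map((text) => embedText(text, this.dimension));
  }

  // Everything needed to rebuild an equivalent instance.
  getConfig(): ToyHashConfig {
    return { dimension: this.dimension };
  }

  static buildFromConfig(config: ToyHashConfig): ToyHashEmbeddingFunction {
    return new ToyHashEmbeddingFunction(config);
  }
}
```

Passing the result of getConfig to buildFromConfig should yield an instance that behaves the same as the original, which is what makes serialization and restoration possible.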

Example 1: Sentence Transformer Custom Embedding Function

// Assumption: the EmbeddingFunction type is exported by the same package.
import { registerEmbeddingFunction, type EmbeddingFunction } from "seekdb";

interface MyCustomEmbeddingConfig {
  apiKeyEnv: string;
}

class MyCustomEmbeddingFunction implements EmbeddingFunction {
  readonly name = "my_custom_embedding";
  private apiKeyEnv: string;
  dimension: number;

  constructor(config: MyCustomEmbeddingConfig) {
    this.apiKeyEnv = config.apiKeyEnv;
    // Must match the dimension of the vectors that generate() returns
    this.dimension = 384;
  }

  async generate(texts: string[]): Promise<number[][]> {
    // Implement your vector generation logic
    // For example, call an external API or use a local model
    const embeddings: number[][] = [];

    /**
     * Fill in your vector generation code here
     */

    return embeddings;
  }

  // Configuration used to restore this instance
  getConfig(): MyCustomEmbeddingConfig {
    return {
      apiKeyEnv: this.apiKeyEnv,
    };
  }

  // Rebuilds an instance from a configuration returned by getConfig()
  static buildFromConfig(config: MyCustomEmbeddingConfig): EmbeddingFunction {
    return new MyCustomEmbeddingFunction(config);
  }
}

// Register your custom embedding function
registerEmbeddingFunction("my_custom_embedding", MyCustomEmbeddingFunction);

// Use the custom embedding function
const customEmbed = new MyCustomEmbeddingFunction({
  apiKeyEnv: "MY_CUSTOM_API_KEY_ENV",
});

// `client` is assumed to be an already-initialized seekdb client
const collection = await client.createCollection({
  name: "custom_embed_collection",
  configuration: {
    dimension: 384, // same as the embedding function's dimension
    distance: "cosine",
  },
  embeddingFunction: customEmbed,
});
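
Before registering a custom function, it can help to check its output against the requirements listed earlier. The helper below is a minimal sketch of such a check, not part of the SDK: validateEmbeddings is a hypothetical name, and the rule that generate returns exactly one vector per input text is an assumption inferred from the method's input and output types.

```typescript
// Hypothetical helper (not an SDK function): checks a generate() result
// and throws a descriptive error on the first violation it finds.
function validateEmbeddings(
  texts: string[],
  vectors: number[][],
  expectedDimension: number,
): void {
  // Assumption: one vector per input text, so empty input means empty output.
  if (vectors.length !== texts.length) {
    throw new Error(
      `expected ${texts.length} vectors, got ${vectors.length}`,
    );
  }
  vectors.forEach((vector, index) => {
    // Every vector must have the same dimension.
    if (vector.length !== expectedDimension) {
      throw new Error(
        `vector ${index} has dimension ${vector.length}, expected ${expectedDimension}`,
      );
    }
    // Reject NaN and Infinity, which usually indicate an upstream failure.
    if (!vector.every((value) => Number.isFinite(value))) {
      throw new Error(`vector ${index} contains a non-finite value`);
    }
  });
}
```

With the example above, a check could take the form validateEmbeddings(texts, await customEmbed.generate(texts), 384), where 384 matches the dimension set in both the constructor and the collection configuration.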

If you need to use other built-in functions or custom functions, you can refer to the following documentation to create and use custom functions: