Create a custom embedding function
You can create a custom embedding function by implementing the EmbeddingFunction interface. A custom embedding function must do the following:
- Implement the name property, the generate method, the getConfig method, and the buildFromConfig static method. Other interface methods are optional.
- Use registerEmbeddingFunction to register the custom embedding function so that it can be serialized and restored.
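To illustrate why registration matters for serialization and restoration, the following sketch uses a small in-memory registry. Every name in it (LocalEmbeddingFunction, localRegistry, registerLocal, serializeLocal, restoreLocal, ConstantEmbedding) is a hypothetical stand-in invented for this illustration. None of them is part of the SDK, and the SDK's actual internals may differ.

```typescript
// Hypothetical stand-in for the SDK interface; only the required members.
interface LocalEmbeddingFunction {
  readonly name: string;
  generate(texts: string[]): Promise<number[][]>;
  getConfig(): Record<string, any>;
}

// A registrable class can rebuild an instance from a saved configuration.
interface LocalEmbeddingFunctionClass {
  buildFromConfig(config: Record<string, any>): LocalEmbeddingFunction;
}

// Hypothetical registry: unique name -> class.
const localRegistry = new Map<string, LocalEmbeddingFunctionClass>();

function registerLocal(name: string, cls: LocalEmbeddingFunctionClass): void {
  if (localRegistry.has(name)) {
    throw new Error(`Embedding function "${name}" is already registered`);
  }
  localRegistry.set(name, cls);
}

// Serialize an instance as plain data: its name plus its configuration.
function serializeLocal(fn: LocalEmbeddingFunction): string {
  return JSON.stringify({ name: fn.name, config: fn.getConfig() });
}

// Restore: look the class up by name, then rebuild it from the configuration.
function restoreLocal(serialized: string): LocalEmbeddingFunction {
  const { name, config } = JSON.parse(serialized);
  const cls = localRegistry.get(name);
  if (!cls) {
    throw new Error(`Embedding function "${name}" is not registered`);
  }
  return cls.buildFromConfig(config);
}

// Minimal class used only to exercise the round trip.
class ConstantEmbedding implements LocalEmbeddingFunction {
  readonly name = "constant_embedding";
  private readonly size: number;

  constructor(size: number) {
    this.size = size;
  }

  async generate(texts: string[]): Promise<number[][]> {
    // One zero vector of length `size` per input text.
    return texts.map(() => new Array<number>(this.size).fill(0));
  }

  getConfig(): Record<string, any> {
    return { size: this.size };
  }

  static buildFromConfig(config: Record<string, any>): ConstantEmbedding {
    return new ConstantEmbedding(config.size);
  }
}

registerLocal("constant_embedding", ConstantEmbedding);
const restored = restoreLocal(serializeLocal(new ConstantEmbedding(4)));
```

In this sketch, restoring an unregistered name fails, which is the situation that registering the class up front is meant to avoid.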
Prerequisites
Before creating a custom embedding function, make sure the following requirements are met:
- Implement the generate method:
  - Accepts: string[] (a list of documents).
  - Returns: Promise<number[][]> (a list of vectors).
  - Each vector must have the same dimension.
- Set the name property:
  - The unique name of the embedding function.
  - Type: string.
- Implement the getConfig method:
  - Returns the configuration of the current embedding function instance, which is used to restore the instance. This is an internal SDK method and does not need to be called directly.
  - Returns: Record<string, any>
- Implement the buildFromConfig static method:
  - Creates a new instance of the embedding function from the given configuration. This is an internal SDK method and does not need to be called directly.
  - Accepts: Record<string, any> (the configuration required to instantiate the embedding function).
  - Returns: EmbeddingFunction
- Handle edge cases:
  - Empty input should return an empty list.
  - All vectors in the output must have the same dimension.
- dimension(): Returns the dimension of the current instance. This is an internal SDK method and does not need to be called directly.
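The requirements above can be sketched as a small, self-contained class. It declares a local stand-in interface (EmbeddingFunctionLike, a hypothetical name) instead of importing the SDK's EmbeddingFunction, and it uses a toy character-bucketing scheme in place of a real model, so treat it as an illustration of the contract rather than a useful embedding.

```typescript
// Hypothetical stand-in for the SDK interface; only the required members.
interface EmbeddingFunctionLike {
  readonly name: string;
  generate(texts: string[]): Promise<number[][]>;
  getConfig(): Record<string, any>;
}

class HashEmbeddingFunction implements EmbeddingFunctionLike {
  readonly name = "hash_embedding";
  private readonly dim: number;

  constructor(dim: number = 8) {
    if (!Number.isInteger(dim) || dim <= 0) {
      throw new Error("dim must be a positive integer");
    }
    this.dim = dim;
  }

  // Toy embedding: count character codes into `dim` buckets, then L2-normalize.
  private embedOne(text: string): number[] {
    const vec = new Array<number>(this.dim).fill(0);
    for (let i = 0; i < text.length; i++) {
      vec[text.charCodeAt(i) % this.dim] += 1;
    }
    const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
    // An empty string produces an all-zero vector, which is returned as is.
    return norm === 0 ? vec : vec.map((x) => x / norm);
  }

  async generate(texts: string[]): Promise<number[][]> {
    // Empty input yields an empty list; every vector has length `dim`.
    return texts.map((t) => this.embedOne(t));
  }

  getConfig(): Record<string, any> {
    return { dim: this.dim };
  }

  static buildFromConfig(config: Record<string, any>): HashEmbeddingFunction {
    return new HashEmbeddingFunction(config.dim);
  }
}
```

Because getConfig returns everything the constructor needs, buildFromConfig can recreate an equivalent instance from that configuration alone.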
Example 1: Sentence Transformer Custom Embedding Function
import { registerEmbeddingFunction } from "seekdb";
// Assumption: the EmbeddingFunction interface is exported by the same package.
import type { EmbeddingFunction } from "seekdb";

interface MyCustomEmbeddingConfig {
  apiKeyEnv: string;
}

class MyCustomEmbeddingFunction implements EmbeddingFunction {
  readonly name = "my_custom_embedding";
  private apiKeyEnv: string;
  dimension: number;

  constructor(config: MyCustomEmbeddingConfig) {
    this.apiKeyEnv = config.apiKeyEnv;
    this.dimension = 384;
  }

  async generate(texts: string[]): Promise<number[][]> {
    // Implement your vector generation logic
    // For example, call an external API or use a local model
    const embeddings: number[][] = [];
    /**
     * Fill in your vector generation code here.
     * Push one vector of length this.dimension per input text.
     */
    return embeddings;
  }

  // Returns the configuration needed to restore this instance
  getConfig(): MyCustomEmbeddingConfig {
    return {
      apiKeyEnv: this.apiKeyEnv,
    };
  }

  // Rebuilds an instance from a configuration returned by getConfig
  static buildFromConfig(config: MyCustomEmbeddingConfig): EmbeddingFunction {
    return new MyCustomEmbeddingFunction(config);
  }
}

// Register your custom embedding function
registerEmbeddingFunction("my_custom_embedding", MyCustomEmbeddingFunction);

// Use the custom embedding function
const customEmbed = new MyCustomEmbeddingFunction({
  apiKeyEnv: "MY_CUSTOM_API_KEY_ENV",
});

// `client` is assumed to be an already initialized seekdb client
const collection = await client.createCollection({
  name: "custom_embed_collection",
  configuration: {
    dimension: 384,
    distance: "cosine",
  },
  embeddingFunction: customEmbed,
});
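Before attaching a custom embedding function to a collection, it can help to check its output against the requirements listed in the prerequisites. The helper below, checkEmbeddings, is a hypothetical utility and not part of the SDK. It assumes only that generate resolves to number[][], and that the expected dimension (384 in the example above) is the one configured for the collection.

```typescript
// Hypothetical helper: returns a list of problems, empty when the output looks valid.
function checkEmbeddings(
  inputCount: number,
  embeddings: number[][],
  expectedDimension: number
): string[] {
  const problems: string[] = [];

  // One vector per input document; empty input must give an empty list.
  if (embeddings.length !== inputCount) {
    problems.push(`expected ${inputCount} vectors, got ${embeddings.length}`);
  }

  embeddings.forEach((vec, i) => {
    // Every vector must have the same, expected dimension.
    if (vec.length !== expectedDimension) {
      problems.push(
        `vector ${i} has dimension ${vec.length}, expected ${expectedDimension}`
      );
    }
    // NaN or Infinity usually points at a bug in the generation logic.
    if (!vec.every((x) => Number.isFinite(x))) {
      problems.push(`vector ${i} contains a non-finite value`);
    }
  });

  return problems;
}
```

A typical call would be checkEmbeddings(texts.length, await customEmbed.generate(texts), 384), logging or throwing if the returned list is not empty.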
Related operations
To use other built-in or custom functions, refer to the following documentation: