Get started with SeekDB AI Function Service
This tutorial will guide you through the process of getting started with SeekDB AI Function Service, helping you understand how to leverage AI capabilities for practical applications and experience the power of AI-native databases within minutes.
Overview
AI functions integrate AI model capabilities directly into data processing within the database using SQL expressions. This feature significantly simplifies data extraction, analysis, summarization, and storage operations using large AI models, making it a crucial new feature in the database and data warehouse domain. seekdb provides comprehensive AI model and endpoint management through the DBMS_AI_SERVICE package, includes built-in AI function expressions, and supports monitoring AI model calls via views. You can directly call AI models in SQL without additional coding, and experience core functions like AI_COMPLETE, AI_EMBED, AI_RERANK, and AI_PROMPT within minutes:
AI_EMBED: Converts text data into vector data by calling an embedding model.AI_COMPLETE: Processes prompts and data using a specified large text generation model and parses the results.AI_PROMPT: Organizes prompt templates and dynamic data into JSON format, which can be directly used in theAI_COMPLETEfunction by replacing thepromptparameter.AI_RERANK: Sorts text based on similarity using a reranking model.
This feature is applicable to scenarios such as text generation, text conversion, and text reranking.
Prerequisites
-
Contact the administrator to obtain the corresponding database connection string and execute the following command to connect to the database:
# host: the IP address for connecting to the seekdb database.
# port: the port for connecting to the seekdb database.
# database_name: the name of the database to be accessed.
# user_name: the username of the database.
# password: the password of the database.
mysql -h$host -P$port -u$user_name -p$password -D$database_name -
Ensure that you have the AI function call permission. Each example provides complete model and endpoint registration information, which you can directly copy and use.
Step 1: Use AI_EMBED to generate vectors
The AI_EMBED function can convert text into vectors for vector search. This is the foundational step in vector search, where text data is transformed into high-dimensional vector representations to facilitate similarity calculations.
Register an embedding model and endpoint
- Aliyun
- Aliyun DashScope
- SiliconFlow
- Tencent Hunyuan
This example demonstrates how to register an embedding model and endpoint for Alibaba Cloud (OpenAI-compatible format).
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings",
-- Replace with your actual access key
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "aliyun-openAI"
}');
This example demonstrates how to register an embedding model and endpoint for Alibaba Cloud DashScope (non-OpenAI-compatible format).
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://dashscope.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding",
-- Replace with your actual access key
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "aliyun-dashscope"
}');
This example demonstrates how to register an embedding model and endpoint for SiliconFlow.
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://api.siliconflow.cn/v1/embeddings",
-- Replace with your actual access key
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
This example demonstrates how to register an embedding model and endpoint for Tencent Hunyuan (OpenAI-compatible format).
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://api.hunyuan.cloud.tencent.com/v1/embeddings",
-- Replace with your actual access key
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "hunyuan-openAI"
}');
Try embedding single-line data
SELECT AI_EMBED("ob_embed", "Hello world") AS embedding;
The expected result is a vector array, such as [0.1, 0.2, 0.3]. This allows you to convert text in the table into vectors for subsequent vector search operations.
Step 2: Use AI_COMPLETE and AI_PROMPT to generate text
You can directly call a large language model in SQL to generate, translate, and analyze text. The AI_PROMPT function organizes a prompt template and dynamic data into a JSON format, which you can use to replace the prompt parameter of the AI_COMPLETE function.
Register a text generation model and endpoint
- Aliyun
- Aliyun DashScope
- DeepSeek
- SiliconFlow
- Tencent Hunyuan
This example shows how to register a text generation model and endpoint for Aliyun (OpenAI-compatible format).
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
-- Replace this value with your actual access_key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "aliyun-openAI"
}');
This example shows how to register a text generation model and endpoint for Aliyun DashScope (non-OpenAI-compatible format).
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation",
-- Replace this value with your actual access_key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "aliyun-dashscope"
}');
This example shows how to register a text generation model and endpoint for DeepSeek.
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://api.deepseek.com/chat/completions",
-- Replace this value with your actual access_key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "deepseek"
}');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://api.siliconflow.cn/v1/chat/completions",
-- Replace this value with your actual access_key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
This example shows how to register a text generation model and endpoint for Tencent Hunyuan (OpenAI-compatible format).
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://api.hunyuan.cloud.tencent.com/v1/chat/completions",
-- Replace this value with your actual access_key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "hunyuan-openAI"
}');
Try sentiment analysis
SELECT AI_COMPLETE("ob_complete", AI_PROMPT('Your task is to perform sentiment analysis on the provided text and determine whether the sentiment is positive or negative.
The text to be analyzed is as follows:
<text>
{0}
</text>
The judgment criteria are as follows:
If the text expresses a positive sentiment, output 1; if the text expresses a negative sentiment, output -1. Do not output anything else.', 'The weather is really nice')) AS sentiment;
The returned result is as follows:
+----------+
| sentiment|
+----------+
| 1 |
+----------+
Step 3: Use AI_RERANK to optimize search results
The AI_RERANK function can be used to re-sort search results based on the relevance of the query terms.
Register the reranking model and endpoint
- Aliyun DashScope
- Siliconflow
This example demonstrates how to register the reranking model and endpoint for Aliyun DashScope (non-compliant with the OpenAI format).
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_rerank');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_rerank_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_rerank', '{
"type": "rerank",
"model_name": "BAAI/bge-reranker-v2-m3"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_rerank_endpoint', '{
"ai_model_name": "ob_rerank",
"url": "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank",
-- Replace with your actual access key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "aliyun-dashscope"
}');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_rerank');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_rerank_endpoint');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_rerank', '{
"type": "rerank",
"model_name": "BAAI/bge-reranker-v2-m3"
}');
CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_rerank_endpoint', '{
"ai_model_name": "ob_rerank",
"url": "https://api.siliconflow.cn/v1/rerank",
-- Replace with your actual access key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
Try reranking
SELECT AI_RERANK("ob_rerank", "Apple", '["apple", "banana", "fruit", "vegetable"]');
The returned result is as follows:
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| AI_RERANK("ob_rerank", "Apple", '["apple", "banana", "fruit", "vegetable"]') |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{"index": 0, "relevance_score": 0.9911285638809204}, {"index": 1, "relevance_score": 0.0030552432872354984}, {"index": 2, "relevance_score": 0.0003349370090290904}, {"index": 3, "relevance_score": 0.00001892922773549799}] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
Reranking can significantly improve the accuracy of search results, especially in RAG scenarios.
Step 4: Build an intelligent Q&A system
In this section, you will learn to build a simple AI chatbot by using three AI functions.
Register all required models and endpoints
This example uses the embedding model, text generation model, and reranking model in sequence. Ensure that you have registered the corresponding models and endpoints. For more information, see Step 1, Step 2, and Step 3.
If you have already registered the corresponding model, skip the corresponding registration step and proceed to the next step.
Prepare data and generate vectors
CREATE TABLE knowledge_base (
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255),
content TEXT,
embedding TEXT
);
INSERT INTO knowledge_base (title, content) VALUES
('About seekdb', 'seekdb is a powerful database system that supports vector search and AI functions.'),
('Vector search', 'Vector search can be used for semantic search to find similar content.'),
('AI functions', 'AI models can be directly called in SQL statements.');</numerusform>
UPDATE knowledge_base
SET embedding = AI_EMBED("ob_embed", content);
Perform vector search and reranking
SET @query = "What is vector search?";
SET @query_vector = AI_EMBED("ob_embed", @query);
-- Build a document list in an array of strings.
SET @candidate_docs = '["SeekDB is a powerful database system that supports vector search and AI functions.", "Vector search can be used for semantic search to find similar content."]';
SELECT AI_RERANK("ob_rerank", @query, @candidate_docs) AS ranked_results;
The returned result contains the index that indicates the document index and the relevance_score that indicates the relevance score:
+-------------------------------------------------------------------------------------------------------------+
| ranked_results |
+-------------------------------------------------------------------------------------------------------------+
| [{"index": 1, "relevance_score": 0.9904329776763916}, {"index": 0, "relevance_score": 0.16993996500968933}] |
+-------------------------------------------------------------------------------------------------------------+
1 row in set
Generate Answer
Generate an answer based on the results obtained from steps 1 and 2:
SELECT AI_COMPLETE("ob_complete",
AI_PROMPT('Based on the following document content, answer the user's question.
User question: {0}
Relevant documents: {1}
Please provide a concise and accurate answer to the user's question based on the content of the above document.
The result returned is as follows:
+--------------------------------------------------------------------------------------------------------------------------------------------+
| answer |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| Vector search is a semantic search technique that searches for similar content by comparing vector data. |
+--------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
By following these three steps, you can complete the complete AI application process in the seekdb database: vectorization, search, reranking, and generating answers.
Summary
In this tutorial, you learned the core features of the SeekDB AI function service:
- AI_EMBED: Converts text into vectors for vector search.
- AI_COMPLETE: Directly calls an LLM in SQL to generate, translate, and analyze text.
- AI_RERANK: Optimizes search result accuracy for better RAG application performance.
What to do next
- Learn about vector search
- Explore the hybrid search feature
- View the semantic index to simplify the vector search process
More operations
For more information about how to experience the AI Native features of seekdb and how to build AI applications based on seekdb, see:
- Experience vector search
- Experience full-text indexing
- Experience hybrid search
- Experience semantic indexing
- Build a knowledge base desktop application based on seekdb
- Build a cultural and tourism assistant based on multi-model integration of seekdb
- Build an image search application based on seekdb
In addition to using SQL, you can also use the Python SDK (pyseekdb) provided by seekdb. For more information, see Experience embedded seekdb by using Python SDK and pyseekdb overview.