Skip to main content
Version: V1.0.0

Get started with SeekDB AI Function Service

This tutorial will guide you through the process of getting started with SeekDB AI Function Service, helping you understand how to leverage AI capabilities for practical applications and experience the power of AI-native databases within minutes.

Overview

AI functions integrate AI model capabilities directly into data processing within the database using SQL expressions. This feature significantly simplifies data extraction, analysis, summarization, and storage operations using large AI models, making it a crucial new feature in the database and data warehouse domain. seekdb provides comprehensive AI model and endpoint management through the DBMS_AI_SERVICE package, includes built-in AI function expressions, and supports monitoring AI model calls via views. You can directly call AI models in SQL without additional coding, and experience core functions like AI_COMPLETE, AI_EMBED, AI_RERANK, and AI_PROMPT within minutes:

  • AI_EMBED: Converts text data into vector data by calling an embedding model.
  • AI_COMPLETE: Processes prompts and data using a specified large text generation model and parses the results.
  • AI_PROMPT: Organizes prompt templates and dynamic data into JSON format, which can be directly used in the AI_COMPLETE function by replacing the prompt parameter.
  • AI_RERANK: Sorts text based on similarity using a reranking model.

This feature is applicable to scenarios such as text generation, text conversion, and text reranking.

Prerequisites

  • Contact the administrator to obtain the corresponding database connection string and execute the following command to connect to the database:

    # host: the IP address for connecting to the seekdb database.
    # port: the port for connecting to the seekdb database.
    # database_name: the name of the database to be accessed.
    # user_name: the username of the database.
    # password: the password of the database.
    mysql -h$host -P$port -u$user_name -p$password -D$database_name
  • Ensure that you have the AI function call permission. Each example provides complete model and endpoint registration information, which you can directly copy and use.

Step 1: Use AI_EMBED to generate vectors

The AI_EMBED function can convert text into vectors for vector search. This is the foundational step in vector search, where text data is transformed into high-dimensional vector representations to facilitate similarity calculations.

Register an embedding model and endpoint

This example demonstrates how to register an embedding model and endpoint for Alibaba Cloud (OpenAI-compatible format).

CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings",
-- Replace with your actual access key
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "aliyun-openAI"
}');

Try embedding single-line data

SELECT AI_EMBED("ob_embed", "Hello world") AS embedding;

The expected result is a vector array, such as [0.1, 0.2, 0.3]. This allows you to convert text in the table into vectors for subsequent vector search operations.

Step 2: Use AI_COMPLETE and AI_PROMPT to generate text

You can directly call a large language model in SQL to generate, translate, and analyze text. The AI_PROMPT function organizes a prompt template and dynamic data into a JSON format, which you can use to replace the prompt parameter of the AI_COMPLETE function.

Register a text generation model and endpoint

This example shows how to register a text generation model and endpoint for Aliyun (OpenAI-compatible format).

CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
-- Replace this value with your actual access_key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "aliyun-openAI"
}');

Try sentiment analysis

SELECT AI_COMPLETE("ob_complete", AI_PROMPT('Your task is to perform sentiment analysis on the provided text and determine whether the sentiment is positive or negative.
The text to be analyzed is as follows:
<text>
{0}
</text>
The judgment criteria are as follows:
If the text expresses a positive sentiment, output 1; if the text expresses a negative sentiment, output -1. Do not output anything else.', 'The weather is really nice')) AS sentiment;

The returned result is as follows:

+----------+
| sentiment|
+----------+
| 1 |
+----------+

Step 3: Use AI_RERANK to optimize search results

The AI_RERANK function can be used to re-sort search results based on the relevance of the query terms.

Register the reranking model and endpoint

This example demonstrates how to register the reranking model and endpoint for Aliyun DashScope (non-compliant with the OpenAI format).

CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_rerank');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_rerank_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_rerank', '{
"type": "rerank",
"model_name": "BAAI/bge-reranker-v2-m3"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_rerank_endpoint', '{
"ai_model_name": "ob_rerank",
"url": "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank",
-- Replace with your actual access key.
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "aliyun-dashscope"
}');

Try reranking

SELECT AI_RERANK("ob_rerank", "Apple", '["apple", "banana", "fruit", "vegetable"]');

The returned result is as follows:

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| AI_RERANK("ob_rerank", "Apple", '["apple", "banana", "fruit", "vegetable"]') |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{"index": 0, "relevance_score": 0.9911285638809204}, {"index": 1, "relevance_score": 0.0030552432872354984}, {"index": 2, "relevance_score": 0.0003349370090290904}, {"index": 3, "relevance_score": 0.00001892922773549799}] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set

Reranking can significantly improve the accuracy of search results, especially in RAG scenarios.

Step 4: Build an intelligent Q&A system

In this section, you will learn to build a simple AI chatbot by using three AI functions.

Register all required models and endpoints

This example uses the embedding model, text generation model, and reranking model in sequence. Ensure that you have registered the corresponding models and endpoints. For more information, see Step 1, Step 2, and Step 3.

info

If you have already registered the corresponding model, skip the corresponding registration step and proceed to the next step.

Prepare data and generate vectors

CREATE TABLE knowledge_base (
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255),
content TEXT,
embedding TEXT
);

INSERT INTO knowledge_base (title, content) VALUES
('About seekdb', 'seekdb is a powerful database system that supports vector search and AI functions.'),
('Vector search', 'Vector search can be used for semantic search to find similar content.'),
('AI functions', 'AI models can be directly called in SQL statements.');</numerusform>

UPDATE knowledge_base
SET embedding = AI_EMBED("ob_embed", content);

Perform vector search and reranking

SET @query = "What is vector search?";
SET @query_vector = AI_EMBED("ob_embed", @query);

-- Build a document list in an array of strings.
SET @candidate_docs = '["SeekDB is a powerful database system that supports vector search and AI functions.", "Vector search can be used for semantic search to find similar content."]';

SELECT AI_RERANK("ob_rerank", @query, @candidate_docs) AS ranked_results;

The returned result contains the index that indicates the document index and the relevance_score that indicates the relevance score:

+-------------------------------------------------------------------------------------------------------------+
| ranked_results |
+-------------------------------------------------------------------------------------------------------------+
| [{"index": 1, "relevance_score": 0.9904329776763916}, {"index": 0, "relevance_score": 0.16993996500968933}] |
+-------------------------------------------------------------------------------------------------------------+
1 row in set

Generate Answer

Generate an answer based on the results obtained from steps 1 and 2:

SELECT AI_COMPLETE("ob_complete",
AI_PROMPT('Based on the following document content, answer the user's question.
User question: {0}

Relevant documents: {1}

Please provide a concise and accurate answer to the user's question based on the content of the above document.

The result returned is as follows:

+--------------------------------------------------------------------------------------------------------------------------------------------+
| answer |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| Vector search is a semantic search technique that searches for similar content by comparing vector data. |
+--------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set

By following these three steps, you can complete the complete AI application process in the seekdb database: vectorization, search, reranking, and generating answers.

Summary

In this tutorial, you learned the core features of the SeekDB AI function service:

  • AI_EMBED: Converts text into vectors for vector search.
  • AI_COMPLETE: Directly calls an LLM in SQL to generate, translate, and analyze text.
  • AI_RERANK: Optimizes search result accuracy for better RAG application performance.

What to do next

More operations

For more information about how to experience the AI Native features of seekdb and how to build AI applications based on seekdb, see:

In addition to using SQL, you can also use the Python SDK (pyseekdb) provided by seekdb. For more information, see Experience embedded seekdb by using Python SDK and pyseekdb overview.