Skip to main content

Experience AI function service in seekdb

This tutorial guides you through getting started with seekdb's AI function service, helping you understand how it leverages AI capabilities, understand practical applications, and experience the powerful features of an AI-native database.

Overview

AI functions integrate AI model capabilities directly into data processing within the database through SQL expressions. They greatly simplify operations such as data extraction, analysis, summarization, and storage using AI large models, and are an important new feature in the current database and data warehouse field. seekdb provides comprehensive AI model and endpoint management through the DBMS_AI_SERVICE package, and includes multiple built-in AI function expressions, while supporting monitoring of AI model calls through views. You can directly call AI models in SQL without writing additional code, and experience several core functions including AI_COMPLETE, AI_EMBED, AI_RERANK, and AI_PROMPT in just a few minutes:

  • AI_EMBED: Converts text data to vector data by calling an embedding model.
  • AI_COMPLETE: Processes prompts and data information by calling a specified text generation large model and parses the processing results.
  • AI_PROMPT: Organizes prompt templates and dynamic data into JSON format, which can be used directly in the AI_COMPLETE function to replace the prompt parameter.
  • AI_RERANK: Ranks text by similarity according to prompts by calling a rerank model.

This feature can be applied to text generation, text conversion, text reranking, and other scenarios.

Prerequisites

  • Contact the administrator to obtain the corresponding database connection string, then execute the following command to connect to the database:
    # host: seekdb database connection IP.
    # port: seekdb database connection port.
    # database_name: Name of the database to access.
    # user_name: Database username.
    # password: Database password.
    obclient -h$host -P$port -u$user_name -p$password -D$database_name
  • Ensure that you have the relevant permissions for AI function service. Complete model and endpoint registration information is provided before each example, which you can copy and use directly.

Step 1: Use AI_EMBED to generate vectors

AI_EMBED can convert text to vectors for vector retrieval. This is a fundamental step in vector retrieval, converting text data into high-dimensional vector representations for similarity calculations.

Register embedding model and endpoint

CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://api.siliconflow.cn/v1/embeddings",
-- Replace with actual access_key
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');

Try embedding a single row of data

SELECT AI_EMBED("ob_embed", "Hello world") AS embedding;

The expected result is a vector array, such as [0.1, 0.2, 0.3]. This allows you to batch convert text in tables to vectors for subsequent vector retrieval.

Step 2: Use AI_COMPLETE and AI_PROMPT to generate text

AI_COMPLETE can directly call large language models in SQL to implement text generation, translation, analysis, and other functions. The AI_PROMPT function can organize prompt templates and dynamic data into JSON format, which can be used directly in the AI_COMPLETE function to replace the prompt parameter.

Register text generation model and endpoint

CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://api.siliconflow.cn/v1/chat/completions",
-- Replace with actual access_key
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');

Try sentiment analysis

SELECT AI_COMPLETE("ob_complete", AI_PROMPT('Your task is to perform sentiment analysis on the provided text and determine whether its emotional tendency is positive or negative.
The following is the text to be analyzed:
<text>
{0}
</text>
The judgment criteria are as follows:
If the text expresses positive emotions, output 1; if the text expresses negative emotions, output -1. Do not output anything else.', 'The weather is really good.')) AS sentiment;

The following result is returned:

+----------+
| sentiment|
+----------+
| 1 |
+----------+

Step 3: Use AI_RERANK to optimize retrieval results

AI_RERANK can intelligently rerank retrieval results, reordering document lists by relevance to query terms.

Register rerank model and endpoint

CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_rerank');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_rerank_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_rerank', '{
"type": "rerank",
"model_name": "BAAI/bge-reranker-v2-m3"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_rerank_endpoint', '{
"ai_model_name": "ob_rerank",
"url": "https://api.siliconflow.cn/v1/rerank",
-- Replace with actual access_key
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');

Try reranking

SELECT AI_RERANK("ob_rerank", "Apple", '["apple", "banana", "fruit", "vegetable"]');

The following result is returned:

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| AI_RERANK("ob_rerank", "Apple", '["apple", "banana", "fruit", "vegetable"]') |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{"index": 0, "relevance_score": 0.9911285638809204}, {"index": 1, "relevance_score": 0.0030552432872354984}, {"index": 2, "relevance_score": 0.0003349370090290904}, {"index": 3, "relevance_score": 0.00001892922773549799}] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set

Reranking can significantly improve the accuracy of retrieval results, especially suitable for RAG scenarios.

Step 4: Comprehensive application: Build an intelligent Q&A system

Combine the three AI functions to build a simple intelligent Q&A system in three steps.

Register all required models and endpoints

This example requires the use of embedding models, text generation models, and rerank models simultaneously. Ensure that the following models and endpoints are registered:

-- Register embedding model (skip if already registered in Step 1)
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_embed');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_embed_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_embed', '{
"type": "dense_embedding",
"model_name": "BAAI/bge-m3"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_embed_endpoint', '{
"ai_model_name": "ob_embed",
"url": "https://api.siliconflow.cn/v1/embeddings",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');

-- Register text generation model (skip if already registered in Step 2)
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_complete');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_complete_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_complete', '{
"type": "completion",
"model_name": "THUDM/GLM-4-9B-0414"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_complete_endpoint', '{
"ai_model_name": "ob_complete",
"url": "https://api.siliconflow.cn/v1/chat/completions",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');

-- Register rerank model (skip if already registered in Step 3)
CALL DBMS_AI_SERVICE.DROP_AI_MODEL ('ob_rerank');
CALL DBMS_AI_SERVICE.DROP_AI_MODEL_ENDPOINT ('ob_rerank_endpoint');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL(
'ob_rerank', '{
"type": "rerank",
"model_name": "BAAI/bge-reranker-v2-m3"
}');

CALL DBMS_AI_SERVICE.CREATE_AI_MODEL_ENDPOINT (
'ob_rerank_endpoint', '{
"ai_model_name": "ob_rerank",
"url": "https://api.siliconflow.cn/v1/rerank",
"access_key": "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
"provider": "siliconflow"
}');
info

Replace all access_key values with actual API keys. If you have already registered the corresponding models in the previous steps, you can skip the corresponding registration steps.

Prepare data and generate vectors

CREATE TABLE knowledge_base (
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255),
content TEXT,
embedding TEXT
);

INSERT INTO knowledge_base (title, content) VALUES
('seekdb Introduction', 'seekdb is a powerful database system that supports vector retrieval and AI functions.'),
('Vector Retrieval', 'Vector retrieval can be used for semantic search to find similar content.'),
('AI Functions', 'AI functions can directly call AI models in SQL.');

UPDATE knowledge_base
SET embedding = AI_EMBED("ob_embed", content);

Vector retrieval and reranking

SET @query = "What is vector retrieval?";
SET @query_vector = AI_EMBED("ob_embed", @query);

-- Directly construct a document list in string array format
SET @candidate_docs = '["seekdb is a powerful database system that supports vector retrieval and AI functions.", "Vector retrieval can be used for semantic search to find similar content."]';

SELECT AI_RERANK("ob_rerank", @query, @candidate_docs) AS ranked_results;

The following result is returned. index is the document index, and relevance_score is the relevance score:

+-------------------------------------------------------------------------------------------------------------+
| ranked_results |
+-------------------------------------------------------------------------------------------------------------+
| [{"index": 1, "relevance_score": 0.9904329776763916}, {"index": 0, "relevance_score": 0.16993996500968933}] |
+-------------------------------------------------------------------------------------------------------------+
1 row in set

Generate answers

Based on the question retrieval in the first step and the reranking results in the second step, generate an answer:

SELECT AI_COMPLETE("ob_complete",
AI_PROMPT('Based on the following document content, answer the user's question.
User question: {0}

Relevant document: {1}

Please answer the user's question concisely and accurately based on the above document content.', @query, CAST(JSON_EXTRACT(@candidate_docs, '$[1]') AS CHAR))) AS answer;

The following result is returned:

+--------------------------------------------------------------------------------------------------------------------------------------------+
| answer |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| According to the provided document content, vector retrieval is a technology used for semantic search, aimed at finding similar content by comparing vector data. |
+--------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set

Through these three steps, you can quickly complete a complete AI application flow within the seekdb database: vectorization, retrieval, reranking, and answer generation.

Summary

Through this tutorial, you have mastered the core features of seekdb's AI function service:

  • AI_EMBED: Convert text to vectors to prepare data for vector retrieval.
  • AI_COMPLETE: Directly call LLMs in SQL to implement text generation, translation, analysis, and other functions.
  • AI_RERANK: Optimize the accuracy of retrieval results and improve RAG application effectiveness.

What's next

More information

For more guides on experiencing seekdb's AI Native features and building AI applications based on seekdb, see:

In addition to using SQL for operations, you can also use the Python SDK (pyseekdb) provided by seekdb. For usage instructions, see Experience embedded seekdb using Python SDK and pyseekdb overview.