Skip to main content

Use cases and examples of AI functions

This topic describes the features of AI functions in seekdb.

AI functions integrate AI model capabilities directly into data processing within databases through SQL expressions. This greatly simplifies operations such as data extraction, analysis, summarization, and storage using large AI models, making it an important new feature in the fields of databases and data warehouses. seekdb provides AI model and endpoint management through the DBMS_AI_SERVICE package, introduces several built-in AI function expressions, and supports monitoring AI model usage through views.

Prerequisites

Before using AI functions, make sure you have the necessary privileges. For more information about the privileges, see AI function privileges.

Considerations

Hybrid search relies on the model management and embedding capabilities of the AI function service. Before deleting an AI model, check whether it is referenced by hybrid search to avoid potential issues.

AI model management

The DBMS_AI_SERVICE package provides the ability to manage AI models and endpoints. It supports the following operations:

OperationDescription
CREATE_AI_MODELCreates an AI model object.
DROP_AI_MODELDrops an AI model object.
CREATE_AI_MODEL_ENDPOINTCreates an AI model endpoint object.
ALTER_AI_MODEL_ENDPOINTModifies an AI model endpoint object.
DROP_AI_MODEL_ENDPOINTDrops an AI model endpoint object.

By using this system package, you can directly manage AI models and endpoints within seekdb, without relying on external services.

Monitor AI model usage

seekdb allows you to query and monitor information about AI models and their usage through the following views:

  • CDB/DBA_OB_AI_MODELS: Query information about AI models.
  • CDB/DBA_OB_AI_MODEL_ENDPOINTS: Monitor the calls of AI models.

AI function expressions

seekdb supports the following AI function expressions, allowing you to call AI models directly within seekdb using SQL statements and greatly simplifying the process:

FunctionDescription
AI_COMPLETECalls a specified text generation large language model (LLM) to process prompts and data, and then parses the results.
AI_PROMPTConstructs and formats prompts. Supports dynamic data insertion.
AI_EMBEDCalls an embedding model to convert text data into vector data.
AI_RERANKCalls a reranking model to sort text based on prompts by similarity.
info

When using AI function expressions, make sure you have registered the AI models and endpoint information in the database.

AI_COMPLETE and AI_PROMPT

The AI_COMPLETE function specifies a registered large language model (LLM) for text generation using the model_key, processes the user-provided prompt and data, and returns the text generated by the model. Users can customize the prompt and the format of the data from the database through the prompt parameter. This approach not only enables flexible processing of textual data, but also allows for batch processing directly within the database, effectively avoiding the overhead of repeatedly transferring data between the database and the language model.

In many AI application scenarios, prompts are often highly structured and require dynamic injection of specific data. Manually concatenating prompts and input content using functions like CONCAT each time is not only costly in terms of development, but also prone to formatting errors. To support prompt reuse and dynamic combination of prompts and data, seekdb provides the AI_PROMPT function. AI_PROMPT upgrades prompts from "static text" to a "reusable, parameterizable" function template format, which can be used directly in place of the prompt parameter within AI_COMPLETE. This greatly simplifies the process of constructing prompts, improving both development efficiency and accuracy.

AI_PROMPT function

The AI_PROMPT function is used to construct and format prompts, supporting dynamic data insertion.

Syntax

The syntax for the AI_PROMPT function is as follows:

AI_PROMPT('template', expr0 [ , expr1, ... ]);

Parameters:

ParameterDescriptionTypeNullable
templateThe prompt template entered by the user.VARCHAR(max_length)No
exprThe data entered by the user.VARCHAR(max_length)No

Both the template and expr parameters are required and cannot be null. The expr parameter only supports the VARCHAR type and does not support the JSON type.

Return value:

  • JSON, the formatted prompt string.
Examples

The AI_PROMPT function organizes the template string and dynamic data into JSON format:

  • The first parameter (the template string template) is placed in the template field of the returned JSON.
  • Subsequent parameters (data values expr0, expr1, ...) are placed in the args array of the returned JSON.
  • Placeholders in the template such as {0}, {1}, etc., correspond by index to the data in the args array and will be automatically replaced when used in the AI_COMPLETE function.

For example:

SELECT AI_PROMPT('Recommend {0} of the most popular {1} to me.', 'ten', 'mobile phones');

Return result:

{
"template": "Recommend {0} of the most popular {1} to me.",
"args": ["ten", "mobile phones"]
}

Based on the previous example, using the AI_PROMPT function within the AI_COMPLETE function:

SELECT AI_COMPLETE("ob_complete", AI_PROMPT('Recommend {0} of the most popular {1} to me. just output name in json array format', 'two', 'mobile phones')) AS ans;

Return result:

+--------------------------------------------------+
| ans |
+--------------------------------------------------+
| ["iPhone 15 Pro Max","Samsung Galaxy S24 Ultra"] |
+--------------------------------------------------+

AI_COMPLETE function

Syntax

The syntax for the AI_COMPLETE function is as follows:

AI_COMPLETE(model_key, prompt[, parameters])
-- If you use the AI_PROMPT function, replace the prompt parameter with the AI_PROMPT function. See the AI_PROMPT function example.
AI_COMPLETE(model_key, AI_PROMPT(prompt_template, data))

Parameters:

ParameterDescriptionTypeNullable
model_keyThe model registered in the database.VARCHAR(128)No
promptThe prompt provided by the user.VARCHAR/TEXT(LONGTEXT)No
parametersOptional configuration for the API, such as temperature, top_p, and max_tokens. These options vary by vendor and are added directly to the message body. Typically, you can use the default settings without specifying these options.JSONYes

Both model_key and prompt are required. If either is NULL, the function will return an error.

Return value:

  • text: The text generated by the LLM based on the prompt.
Examples
  1. Sentiment analysis example

    SELECT AI_COMPLETE("ob_complete","Your task is to perform sentiment analysis on the provided text and determine whether the sentiment is positive or negative.
    The text to analyze is as follows:
    <text>
    What a beautiful day!
    </text>
    Judgment criteria:
    If the text expresses a positive sentiment, output 1; if it expresses a negative sentiment, output -1. Do not output anything else.\n") AS ans;

    Result:

    +-----+
    | ans |
    +-----+
    | 1 |
    +-----+
  2. Translation example

    CREATE TABLE comments (
    id INT AUTO_INCREMENT PRIMARY KEY,
    content TEXT
    );

    INSERT INTO comments (content) VALUES ('hello world!');

    -- Use the concat expression to replace the processed data with column names from the database table, enabling batch processing of database data without copying data to and from the LLM.
    SELECT AI_COMPLETE("ob_complete",
    concat("You are a professional translator. Please translate the following English text into Chinese. The text to be translated is:<text>",
    content,
    "</text>")) AS ans FROM comments;

    Result:

    +-------------+
    | ans |
    +-------------+
    | 你好,世界! |
    +-------------+
  3. Classification example

    SELECT AI_COMPLETE("ob_complete","You are a classification expert. You will receive various issue texts and need to categorize them into the appropriate department. The department list is [\"Hardware\",\"Software\",\"Other\"]. The text to analyze is as follows:
    <text>
    The screen quality is terrible.
    </text>") AS res;

    Result:

    +--------+
    | res |
    +--------+
    | Hardware |
    +--------+

AI_EMBED

The AI_EMBED function uses the model_key parameter to specify a registered embedding model, which converts your text data into vector representations. If the model supports multiple dimensions, you can use the dim parameter to specify the output dimension.

Use AI_EMBED

Syntax:

AI_EMBED(model_key, input, [dim])

Parameters:

ParameterDescriptionTypeNullable
model_keyThe embedding model registered in your database.VARCHAR(128)No
inputThe text you want to convert into a vector.VARCHARNo
dimSpecifies the output dimension of the vector. Some model providers support configuring this value.INT64Yes

Both model_key and input are required. If either is NULL, the function will return an error.

Return value:

  • A string in vector format, that is, the embedding model’s vector representation of your text.

Examples

  1. Embed single row of data.

    SELECT AI_EMBED("ob_embed","Hello world") AS embedding;

    Result:

    +----------------+
    | embedding |
    +----------------+
    | [0.1, 0.2, 0.3]|
    +----------------+
  2. Embed table columns.

    CREATE TABLE comments (
    id INT AUTO_INCREMENT PRIMARY KEY,
    content TEXT
    );

    INSERT INTO comments (content) VALUES ('hello world!');

    SELECT AI_EMBED("ob_embed",content) AS embedding FROM comments;

    Result:

    +----------------+
    | embedding |
    +----------------+
    | [0.1, 0.2, 0.3]|
    +----------------+

AI_RERANK

The AI_RERANK function uses the model_key parameter to specify a registered reranking model. It organizes your query and document list according to the provider's rules, sends them to the specified model, and returns the sorted results. This function is suitable for reranking scenarios in Retrieval-Augmented Generation (RAG).

Use AI_RERANK

Syntax:

AI_RERANK(model_key, query, documents[, document_key])

Parameters:

ParameterDescriptionTypeNullable
model_keyThe reranking model registered in your database.VARCHAR(128)No
queryThe search text you want to use.VARCHAR(1024)No
documentsThe list of documents to be ranked.JSON array, for example, '["apple", "banana"]'No

All of the parameters model_key and input are required. If either is NULL, the function will return an error.

Return value:

  • A JSON array containing the documents and their relevance scores, sorted in descending order by relevance.

Examples

SELECT AI_RERANK("ob_rerank","Apple",'["apple","banana","fruit","vegetable"]');

Result:

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ai_rerank("ob_rerank","Apple",'["apple","banana","fruit","vegetable"]') |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{"index": 0, "document": {"text": "apple"}, "relevance_score": 0.9912109375}, {"index": 1, "document": {"text": "banana"}, "relevance_score": 0.0033512115478515625}, {"index": 2, "document": {"text": "fruit"}, "relevance_score": 0.0003669261932373047}, {"index": 3, "document": {"text": "vegetable"}, "relevance_score": 0.00001996755599975586}] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

References