Skip to main content

Hybrid search with vector indexes

This topic describes hybrid search with full-text indexes and vector indexes in seekdb.

Hybrid search combines vector-based semantic search and full-text index-based keyword search, providing more accurate and comprehensive search results through integrated ranking. Vector search excels at semantic approximate matching but has weaker capabilities for matching exact keywords, numbers, and proper nouns. Full-text search effectively compensates for this limitation. Therefore, hybrid search has become a key feature of vector databases and is widely used in various products. seekdb achieves efficient hybrid queries by integrating its full-text and vector indexing capabilities.

Usage

The hybrid search feature is provided through the new system package DBMS_HYBRID_SEARCH, which contains 2 sub-functions:

Method nameDescription
DBMS_HYBRID_SEARCH.SEARCHReturns search results in JSON format. Results are sorted by relevance.
DBMS_HYBRID_SEARCH.GET_SQLReturns the actual executed SQL statement as a string.

Use cases and examples

Create example tables and insert data

To demonstrate the hybrid search feature, this section creates and inserts data into several example tables that will be used in different search scenarios below.

  • products table: A basic product information table used to demonstrate regular scalar search. It contains product ID, name, description, brand, category, tags, price, stock quantity, release date, on-sale status, and a vector field vec.

    CREATE TABLE products (
    `product_id` varchar(50) DEFAULT NULL,
    `product_name` varchar(255) DEFAULT NULL,
    `description` text DEFAULT NULL,
    `brand` varchar(100) DEFAULT NULL,
    `category` varchar(100) DEFAULT NULL,
    `tags` varchar(255) DEFAULT NULL,
    `price` decimal(10,2) DEFAULT NULL,
    `stock_quantity` int(11) DEFAULT NULL,
    `release_date` datetime DEFAULT NULL,
    `is_on_sale` tinyint(1) DEFAULT NULL,
    `vec` VECTOR(4) DEFAULT NULL
    );

    Insert data.

    INSERT INTO products VALUES
    ('prod-001', 'Gamer-Pro Mechanical Keyboard', 'A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.',
    'GamerZone', 'Gaming', 'best-seller,gaming-gear,rgb', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.5,0.1,0.6,0.9]'),

    ('prod-002', 'Gamer-Pro Headset', 'High-fidelity gaming headset with a noise-cancelling microphone.',
    'GamerZone', 'Gaming', 'best-seller,gaming-gear,audio', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.1,0.9,0.2,0]'),

    ('prod-003', 'Eco-Friendly Yoga Mat', 'A non-slip yoga mat made from sustainable and eco-friendly materials.',
    'NatureFirst', 'Sports', 'eco-friendly,health', 49.99, 200, '2023-04-22 00:00:00.000000', 0, '[0.1,0.9,0.3,0]');
  • products_fulltext table: Based on the products table, full-text indexes are created on the product_name, description, and tags columns to demonstrate full-text search.

    CREATE TABLE products_fulltext (
    product_id VARCHAR(50),
    product_name VARCHAR(255),
    description TEXT,
    brand VARCHAR(100),
    category VARCHAR(100),
    tags VARCHAR(255),
    price DECIMAL(10, 2),
    stock_quantity INT,
    release_date DATETIME,
    is_on_sale TINYINT(1),
    vec vector(4),
    -- Create full-text indexes on columns that need full-text search
    FULLTEXT INDEX idx_product_name(product_name),
    FULLTEXT INDEX idx_description(description),
    FULLTEXT INDEX idx_tags(tags)
    );

    Insert data.

    INSERT INTO products_fulltext VALUES
    ('prod-001', 'Gamer-Pro Mechanical Keyboard', 'A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.',
    'GamerZone', 'Gaming', 'best-seller,gaming-gear,rgb', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.5,0.1,0.6,0.9]'),

    ('prod-002', 'Gamer-Pro Headset', 'High-fidelity gaming headset with a noise-cancelling microphone.',
    'GamerZone', 'Gaming', 'best-seller,gaming-gear,audio', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.1,0.9,0.2,0]'),

    ('prod-003', 'Eco-Friendly Yoga Mat', 'A non-slip yoga mat made from sustainable and eco-friendly materials.',
    'NatureFirst', 'Sports', 'eco-friendly,health', 49.99, 200, '2023-04-22 00:00:00.000000', 0, '[0.1,0.9,0.3,0]');
  • doc_table table: A document table containing scalar columns, vector columns, and full-text indexed columns, used to demonstrate full-text search with scalar filtering conditions and hybrid search.

    CREATE TABLE doc_table(
    c1 INT,
    vector VECTOR(3),
    query VARCHAR(255),
    content VARCHAR(255),
    VECTOR INDEX idx1(vector) WITH (distance=l2, type=hnsw, lib=vsag),
    FULLTEXT INDEX idx2(query),
    FULLTEXT INDEX idx3(content)
    );

    Insert data.

    INSERT INTO doc_table VALUES
    (1, '[1,2,3]', "hello world", "oceanbase Elasticsearch database"),

    (2, '[1,2,1]', "hello world, what is your name", "oceanbase mysql database"),

    (3, '[1,1,1]', "hello world, how are you", "oceanbase oracle database"),

    (4, '[1,3,1]', "real world, where are you from", "postgres oracle database"),

    (5, '[1,3,2]', "real world, how old are you", "redis oracle database"),

    (6, '[2,1,1]', "hello world, where are you from", "starrocks oceanbase database");
  • products_vector table: Similar to the products table structure, but with a vector index explicitly created on the vec column to demonstrate pure vector search.

    CREATE TABLE products_vector (
    `product_id` varchar(50) DEFAULT NULL,
    `product_name` varchar(255) DEFAULT NULL,
    `description` text DEFAULT NULL,
    `brand` varchar(100) DEFAULT NULL,
    `category` varchar(100) DEFAULT NULL,
    `tags` varchar(255) DEFAULT NULL,
    `price` decimal(10,2) DEFAULT NULL,
    `stock_quantity` int(11) DEFAULT NULL,
    `release_date` datetime DEFAULT NULL,
    `is_on_sale` tinyint(1) DEFAULT NULL,
    `vec` VECTOR(4) DEFAULT NULL,
    -- Create a vector index on the column that needs vector search
    VECTOR INDEX idx1(vec) WITH (distance=l2, type=hnsw, lib=vsag)
    );

    Insert data.

    INSERT INTO products_vector VALUES
    ('prod-001', 'Gamer-Pro Mechanical Keyboard', 'A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.',
    'GamerZone', 'Gaming', 'best-seller,gaming-gear,rgb', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.5,0.1,0.6,0.9]'),

    ('prod-002', 'Gamer-Pro Headset', 'High-fidelity gaming headset with a noise-cancelling microphone.',
    'GamerZone', 'Gaming', 'best-seller,gaming-gear,audio', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.1,0.9,0.2,0]'),

    ('prod-003', 'Eco-Friendly Yoga Mat', 'A non-slip yoga mat made from sustainable and eco-friendly materials.',
    'NatureFirst', 'Sports', 'eco-friendly,health', 49.99, 200, '2023-04-22 00:00:00.000000', 0, '[0.1,0.9,0.3,0]');
  • products_multi_vector table: A table containing multiple vector fields, used to demonstrate multi-vector search.

    CREATE TABLE products_multi_vector (
    product_id VARCHAR(50),
    product_name VARCHAR(255),
    description TEXT,
    vec1 VECTOR(4),
    vec2 VECTOR(4),
    vec3 VECTOR(4),
    VECTOR INDEX idx1(vec1) WITH (distance=l2, type=hnsw, lib=vsag),
    VECTOR INDEX idx2(vec2) WITH (distance=l2, type=hnsw, lib=vsag),
    VECTOR INDEX idx3(vec3) WITH (distance=l2, type=hnsw, lib=vsag)
    );

    Insert data.

    INSERT INTO products_multi_vector VALUES
    ('prod-001', 'Gamer-Pro Mechanical Keyboard', 'A responsive mechanical keyboard', '[0.5,0.1,0.6,0.9]', '[0.2,0.3,0.4,0.5]', '[0.1,0.2,0.3,0.4]'),
    ('prod-002', 'Gamer-Pro Headset', 'High-fidelity gaming headset', '[0.1,0.9,0.2,0]', '[0.3,0.4,0.5,0.6]', '[0.2,0.3,0.4,0.5]'),
    ('prod-003', 'Eco-Friendly Yoga Mat', 'A non-slip yoga mat', '[0.1,0.9,0.3,0]', '[0.4,0.5,0.6,0.7]', '[0.3,0.4,0.5,0.6]');

Some use cases for regular scalar search are as follows:

  • E-commerce platform product filtering: Users want to view all products from a specific brand. For example, users want to view all products from the GamerZone brand.
  • Content management systems: Administrators need to filter articles or documents by specific categories. For example, finding all articles by a specific author.
  • User management systems: Finding users with specific statuses or roles. For example, finding all VIP users.

Example:

  1. Set search parameters.

    SET @parm = '{
    "query": {
    "bool": {
    "must": [
    {"term": {"brand": "GamerZone"}}
    ]
    }
    }
    }';
  2. Search for all records where brand is "GamerZone".

    SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products', @parm)); 

    The return result is as follows:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products', @parm)) |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | [
    {
    "vec": "[0.5,0.1,0.6,0.9]",
    "tags": "best-seller,gaming-gear,rgb",
    "brand": "GamerZone",
    "price": 149.00,
    "_score": 1,
    "category": "Gaming",
    "is_on_sale": 1,
    "product_id": "prod-004",
    "description": "A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.",
    "product_name": "Gamer-Pro Mechanical Keyboard",
    "release_date": "2023-07-20 00:00:00.000000",
    "stock_quantity": 100
    },
    {
    "vec": "[0.1,0.9,0.2,0]",
    "tags": "best-seller,gaming-gear,audio",
    "brand": "GamerZone",
    "price": 149.00,
    "_score": 1,
    "category": "Gaming",
    "is_on_sale": 1,
    "product_id": "prod-009",
    "description": "High-fidelity gaming headset with a noise-cancelling microphone.",
    "product_name": "Gamer-Pro Headset",
    "release_date": "2023-07-20 00:00:00.000000",
    "stock_quantity": 100
    }
    ] |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set

Range search for regular scalars

Some use cases for regular scalar range search are as follows:

  • Price range filtering: E-commerce platforms filter products by price range. For example, finding products with prices in the [30~80] range.
  • Time range queries: Finding orders or logs within a specific time period. For example, finding orders from the last 30 days.
  • Numeric range filtering: Filtering by rating, stock quantity, and other numeric ranges. For example, finding products with ratings between [4~5].

Example:

  1. Set search parameters.

    SET @parm = '{
    "query": {
    "range" : {
    "price" : {
    "gte" : 30,
    "lte" : 80
    }
    }
    }
    }';
  2. Search for all records where price is in the [30~80] range.

    SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products', @parm)); 

    The return result is as follows:

    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products', @parm)) |
    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | [
    {
    "vec": "[0.1,0.9,0.3,0]",
    "tags": "eco-friendly,health",
    "brand": "NatureFirst",
    "price": 49.99,
    "_score": true,
    "category": "Sports",
    "is_on_sale": 0,
    "product_id": "prod-003",
    "description": "A non-slip yoga mat made from sustainable and eco-friendly materials.",
    "product_name": "Eco-Friendly Yoga Mat",
    "release_date": "2023-04-22 00:00:00.000000",
    "stock_quantity": 200
    }
    ] |
    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set

Some use cases for full-text search are as follows:

  • Document search: Searching for content containing specific keywords in a large number of documents. For example, searching for documents containing "how to use" in FAQs.
  • Product search: Fuzzy search based on product names and descriptions. For example, searching for products containing "database".
  • Knowledge base retrieval: Searching for related questions in FAQs and help documents. For example, searching for answers to related questions in a customer service system's knowledge base.

Example:

  1. Set search parameters.

    SET @query_str_with_mini = '{
    "query": {
    "query_string": {
    "type": "best_fields",
    "fields": ["product_name^3", "description^2.5", "tags^1.5"],
    "query": "Gamer-Pro^2 keyboard^1.5 audio^1.2",
    "boost": 1.5
    }
    }
    }';
  2. Search for records where the product_name, description, and tags fields contain the keywords "Gamer-Pro", "keyboard", and "audio", and sort them according to the set field and keyword weights.

    SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_fulltext', @query_str_with_mini));

    The return result is as follows:

    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_fulltext', @query_str_with_mini)) |
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | [
    {
    "vec": "[0.5,0.1,0.6,0.9]",
    "tags": "best-seller,gaming-gear,rgb",
    "brand": "GamerZone",
    "price": 149.00,
    "_score": 4.569735248749978,
    "category": "Gaming",
    "is_on_sale": 1,
    "product_id": "prod-001",
    "description": "A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.",
    "product_name": "Gamer-Pro Mechanical Keyboard",
    "release_date": "2023-07-20 00:00:00.000000",
    "stock_quantity": 100
    },
    {
    "vec": "[0.1,0.9,0.2,0]",
    "tags": "best-seller,gaming-gear,audio",
    "brand": "GamerZone",
    "price": 149.00,
    "_score": 1.7338881172399914,
    "category": "Gaming",
    "is_on_sale": 1,
    "product_id": "prod-002",
    "description": "High-fidelity gaming headset with a noise-cancelling microphone.",
    "product_name": "Gamer-Pro Headset",
    "release_date": "2023-07-20 00:00:00.000000",
    "stock_quantity": 100
    }
    ] |
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set

Full-text search with scalar filtering conditions

Some use cases for full-text search with scalar filtering conditions are as follows:

  • Precise search: Performing text search under specific conditions. For example, searching for specific keywords in articles with published status.
  • Permission control: Searching within data ranges that users have permission to access. For example, an order system searching for product information in orders within a specific time period.
  • Category search: Performing keyword search within specific categories. For example, a user system searching for specific user information among active users.

Example:

  1. Set search parameters.

    -- Filter condition: specify scalar filter condition c1 >= 2
    SET @query_str = '{
    "query": {
    "bool" : {
    "must" : [
    {"query_string": {
    "fields": ["query", "content"],
    "query": "hello what oceanbase mysql"}
    }
    ],
    "filter" : [
    {"range": {"c1": {"gte" : 2}}}
    ]
    }
    }
    }';
  2. Search for all records where c1 is greater than or equal to 2.

    SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @query_str));

    The return result is as follows:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @query_str)) |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | [
    {
    "c1": 2,
    "query": "hello world, what is your name",
    "_score": 2.170969786679347,
    "vector": "[1,2,1]",
    "content": "oceanbase mysql database"
    },
    {
    "c1": 3,
    "query": "hello world, how are you",
    "_score": 0.3503184713375797,
    "vector": "[1,1,1]",
    "content": "oceanbase oracle database"
    },
    {
    "c1": 6,
    "query": "hello world, where are you from",
    "_score": 0.3503184713375797,
    "vector": "[2,1,1]",
    "content": "starrocks oceanbase database"
    }
    ] |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set

Some use cases for vector search are as follows:

  • Semantic search: Finding related content based on semantic similarity. For example, finding semantically related questions and answers in a knowledge base.
  • Recommendation systems: Recommending similar products based on user preferences. For example, recommending similar products on e-commerce platforms.
  • Image search: Finding similar images through image features. For example, finding similar images in an image library.
  • Intelligent Q&A: Finding semantically related questions and answers in a knowledge base. For example, finding semantically related questions and answers in a customer service system's knowledge base.

Example:

  1. Set search parameters.

    -- field specifies the vector field, k specifies the number of results to return (the k nearest results), query_vector specifies the query vector
    SET @parm = '{
    "knn" : {
    "field": "vec",
    "k": 3,
    "query_vector": [0.5,0.1,0.6,0.9]
    }
    }';
  2. Search for all records where vec is similar to [0.5,0.1,0.6,0.9].

    SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_vector', @parm));

    The return result is as follows:

    +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_vector', @parm)) |
    +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | [
    {
    "vec": "[0.5,0.1,0.6,0.9]",
    "tags": "best-seller,gaming-gear,rgb",
    "brand": "GamerZone",
    "price": 149.00,
    "_score": 1.0,
    "category": "Gaming",
    "is_on_sale": 1,
    "product_id": "prod-001",
    "description": "A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.",
    "product_name": "Gamer-Pro Mechanical Keyboard",
    "release_date": "2023-07-20 00:00:00.000000",
    "stock_quantity": 100
    },
    {
    "vec": "[0.1,0.9,0.3,0]",
    "tags": "eco-friendly,health",
    "brand": "NatureFirst",
    "price": 49.99,
    "_score": 0.43405784,
    "category": "Sports",
    "is_on_sale": 0,
    "product_id": "prod-003",
    "description": "A non-slip yoga mat made from sustainable and eco-friendly materials.",
    "product_name": "Eco-Friendly Yoga Mat",
    "release_date": "2023-04-22 00:00:00.000000",
    "stock_quantity": 200
    },
    {
    "vec": "[0.1,0.9,0.2,0]",
    "tags": "best-seller,gaming-gear,audio",
    "brand": "GamerZone",
    "price": 149.00,
    "_score": 0.42910841,
    "category": "Gaming",
    "is_on_sale": 1,
    "product_id": "prod-002",
    "description": "High-fidelity gaming headset with a noise-cancelling microphone.",
    "product_name": "Gamer-Pro Headset",
    "release_date": "2023-07-20 00:00:00.000000",
    "stock_quantity": 100
    }
    ] |
    +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set

Vector search with scalar filtering conditions

Some use cases for vector search with scalar filtering conditions are as follows:

  • Precise search: Performing text search under specific conditions. For example, searching for specific keywords in articles with published status.
  • Permission control: Searching within data ranges that users have permission to access. For example, an order system searching for product information in orders within a specific time period.
  • Category search: Performing keyword search within specific categories. For example, a user system searching for specific user information among active users.

Example:

  1. Set search parameters.

    -- Specify scalar filter condition brand = "GamerZone"
    SET @parm = '{
    "knn" : {
    "field": "vec",
    "k": 3,
    "query_vector": [0.1,0.5,0.3,0.7],
    "filter" : [
    {"term" : {"brand": "GamerZone"} }
    ]
    }
    }';
  2. Search for all records where vec is similar to [0.1,0.5,0.3,0.7] and brand is "GamerZone".

    SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_vector', @parm));

    The return result is as follows:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_vector', @parm)) |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | [
    {
    "vec": "[0.5,0.1,0.6,0.9]",
    "tags": "best-seller,gaming-gear,rgb",
    "brand": "GamerZone",
    "price": 149.00,
    "_score": 0.59850837,
    "category": "Gaming",
    "is_on_sale": 1,
    "product_id": "prod-001",
    "description": "A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.",
    "product_name": "Gamer-Pro Mechanical Keyboard",
    "release_date": "2023-07-20 00:00:00.000000",
    "stock_quantity": 100
    },
    {
    "vec": "[0.1,0.9,0.2,0]",
    "tags": "best-seller,gaming-gear,audio",
    "brand": "GamerZone",
    "price": 149.00,
    "_score": 0.55175342,
    "category": "Gaming",
    "is_on_sale": 1,
    "product_id": "prod-002",
    "description": "High-fidelity gaming headset with a noise-cancelling microphone.",
    "product_name": "Gamer-Pro Headset",
    "release_date": "2023-07-20 00:00:00.000000",
    "stock_quantity": 100
    }
    ] |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set

Multi-vector search refers to searching across multiple vector indexes and returning the most similar records.

Example:

  1. Set search parameters.

    -- Specify 3-way vector queries, each query specifies the vector index field, number of results to return, and query vector
    SET @param_multi_knn = '{
    "knn" : [{
    "field": "vec1",
    "k": 5,
    "query_vector": [0.5,0.1,0.6,0.9]
    },
    {
    "field": "vec2",
    "k": 5,
    "query_vector": [0.2,0.3,0.4,0.5]
    },
    {
    "field": "vec3",
    "k": 5,
    "query_vector": [0.1,0.2,0.3,0.4]
    }
    ],
    "size" : 5
    }';
  2. Execute the query and return the query results.

    SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_multi_vector', @param_multi_knn));
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_multi_vector', @param_multi_knn)) |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | [
    {
    "vec1": "[0.5,0.1,0.6,0.9]",
    "vec2": "[0.2,0.3,0.4,0.5]",
    "vec3": "[0.1,0.2,0.3,0.4]",
    "_score": 3.0,
    "product_id": "prod-001",
    "description": "A responsive mechanical keyboard",
    "product_name": "Gamer-Pro Mechanical Keyboard"
    },
    {
    "vec1": "[0.1,0.9,0.2,0]",
    "vec2": "[0.3,0.4,0.5,0.6]",
    "vec3": "[0.2,0.3,0.4,0.5]",
    "_score": 2.0957750699999997,
    "product_id": "prod-002",
    "description": "High-fidelity gaming headset",
    "product_name": "Gamer-Pro Headset"
    },
    {
    "vec1": "[0.1,0.9,0.3,0]",
    "vec2": "[0.4,0.5,0.6,0.7]",
    "vec3": "[0.3,0.4,0.5,0.6]",
    "_score": 1.86262927,
    "product_id": "prod-003",
    "description": "A non-slip yoga mat",
    "product_name": "Eco-Friendly Yoga Mat"
    }
    ] |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set

Some use cases for full-text and vector hybrid search are as follows:

  • Intelligent search: Comprehensive search combining keywords and semantic understanding. For example, when a user inputs "I need a gaming keyboard", the system matches both the keywords "gaming" and "keyboard", and understands the semantics of "gaming equipment".
  • Document search: Supporting both exact keyword matching and semantic understanding in large document collections. For example, when searching for "database optimization", it matches documents containing these words and also finds semantically related content about "performance tuning" and "query optimization".
  • Product recommendation: E-commerce platforms support both product name search and requirement description search. For example, based on a user's description "laptop suitable for office work", it matches keywords and understands the semantic requirement of "business office".

Example:

  1. Set search parameters.

    SET @parm = '{
    "query": {
    "bool": {
    "should": [
    {"match": {"query": "hi hello"}},
    {"match": { "content": "oceanbase mysql" }}
    ]
    }
    },
    "knn" : {
    "field": "vector",
    "k": 5,
    "query_vector": [1,2,3]
    },
    "_source" : ["query", "content", "_keyword_score", "_semantic_score"]
    }';
  2. Execute the query and return the query results.

    SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm)); 

    The return result is as follows:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | json_pretty(dbms_hybrid_search.search('doc_table', @parm)) |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | [
    {
    "query": "hello world, what is your name",
    "_score": 2.835628417884166,
    "content": "oceanbase mysql database",
    "_keyword_score": 2.5022950878841663,
    "_semantic_score": 0.33333333
    },
    {
    "query": "hello world",
    "_score": 1.7219400929592013,
    "content": "oceanbase Elasticsearch database",
    "_keyword_score": 0.7219400929592014,
    "_semantic_score": 1.0
    },
    {
    "query": "hello world, how are you",
    "_score": 1.0096539326751595,
    "content": "oceanbase oracle database",
    "_keyword_score": 0.7006369426751594,
    "_semantic_score": 0.30901699
    },
    {
    "query": "real world, how old are you",
    "_score": 0.41421356,
    "content": "redis oracle database",
    "_keyword_score": null,
    "_semantic_score": 0.41421356
    },
    {
    "query": "real world, where are you from",
    "_score": 0.30901699,
    "content": "postgres oracle database",
    "_keyword_score": null,
    "_semantic_score": 0.30901699
    }
    ] |

The result sets of full-text sub-queries and vector sub-queries use weighted fusion by default. You can configure the fusion method to RRF (Reciprocal Rank Fusion) ranking fusion through the Rank syntax. Some use cases are as follows:

  • Multi-dimensional ranking: Requiring comprehensive consideration of results from multiple search dimensions. For example, in academic search systems, when searching in a paper database, both keyword matching degree and semantic relevance need to be considered.
  • Fairness requirements: Ensuring that results from different search methods are reasonably displayed. For example, on e-commerce platforms, both textual information such as product titles and descriptions, and visual information such as product images and videos need to be considered.
  • Complex queries: Complex search scenarios involving multiple query conditions. For example, in medical systems, both patient symptom descriptions and patient medical history and examination results need to be considered.

Example:

Set search parameters.

SET @rrf_query_param = '{
"query": {
"query_string": {
"fields": ["title", "author", "description"],
"query": "fiction American Dream"
}
},
"knn" : {
"field": "vector_embedding",
"k": 5,
"query_vector": [0.1, 0.2, 0.3, 0.4]
},
"rank" : {
"rrf" : {
"rank_window_size" : 10,
"rank_constant" : 60
}
}
}';

The RRF algorithm calculates the final relevance score by fusing the rankings of multiple sub-query result sets. The calculation formula is as follows:

score = 0.0
for q in queries:
if d in result(q):
score += 1.0 / ( k + rank( result(q), d ) ) # K constant is the configured rank_constant
return score

Summary

The examples in this topic demonstrate the powerful application value of the hybrid search feature:

  • Intelligent search upgrade: Integrating semantic understanding into traditional keyword search to provide more accurate search results that better match user intent.
  • Optimized user experience: Supporting natural language queries, simplifying operations, and improving information retrieval efficiency.
  • Empowering diverse businesses: Widely applied in scenarios such as e-commerce, content management, knowledge bases, and intelligent customer service, achieving comprehensive coverage from basic filtering to intelligent recommendations.
  • Combined technical advantages: Combining exact matching with semantic understanding to significantly improve the accuracy and comprehensiveness of search results.

The hybrid search feature is an ideal choice for processing massive unstructured data and building intelligent search and recommendation systems.