Store vector data
This topic describes how to store unstructured, semi-structured, and structured data together in seekdb. Unified storage builds on the foundational capabilities of seekdb and provides the basis for hybrid search.
How it works
seekdb stores data of different modalities and supports hybrid search by converting data such as text, images, and videos into vectors. A search is performed by calculating the distances between these vectors. Hybrid search can be divided into two types: simple search, which is a similarity search on a single vector, and complex search, which combines vector search with scalar search.
Because vector search is inherently approximate, practical applications typically combine several techniques to improve accuracy. The more accurate the search results, the more value they deliver to your business.
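The distance-based ranking described above can be sketched outside the database. The following Python snippet is only an illustration, not how seekdb implements search: it ranks rows by exact brute-force Euclidean distance, whereas an indexed vector search approximates this ranking. The in-memory `rows` list, the `category` column, and the function names are hypothetical.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical rows: a scalar column (category) stored next to a vector column (vec).
rows = [
    {"id": 1, "category": "a", "vec": [0.1, 0.2, 0.3]},
    {"id": 2, "category": "b", "vec": [0.9, 0.8, 0.7]},
    {"id": 3, "category": "a", "vec": [0.2, 0.1, 0.4]},
]

def simple_search(query, k):
    """Simple search: rank every row by distance to a single query vector."""
    ranked = sorted(rows, key=lambda r: l2_distance(r["vec"], query))
    return [r["id"] for r in ranked[:k]]

def complex_search(query, category, k):
    """Complex search: apply a scalar filter, then rank the remaining rows by distance."""
    candidates = [r for r in rows if r["category"] == category]
    ranked = sorted(candidates, key=lambda r: l2_distance(r["vec"], query))
    return [r["id"] for r in ranked[:k]]
```

In this sketch, `complex_search` filters before ranking. Filtering after ranking is another possible design, and which one a database uses depends on its query planner.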
Create a vector column
The following example creates a table that stores vector data, spatial data, and relational data. The vector column uses the VECTOR data type, and you must specify its dimension when you create the column. The maximum supported dimension is 16,000. The spatial column uses the GEOMETRY data type.
CREATE TABLE t (
  -- Store relational data (structured data).
  id INT PRIMARY KEY,
  -- Store spatial data (semi-structured data).
  g GEOMETRY,
  -- Store vector data (unstructured data).
  vec VECTOR(3)
);
Use the INSERT statement to insert vector data
After you create a table that contains a column of the VECTOR data type, you can use the INSERT statement to insert vectors directly into the table. The inserted vector must have the same dimension as the one specified when the column was created. Otherwise, an error is returned. This check keeps the data consistent and queries efficient. A vector is written as an array of floating-point numbers, and each dimension must hold a valid floating-point number. Here is a simple example:
INSERT INTO t (id, g, vec) VALUES (
  -- Insert structured data.
  1,
  -- Insert semi-structured data.
  ST_GeomFromText('POINT(1 1)'),
  -- Insert unstructured data.
  '[0.1, 0.2, 0.3]'
);
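Because a dimension mismatch is rejected at insert time, it can help to check vectors on the client before sending them. The following Python helper is a minimal sketch under stated assumptions: the name `to_vector_literal` is hypothetical, the 16,000 limit is taken from this topic, and treating NaN and infinity as invalid is an assumption rather than documented seekdb behavior.

```python
import math

MAX_DIMENSION = 16000  # maximum dimension stated in this topic

def to_vector_literal(values, dimension):
    """Return a string such as '[0.1, 0.2, 0.3]' after basic client-side checks."""
    if not 1 <= dimension <= MAX_DIMENSION:
        raise ValueError(f"dimension must be between 1 and {MAX_DIMENSION}")
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(values)}")
    floats = [float(v) for v in values]
    # Assumption: only finite numbers count as valid floating-point values.
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("every dimension must be a finite number")
    return "[" + ", ".join(repr(v) for v in floats) + "]"
```

The returned string has the same shape as the literal in the INSERT example above and could be passed as a bound parameter through whichever client driver you use.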