Skip to main content

Use SQL to perform vector search

seekdb supports vector search using SQL. This topic describes how to perform vector search.

Quick start

Create a vector column and index

When you create a table, you can use the VECTOR(dim) data type to declare a column as a vector column and specify its dimension. Vector indexes must be created on vector columns and must include at least the type and distance parameters.

The following example creates a vector column named embedding with a dimension of 3 and creates an HNSW index on the embedding column, specifying the L2 distance algorithm.

CREATE TABLE t1(
id INT PRIMARY KEY,
doc VARCHAR(200),
embedding VECTOR(3),
VECTOR INDEX idx1(embedding) WITH (distance=L2, type=hnsw)
);

In scenarios with large amounts of data, it is recommended to import the data before creating the vector index.

Write vector data

To simulate a vector search scenario, construct some vector data. Each row of data includes a description of the data and the corresponding vector. For example, assume that the vector for 'apple' is '[1.2,0.7,1.1]' and the vector for 'carrot' is '[5.3,4.8,5.4]'.

INSERT INTO t1
VALUES (1, 'apple', '[1.2,0.7,1.1]'),
(2, 'banana', '[0.6,1.2,0.8]'),
(3, 'orange','[1.1,1.1,0.9]'),
(4, 'carrot', '[5.3,4.8,5.4]'),
(5, 'spinach', '[4.9,5.3,4.8]'),
(6, 'tomato','[5.2,4.9,5.1]');

For the sake of simplicity, this example uses 3-dimensional vectors, which are manually generated. In practical applications, you need to use an embedding model to generate vectors for real text, which can have hundreds or even thousands of dimensions.

You can query the table data to verify whether the data has been written.

SELECT * FROM t1;

The expected result is as follows:

+----+-----------+---------------+
| id | doc | embedding |
+----+-----------+---------------+
| 1 | apple | [1.2,0.7,1.1] |
| 2 | banana | [0.6,1.2,0.8] |
| 3 | orange | [1.1,1.1,0.9] |
| 4 | carrot | [5.3,4.8,5.4] |
| 5 | spinach | [4.9,5.3,4.8] |
| 6 | tomato | [5.2,4.9,5.1] |
+----+-----------+---------------+
6 rows in set

To perform vector search, provide a vector as the search condition. For example, if you want to find all 'fruits' with the vector [0.9, 1.0, 0.9], the corresponding SQL statement is:

SELECT id, doc FROM t1
ORDER BY l2_distance(embedding, '[0.9, 1.0, 0.9]')
APPROXIMATE LIMIT 3;

The expected result is as follows:

+----+--------+
| id | doc |
+----+--------+
| 3 | orange |
| 2 | banana |
| 1 | apple |
+----+--------+
3 rows in set

References