Use SQL functions
This topic describes the vector functions supported by seekdb and the considerations for using them.
Considerations
-
Vectors with different dimensions are not allowed to perform the following operations. An error
different vector dimensions %d and %dis returned. -
When the result exceeds the floating-point number range, an error
value out of range: overflow / underflowis returned. -
Dense vector indexes support L2, inner product, and cosine distance as index distance algorithms. Memory-based sparse vector indexes support inner product as the distance algorithm. For details, see Create vector indexes.
-
Vector index search supports calling the
L2_distance,Cosine_distance, andInner_productdistance functions in this document.
Distance functions
Distance functions are used to calculate the distance between two vectors. The calculation method varies depending on the distance algorithm used.
L2_distance
Euclidean distance reflects the distance between the coordinates of the compared vectors -- essentially the straight-line distance between two vectors. It is calculated by applying the Pythagorean theorem to vector coordinates:

The function syntax is as follows:
l2_distance(vector v1, vector v2)
The parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]'). -
The dimensions of the two parameters must be the same.
-
If a single-level array type parameter exists, its elements cannot be
NULL.
The return values are described as follows:
-
The return value is a
distance(double)distance value. -
If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t1(c1 vector(3));
INSERT INTO t1 VALUES('[1,2,3]');
SELECT l2_distance(c1, [1,2,3]), l2_distance([1,2,3],[1,1,1]), l2_distance('[1,1,1]','[1,2,3]') FROM t1;
The return result is as follows:
+--------------------------+------------------------------+----------------------------------+
| l2_distance(c1, [1,2,3]) | l2_distance([1,2,3],[1,1,1]) | l2_distance('[1,1,1]','[1,2,3]') |
+--------------------------+------------------------------+----------------------------------+
| 0 | 2.23606797749979 | 2.23606797749979 |
+--------------------------+------------------------------+----------------------------------+
1 row in set
L2_squared
L2 Squared distance is the square of the Euclidean distance (L2 Distance). It omits the square root operation in the Euclidean distance formula, thereby reducing computational cost while maintaining the relative order of distances. The calculation method is as follows:

The syntax is as follows:
l2_squared(vector v1, vector v2)
The parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]'). -
The dimensions of the two parameters must be the same.
-
If a single-level array type parameter exists, its elements cannot be
NULL.
The return values are described as follows:
-
The return value is a
distance(double)distance value. -
If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t1(c1 vector(3));
INSERT INTO t1 VALUES('[1,2,3]');
SELECT l2_squared(c1, [1,2,3]), l2_squared([1,2,3],[1,1,1]), l2_squared('[1,1,1]','[1,2,3]') FROM t1;
The return result is as follows:
+-------------------------+-----------------------------+---------------------------------+
| l2_squared(c1, [1,2,3]) | l2_squared([1,2,3],[1,1,1]) | l2_squared('[1,1,1]','[1,2,3]') |
+-------------------------+-----------------------------+---------------------------------+
| 0 | 5 | 5 |
+-------------------------+-----------------------------+---------------------------------+
1 row in set
L1_distance
The Manhattan distance is used to calculate the sum of absolute axis distances between two points in a standard coordinate system. The calculation formula is as follows:

The function syntax is as follows:
l1_distance(vector v1, vector v2)
The parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]'). -
The dimensions of the two parameters must be the same.
-
If a single-level array type parameter exists, its elements cannot be
NULL.
The return values are described as follows:
-
The return value is a
distance(double)distance value. -
If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t2(c1 vector(3));
INSERT INTO t2 VALUES('[1,2,3]');
INSERT INTO t2 VALUES('[1,1,1]');
SELECT l1_distance(c1, [1,2,3]) FROM t2;
The return result is as follows:
+--------------------------+
| l1_distance(c1, [1,2,3]) |
+--------------------------+
| 0 |
| 3 |
+--------------------------+
2 rows in set
Cosine_distance
Cosine similarity measures the angular difference between two vectors and reflects their directional similarity, regardless of their lengths (magnitude). The value range of cosine similarity is [-1, 1], where 1 indicates vectors in exactly the same direction, 0 indicates orthogonality, and -1 indicates completely opposite directions.
The calculation method for cosine similarity is as follows:

Since cosine similarity closer to 1 indicates greater similarity, cosine distance (or cosine dissimilarity) is sometimes used as a measure of distance between vectors. Cosine distance can be calculated by subtracting cosine similarity from 1:

The value range of cosine distance is [0, 2], where 0 indicates exactly the same direction (no distance) and 2 indicates completely opposite directions.
The function syntax is as follows:
cosine_distance(vector v1, vector v2)
The parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]'). -
The dimensions of the two parameters must be the same.
-
If a single-level array type parameter exists, its elements cannot be
NULL.
The return values are described as follows:
-
The return value is a
distance(double)distance value. -
If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t3(c1 vector(3));
INSERT INTO t3 VALUES('[1,2,3]');
INSERT INTO t3 VALUES('[1,2,1]');
SELECT cosine_distance(c1, [1,2,3]) FROM t3;
+------------------------------+
| cosine_distance(c1, [1,2,3]) |
+------------------------------+
| 0 |
| 0.12712843905603044 |
+------------------------------+
2 rows in set
Inner_product
The inner product, also known as the dot product or scalar product, represents a type of multiplication between two vectors. In geometric terms, the inner product indicates the direction and magnitude relationship between two vectors. The calculation method for the inner product is as follows:

The syntax is as follows:
inner_product(vector v1, vector v2)
The parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]'). -
The dimensions of the two parameters must be the same.
-
If a single-level array type parameter exists, its elements cannot be
NULL. -
For sparse vectors using this function, one parameter can be a string in sparse vector format, such as
c2,'{1:2.4}'. Two parameters cannot both be strings.
The return values are described as follows:
-
The return value is a
distance(double)distance value. -
If any parameter is
NULL, the return value isNULL.
Dense vector example:
CREATE TABLE t4(c1 vector(3));
INSERT INTO t4 VALUES('[1,2,3]');
INSERT INTO t4 VALUES('[1,2,1]');
SELECT inner_product(c1, [1,2,3]) FROM t4;
The return result is as follows:
+----------------------------+
| inner_product(c1, [1,2,3]) |
+----------------------------+
| 14 |
| 8 |
+----------------------------+
2 rows in set
Sparse vector example:
CREATE TABLE t4(c1 INT, c2 SPARSEVECTOR, c3 SPARSEVECTOR);
INSERT INTO t4 VALUES(1, '{1:1.1, 2:2.2}', '{1:2.4}');
INSERT INTO t4 VALUES(2, '{1:1.5, 3:3.6}', '{4:4.5}');
SELECT inner_product(c2,c3) FROM t4;
The return result is as follows:
```shell
+----------------------+
| inner_product(c2,c3) |
+----------------------+
| 2.640000104904175 |
| 0 |
+----------------------+
2 rows in set
Vector_distance
The vector_distance function calculates the distance between two vectors. You can specify parameters to select different distance algorithms.
The syntax is as follows:
vector_distance(vector v1, vector v2 [, string metric])
The vector v1/v2 parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]'). -
The dimensions of the two parameters must be the same.
-
If a single-level array type parameter exists, its elements cannot be
NULL.
The metric parameter is used to specify the distance algorithm. Options:
-
If not specified, the default algorithm is
euclidean. -
If specified, the only valid values are:
-
euclidean. Represents Euclidean distance, same as L2_distance. -
manhattan. Represents Manhattan distance, same as L1_distance. -
cosine. Represents cosine distance, same as Cosine_distance. -
dot. Represents inner product, same as Inner_product.
-
The return values are described as follows:
-
The return value is a
distance(double)distance value. -
If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t5(c1 vector(3));
INSERT INTO t5 VALUES('[1,2,3]');
INSERT INTO t5 VALUES('[1,2,1]');
SELECT vector_distance(c1, [1,2,3], euclidean) FROM t5;
The return result is as follows:
+-----------------------------------------+
| vector_distance(c1, [1,2,3], euclidean) |
+-----------------------------------------+
| 0 |
| 2 |
+-----------------------------------------+
2 rows in set
Arithmetic functions
Arithmetic functions provide element-wise addition (+), subtraction (-), and multiplication (*) operations between vector types and vector types, single-level array types, and special string types, as well as between single-level array types and single-level array types, and special string types. The calculation method is element-wise, as shown for addition:

The syntax is as follows:
v1 + v2
v1 - v2
v1 * v2
The parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]'). Note: Two parameters cannot both be string types. At least one parameter must be a vector or single-level array type. -
The dimensions of the two parameters must be the same.
-
If a single-level array type parameter exists, its elements cannot be
NULL.
The return values are described as follows:
-
When at least one of the two parameters is a vector type, the return value is of the same vector type as the vector parameter.
-
When both parameters are single-level array types, the return value is of the
array(float)type. -
If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t6(c1 vector(3));
INSERT INTO t6 VALUES('[1,2,3]');
SELECT [1,2,3] + '[1.12,1000.0001, -1.2222]', c1 - [1,2,3] FROM t6;
The return result is as follows:
+---------------------------------------+--------------+
| [1,2,3] + '[1.12,1000.0001, -1.2222]' | c1 - [1,2,3] |
+---------------------------------------+--------------+
| [2.12,1002,1.7778] | [0,0,0] |
+---------------------------------------+--------------+
1 row in set
Comparison functions
Comparison functions provide comparison calculations between vector types and vector types, single-level array types, and special string types, including the comparison operators =, !=, >, >=, <, and <=. The calculation method is element-wise dictionary order comparison.
The syntax is as follows:
v1 = v2
v1 != v2
v1 > v2
v1 < v2
v1 >= v2
v1 <= v2
The parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]').tipOne of the two parameters must be a vector type.
-
The dimensions of the two parameters must be the same.
-
If a single-level array type parameter exists, its elements cannot be
NULL.
The return values are described as follows:
-
The return value is of the bool type.
-
If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t7(c1 vector(3));
INSERT INTO t7 VALUES('[1,2,3]');
SELECT c1 = '[1,2,3]' FROM t7;
The return result is as follows:
+----------------+
| c1 = '[1,2,3]' |
+----------------+
| 1 |
+----------------+
1 row in set
Aggregate functions
Vector columns cannot be used as GROUP BY conditions, and DISTINCT is not supported.
Sum
The Sum function is used to calculate the sum of vectors in a vector column of a table, using element-wise accumulation to obtain the sum vector.
The syntax is as follows:
sum(vector v1)
The parameters are described as follows:
- Only the vector type is supported.
The return values are described as follows:
- The return value is a
sum (vector)value.
Here is an example:
CREATE TABLE t8(c1 vector(3));
INSERT INTO t8 VALUES('[1,2,3]');
SELECT sum(c1) FROM t8;
The return result is as follows:
+---------+
| sum(c1) |
+---------+
| [1,2,3] |
+---------+
1 row in set
Avg
The Avg function is used to calculate the average of vectors in a vector column of a table.
The syntax is as follows:
avg(vector v1)
The parameters are described as follows:
- Only the vector type is supported.
The return values are described as follows:
-
The return value is an
avg (vector)value. -
NULLrows in the vector column are not counted. -
When the input parameter is empty, the output is
NULL.
Here is an example:
CREATE TABLE t9(c1 vector(3));
INSERT INTO t9 VALUES('[1,2,3]');
SELECT avg(c1) FROM t9;
The return result is as follows:
+---------+
| avg(c1) |
+---------+
| [1,2,3] |
+---------+
1 row in set
Other common vector functions
Vector_norm
The Vector_norm function calculates the Euclidean norm (or length) of a vector, which represents the Euclidean distance between the vector and the origin. The calculation formula is as follows:

The syntax is as follows:
vector_norm(vector v1)
The parameters are described as follows:
-
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]'). -
If a single-level array type parameter exists, its elements cannot be
NULL.
The return values are described as follows:
-
The return value is a
norm(double)modulus value. -
If the parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t10(c1 vector(3));
INSERT INTO t10 VALUES('[1,2,3]');
SELECT vector_norm(c1),vector_norm([1,2,3]) FROM t10;
The return result is as follows:
+--------------------+----------------------+
| vector_norm(c1) | vector_norm([1,2,3]) |
+--------------------+----------------------+
| 3.7416573867739413 | 3.7416573867739413 |
+--------------------+----------------------+
1 row in set
Vector_dims
The Vector_dims function is used to return the vector dimension.
The syntax is as follows:
vector_dims(vector v1)
The parameters are described as follows:
- Apart from the vector type, other types that can be forcibly converted to the vector type are accepted, including single-level array types (such as
[1,2,3,..]) and string types (such as'[1,2,3]').
The return values are described as follows:
-
The return value is a
dims(int64)dimension value. -
If the parameter is
NULL, an error is returned.
Here is an example:
CREATE TABLE t11(c1 vector(3));
INSERT INTO t11 VALUES('[1,2,3]');
INSERT INTO t11 VALUES('[1,1,1]');
SELECT vector_dims(c1), vector_dims('[1,2,3]') FROM t11;
The return result is as follows:
+-----------------+------------------------+
| vector_dims(c1) | vector_dims('[1,2,3]') |
+-----------------+------------------------+
| 3 | 3 |
| 3 | 3 |
+-----------------+------------------------+
2 rows in set