Version: V1.1.0

Vector index selection recommendations

seekdb provides vector indexes with various algorithms. You can select an appropriate index type based on your use case.

note

This topic provides recommendations for selecting and optimizing dense vector indexes.

HNSW or IVF?

The dense vector index of seekdb is divided into two major categories:

Graph-based HNSW series indexes: HNSW, HNSW_SQ, and HNSW_BQ.
Disk-based IVF series indexes: IVF and IVF_PQ.

Each index type has its own strengths. The HNSW series indexes usually provide higher query performance but require more resident memory. The IVF series indexes perform well when cache is sufficient and can run without relying on resident memory. However, the choice between HNSW and IVF is not solely based on memory considerations. You must evaluate multiple factors such as business scenarios, data scale, performance metrics, and resource constraints. The following sections compare the core differences between the two index types and provide recommendations.

Core differences

Dimension	HNSW series	IVF series
Storage method	In-memory graph-based index	Disk-based index
Memory usage	Requires complete loading into memory, high memory usage	Does not require resident memory, low memory usage
Query performance (QPS)	Very high, with millisecond-level response	Relatively high, close to HNSW when cache is sufficient
Recall rate	High, up to 99%	Relatively high, slightly lower than HNSW, can be optimized by parameters
Build speed	Slower, requires building the graph structure	Faster, based on clustering algorithms
Applicable data volume	Millions to billions	Millions to tens of billions
Cost	High memory cost	Low storage cost, suitable for large-scale data
Real-time capability	Supports real-time DML operations	Supports real-time DML operations

Decision flowchart

note

Before selecting an index type, estimate the memory usage as described in Vector index memory management.
The recommendations in the decision flowchart are based on 1024-dimensional vectors. If your vectors have a different dimension, scale the required resources roughly in proportion to the dimension.
The decision flowchart primarily considers memory cost to help you decide on the index type.
Even if the tenant has sufficient memory, the highest-specification index, such as HNSW, is not always the best choice. Even when you require extremely high performance, consider a more cost-effective option such as HNSW_SQ.
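
The proportional scaling mentioned in the note can be sketched as follows. This is a minimal illustration, not a seekdb tool: the function name and the 100 GB input are hypothetical, and real usage may not scale perfectly linearly with dimension.

```python
REFERENCE_DIMS = 1024  # the flowchart recommendations assume 1024-dimensional vectors


def scale_by_dimension(value_at_reference: float, actual_dims: int) -> float:
    """Scale a resource estimate made for 1024 dimensions to another dimension."""
    if actual_dims <= 0:
        raise ValueError("actual_dims must be positive")
    return value_at_reference * actual_dims / REFERENCE_DIMS


# Hypothetical input: an estimate of 100 GB at 1024 dimensions, rescaled to 768.
print(scale_by_dimension(100.0, 768))
```

Treat the result as a first approximation and confirm it with the memory estimation described in Vector index memory management.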

[Figure: Decision flowchart]

note

The following sections only list some common scenarios. If you cannot determine the index type through the above flowchart or have other requirements, please contact seekdb technical support.

Should I use a partitioned table?

The primary purpose of using a partitioned table is to handle large-scale data scenarios. Additionally, if the query conditions can be used as a partition key, partition pruning can enhance query performance. It is recommended to use a partitioned table in the following two scenarios:

When the data volume reaches tens of millions or even hundreds of millions: A partitioned table distributes the data across multiple partitions, each with its own independent index, which reduces the load of a single query and improves overall query performance.
When the query conditions include a scalar column that can be used for partition pruning: For example, if the label field is always present in the WHERE clause, you can consider creating a partitioned table with label as the partition key. This way, partition pruning can reduce the number of partitions that need to be queried.

The specific recommendations are as follows:

Partitioning

When using vector indexes, avoid creating too many partitions. Unlike scalar indexes, the cost of a top-K query on a vector index (such as HNSW) grows little as the index grows: under the same configuration, going from 1 million to 2 million vectors does not significantly increase the computational overhead. A query that cannot use partition pruning has to search every partition, so splitting the data into many small partitions may actually reduce performance. Conversely, a single partition that is too large increases the time required to rebuild the index and reduces the efficiency of queries that combine vector search with scalar conditions.

In summary, it is recommended to keep the data volume in each partition below 20 million and prioritize selecting a field that supports partition pruning as the partition key.
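
As a rough illustration of the 20 million guideline, the sketch below computes the smallest partition count that keeps every partition below the limit, assuming the data is spread evenly. The function name is hypothetical and the result is only a lower bound; a partition key that supports pruning still takes priority.

```python
MAX_ROWS_PER_PARTITION = 20_000_000  # guideline from this section


def min_partition_count(total_rows: int,
                        max_rows: int = MAX_ROWS_PER_PARTITION) -> int:
    """Smallest partition count that keeps each partition strictly below max_rows.

    Assumes rows are distributed evenly across partitions.
    """
    if total_rows < 0 or max_rows <= 0:
        raise ValueError("total_rows must be >= 0 and max_rows must be > 0")
    # Integer division plus one guarantees total_rows / count < max_rows.
    return total_rows // max_rows + 1


for rows in (10_000_000, 100_000_000):
    print(rows, min_partition_count(rows))
```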

Algorithm Selection

For large-scale data, it is recommended to use the HNSW_BQ or IVF_PQ index. If you need to use other indexes, refer to the Memory Usage section for estimation.

Memory Usage

HNSW_BQ

For the HNSW_BQ index, it is recommended that tenant memory > total memory required for HNSW_BQ queries + memory required for a single partition's HNSW_SQ index. The memory estimation process is as follows:

Use the INDEX_VECTOR_MEMORY_ADVISOR function to calculate the recommended memory values for HNSW_BQ and HNSW_SQ indexes during construction and querying.
Based on the recommended memory values obtained from the above tool, calculate and determine the total memory required for the tenant.

For example, assuming 100 million 1024-dimensional vector data with 10 partitions (approximately 10 million vectors per partition), the calculation process is as follows:

-- Specify REFINE_TYPE=SQ8 to directly and accurately obtain the recommended memory values for the HNSW_BQ index during construction and querying.
-- This is because the HNSW_BQ index is constructed using the SQ8 quantization algorithm by default.
-- After specifying refine_type=sq8, the function automatically includes the memory required for SQ8 quantized vectors in the calculation.
-- The recommended memory value for the HNSW_BQ index is 74.6 GB, and the memory consumption during querying is 57.4 GB. We will use the recommended memory value.
SELECT DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR('HNSW_BQ',100000000,1024,'FLOAT32','M=32,DISTANCE=COSINE,REFINE_TYPE=SQ8', 10000000);
+------------------------------------------------------------------------------------------------------------------------------+
| DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR('HNSW_BQ',100000000,1024,'FLOAT32','M=32,DISTANCE=COSINE,REFINE_TYPE=SQ8', 10000000) |
+------------------------------------------------------------------------------------------------------------------------------+
| Suggested minimum vector memory is 74.6 GB, memory consumption when providing search service is 57.4 GB                      |
+------------------------------------------------------------------------------------------------------------------------------+
1 row in set

-- By default, the vector memory occupies 50% of the tenant memory. Therefore, the total tenant memory is
SELECT 74.6/0.5;
+--------------------------------------------------------------------------------------------------------------+
| 74.6/0.5 = 149.2 GB |
+--------------------------------------------------------------------------------------------------------------+
1 row in set

-- Considering that new data may not be compressed in actual environments, it is recommended to reserve some redundancy.
SELECT 149.2 * 1.2;
+--------------------------------------------------------------------------------------------------------------+
| 149.2 * 1.2 = 179.04 GB |
+--------------------------------------------------------------------------------------------------------------+
1 row in set

-- Therefore, it is recommended to configure the tenant memory to 179 GB.
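
The arithmetic in the example above can be expressed as a small helper. This is a sketch under the example's assumptions: the 50% vector memory share and the 20% headroom come from the comments above, while the function and parameter names are hypothetical.

```python
def recommended_tenant_memory_gb(suggested_vector_memory_gb: float,
                                 vector_memory_ratio: float = 0.5,
                                 headroom: float = 1.2) -> float:
    """Convert the advisor's suggested vector memory into a tenant memory size.

    vector_memory_ratio: share of tenant memory available to vector indexes.
    headroom: multiplier that reserves redundancy for uncompressed new data.
    """
    if not 0 < vector_memory_ratio <= 1:
        raise ValueError("vector_memory_ratio must be in (0, 1]")
    return suggested_vector_memory_gb / vector_memory_ratio * headroom


# 74.6 GB is the suggested minimum vector memory from the advisor output above.
print(round(recommended_tenant_memory_gb(74.6), 2))
```

Passing a different ratio or headroom lets you reuse the same calculation when the tenant is configured differently.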

IVF Series

For IVF and IVF_PQ indexes, it is recommended that tenant memory > memory required for constructing a single partition + total memory required for all partitions. For example, with 100 million data records and 10 partitions (approximately 10 million vectors per partition), constructing a single partition requires about 2.7 GB of memory, and the total memory required for querying the 10 partitions' IVF_PQ indexes is approximately 1.1 GB (110 MB per partition). Therefore, a minimum of about 3 GB of memory is required. It is recommended to reserve some extra memory, and 6 GB is suggested. Other data volume scenarios can be estimated using the same method.

Construction and Query Parameters

The index construction and query parameters should be set based on the maximum data volume of a single partition. For more details, see the Index parameter recommendations section.

Performance and Recall Rate

If the query conditions can be pruned to a single partition: The performance and recall rate will be consistent with the single-partition scenario. For more details, see the section on different data volume scales.
If the query conditions cannot be pruned to a single partition: The QPS can be estimated proportionally based on the number of partitions on a single seekdb node (e.g., if a single node has 3 partitions, the QPS will be approximately 1/3 of the single-partition performance). Since cross-partition queries merge more candidate results, the actual recall rate is usually higher than in the single-partition scenario.
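
The proportional QPS estimate described above can be sketched as follows. The function name and the 900 QPS figure are hypothetical; measure the single-partition QPS on your own workload.

```python
def estimated_qps(single_partition_qps: float,
                  partitions_per_node: int,
                  pruned_to_single_partition: bool = False) -> float:
    """Rough QPS estimate for a partitioned table on one node."""
    if partitions_per_node < 1:
        raise ValueError("partitions_per_node must be at least 1")
    if pruned_to_single_partition:
        # Pruning to one partition behaves like the single-partition scenario.
        return single_partition_qps
    # Without pruning, every partition on the node is searched for each query.
    return single_partition_qps / partitions_per_node


# Hypothetical measurement: 900 QPS on a single partition, 3 partitions per node.
print(estimated_qps(900.0, 3))
print(estimated_qps(900.0, 3, pruned_to_single_partition=True))
```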

Index parameter recommendations

HNSW series

The recommended construction and query parameters for indexes in the same series vary with the data volume. This section provides suggested configurations for HNSW, HNSW_SQ, and HNSW_BQ indexes on 768-dimensional vectors at 1 million and 10 million data points. For vector data with billions of data points, refer to the later sections on IVF_PQ indexes or to HNSW_BQ indexes on partitioned tables.

note

For scenarios where data volume is expected to grow, it is recommended to set parameters based on the final data volume.

tip

HNSW_BQ is a high-compression quantization algorithm, so its recall rate may be relatively low for low-dimensional vectors. To avoid this loss, it is recommended to use HNSW_BQ only for vectors with 512 or more dimensions.

Scenario	Index Type	Parameter Recommendation
Highest Recall (Maximum Memory Usage)	HNSW	For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, other parameters default
Highest Performance (Minimum Memory Usage)	HNSW_SQ	For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, other parameters default; for 10 million data points: m = 32, ef_construction = 400, ef_search = 350, other parameters default
Best Cost-Performance (Low Memory Usage, Good Performance)	HNSW_BQ	For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, other parameters default; for 10 million data points: m = 32, ef_construction = 400, ef_search = 1000, refine_k = 10, other parameters default; for 1 billion data points: use partitioned tables, m = 32, ef_construction = 400, ef_search = 1000, refine_k = 10, other parameters default

Detailed Information for 1 Million Data Points

Example: 1 million 768-dimensional vectors (using the construction parameters in the table above). Additional information on memory usage and recall rate optimization:

Memory Usage:

Index Type	Recommended Tenant Memory	Description
HNSW	15 GB	The recommended size of the vector index memory is 7.3 GB. If the tenant memory exceeds 8 GB, the vector index can use up to 50% of the tenant memory by default. If the tenant memory is 8 GB or less, the vector index can use up to 40% of the tenant memory by default. Therefore, approximately 15 GB of tenant memory is required.
HNSW_SQ	6 GB	The recommended size of the vector index memory is 2.1 GB.
HNSW_BQ	6 GB	During the construction of the HNSW_BQ index, high-precision vectors are required, so the memory requirements are the same as those for HNSW_SQ, which is 6 GB. The HNSW_BQ index uses the SQ8 quantization method by default. After the index is built, the memory usage decreases significantly to about 405 MB. The above information applies to non-partitioned tables. If partitioned tables are used, seekdb dynamically adjusts the number of partitions that can be built simultaneously based on the tenant's allocated memory. When configuring, it is recommended to ensure that the tenant memory includes the memory required for HNSW_BQ queries and the memory required for HNSW_SQ during the construction of a partition, with some redundancy. For specific calculation methods, refer to the Memory Usage section above.
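
The tenant memory figures in the table follow from the default share of tenant memory that vector indexes may use (40% for tenants of 8 GB or less, 50% above that). The sketch below reproduces that reasoning; the function and constant names are hypothetical, and the table values are these results rounded up.

```python
SMALL_TENANT_LIMIT_GB = 8.0  # tenants at or below this size use the smaller ratio
SMALL_TENANT_RATIO = 0.4     # default vector memory share for tenants <= 8 GB
LARGE_TENANT_RATIO = 0.5     # default vector memory share for tenants > 8 GB


def min_tenant_memory_gb(vector_index_memory_gb: float) -> float:
    """Smallest tenant memory whose default vector share covers the index."""
    if vector_index_memory_gb < 0:
        raise ValueError("vector_index_memory_gb must be non-negative")
    if vector_index_memory_gb <= SMALL_TENANT_LIMIT_GB * SMALL_TENANT_RATIO:
        return vector_index_memory_gb / SMALL_TENANT_RATIO
    # Beyond 3.2 GB of index memory the tenant has to be larger than 8 GB.
    # A result of exactly 8.0 means "slightly more than 8 GB".
    return max(vector_index_memory_gb / LARGE_TENANT_RATIO, SMALL_TENANT_LIMIT_GB)


print(min_tenant_memory_gb(7.3))  # HNSW, 1 million vectors
print(min_tenant_memory_gb(2.1))  # HNSW_SQ, 1 million vectors
```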

Recall Rate Optimization:

By adjusting the ef_search and refine_k (only for HNSW_BQ) parameters, you can increase the number of vector calculations to improve the recall rate, but this will reduce query performance. For different TopN values, you can set the parameters to the recommended values in the table below. If you need to further improve the recall rate, you can set the parameter values to higher values.

tip

The recall rate is directly related to the data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where the recall rate reaches approximately 0.95.

TopN	ef_search	refine_k (only for HNSW_BQ)
Top10	64	4
Top100	240	4

Maximum Recall Rate:

The maximum achievable recall rate differs among index algorithms. Under the recommended construction parameters in this section, setting ef_search to 1000 may improve the query recall rate, but QPS drops to one-third of its value at a 0.95 recall rate. Increasing ef_search does not raise the recall rate without limit for every index type: only the HNSW index can achieve a recall rate of 0.99 or higher. For the HNSW_BQ index, increasing refine_k can raise the recall rate further, at the cost of lower query performance.

Maximum Recall Rate (ef_search = 1000):

HNSW Index: Recall Rate 0.991
HNSW_SQ Index: Recall Rate 0.9786
HNSW_BQ Index: Recall Rate 0.9897 (ef_search = 1000, refine_k = 10)

Detailed Information for 10 Million Data Points

Example: 10 million 768-dimensional vectors (using the construction parameters in the table above). Additional information on memory usage and recall rate optimization:

Memory Usage:

Index Type	Recommended Tenant Memory	Description
HNSW	160 GB	The recommended size of the vector index memory is 76.3 GB.
HNSW_SQ	48 GB	The recommended size of the vector index memory is 22.6 GB.
HNSW_BQ	48 GB	During the construction of the HNSW_BQ index, high-precision vectors are required. The HNSW_BQ index uses HNSW_SQ as the cache during index construction by default. Therefore, for non-partitioned tables, the memory required for HNSW_BQ is the same as that for HNSW_SQ. After construction, the HNSW_BQ index only occupies about 5.4 GB of memory.

Recall Rate Optimization:

TopN	ef_search	refine_k (only for HNSW_BQ)
Top10	100	4
Top100 (HNSW/HNSW_SQ)	350	-
Top100 (HNSW_BQ)	1000	10

IVF series

Scenario	Index Type	Parameter Recommendations
Low-dimensional (384 dimensions or fewer)	IVF	For 10 million data points: use a partitioned table with nlist=3000; for 100 million data points: use a partitioned table with nlist=3000
Low-cost (Minimal memory usage)	IVF_PQ	For 1 million data points: nlist=1000, m=vector dimension/2; for 10 million data points: nlist=3000, m=vector dimension/2; for 100 million data points: use a partitioned table with nlist=3000, m=vector dimension/2

To balance the number of cluster centers and the amount of data per center, we recommend setting nlist to the square root of the data volume. For example, for 10 million data points, we recommend setting nlist to around 3000. When using IVF_PQ, we recommend setting m to half of the vector dimension (dim).
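
The two rules of thumb above can be sketched as follows. The function names are hypothetical, and the optional partitions argument assumes evenly sized partitions, each with its own local index. The tables in this topic round the 10 million result down to 3000.

```python
import math


def recommended_nlist(total_rows: int, partitions: int = 1) -> int:
    """Square-root rule applied to the number of rows in one partition."""
    if total_rows <= 0 or partitions < 1:
        raise ValueError("total_rows must be > 0 and partitions must be >= 1")
    rows_per_partition = total_rows / partitions
    return round(math.sqrt(rows_per_partition))


def recommended_pq_m(dim: int) -> int:
    """Half of the vector dimension, as suggested for IVF_PQ."""
    if dim < 2:
        raise ValueError("dim must be at least 2")
    return dim // 2


print(recommended_nlist(1_000_000))        # 1 million rows, no partitioning
print(recommended_nlist(100_000_000, 10))  # 100 million rows in 10 partitions
print(recommended_pq_m(768))               # 768-dimensional vectors
```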

tip

IVF_PQ is a high-compression quantization algorithm, so its recall rate may be relatively low for low-dimensional vectors. To avoid this loss, we recommend using IVF_PQ only for vectors with 128 or more dimensions.

note

For scenarios where the data volume is expected to grow, we recommend setting parameters based on the final data volume.

10 million data points (with partitioned tables)

For 10 million 768-dimensional vector data points (using the parameters in the table above), we provide additional details on memory usage and recall rate optimization:

Memory Usage:

Index Type	Index Parameters	Memory Usage (Build Time / Resident Time)
IVF	distance=l2, nlist=3000	2.7 GB / 10.5 MB
IVF_PQ	distance=l2, nlist=3000, m=384	4.0 GB / 1.3 GB
IVF_PQ	distance=cosine, nlist=3000, m=384	2.7 GB / 11.4 MB

In the table, Build Time is the temporary memory used while the index is being created; it is released once the build finishes. Resident Time is the memory that the IVF vector index occupies continuously after the index is built.

For IVF_PQ indexes, if you choose distance=l2, the resident memory will be higher due to the need to store additional precomputed results. In contrast, using distance=inner_product or cosine consumes less resident memory. Therefore, in practical applications, we recommend prioritizing inner_product or cosine as the distance type to optimize memory resources.

Recall Rate Optimization:

By adjusting the nprobes parameter, you can increase the recall rate through more vector calculations, but this will decrease query performance. For different TopN values, you can set the parameter to the recommended values in the table below. If you need to further improve the recall rate, you can set the parameter to a larger value.

tip

Recall rate is directly related to data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where the recall rate reaches around 0.95.

TopN	nprobes
Top10	1
Top100	20

100 million data points (with partitioned tables)

When the vector data volume reaches 100 million or more, we strongly recommend using partitioned tables in combination with IVF-based indexes. As the data scale and nlist parameter increase, the query overhead of a single IVF index will significantly increase. By splitting the data into multiple partitions and building smaller-scale IVF indexes for each partition, we can effectively reduce the query load and further improve overall query performance and recall rate through parallel queries across partitions.

In a multi-partition table, IVF indexes are local, so each partition builds its own IVF index independently. Therefore, we recommend calculating nlist from the average data volume of each partition. For example, if 100 million 768-dimensional vectors are split into 10 partitions, each partition contains approximately 10 million vectors, so the recommended nlist is sqrt(10 million) ≈ 3162.

For 100 million 768-dimensional vector data points (using the parameters in the table above), we provide additional details on memory usage and recall rate optimization:

Memory Usage:

In a multi-partitioned scenario, since IVF indexes are local, each partition will independently build and maintain an IVF index. Therefore, the total resident memory usage needs to be summed up based on the number of partitions. For example, if a single IVF index has a resident memory usage of 10.5 MB, and there are 10 partitions, the total resident memory usage would be approximately 10.5 × 10 = 105 MB.

Index Type	Index Parameters	Memory Usage (Build Time / Resident Time)
IVF	distance=l2, nlist=3000	2.7 GB / 105 MB (10.5 MB × 10)
IVF_PQ	distance=l2, nlist=3000, m=384	4.0 GB / 13 GB (1.3 GB × 10)
IVF_PQ	distance=cosine, nlist=3000, m=384	2.7 GB / 114 MB (11.4 MB × 10)
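
Combining these figures with the IVF sizing rule from the Memory Usage section (build memory for one partition plus resident memory for all partitions) gives a lower bound on memory. The sketch below is illustrative only: the function name is hypothetical, and it assumes 1 GB = 1024 MB and that only one partition is built at a time.

```python
MB_PER_GB = 1024  # assumption: binary units


def ivf_min_memory_gb(build_gb_single_partition: float,
                      resident_gb_per_partition: float,
                      partitions: int) -> float:
    """Lower bound: one partition's build memory plus all resident indexes."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    total_resident_gb = resident_gb_per_partition * partitions
    return build_gb_single_partition + total_resident_gb


# IVF, distance=l2: 2.7 GB to build, 10.5 MB resident per partition.
print(round(ivf_min_memory_gb(2.7, 10.5 / MB_PER_GB, 10), 2))
# IVF_PQ, distance=l2: 4.0 GB to build, 1.3 GB resident per partition.
print(round(ivf_min_memory_gb(4.0, 1.3, 10), 2))
```

The result is a minimum; reserve extra memory on top of it, as recommended earlier in this topic.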

Recall Rate Optimization:

In a partitioned table, each partition executes its IVF index query independently. When a query spans multiple partitions, the system retrieves the TopN results from each partition and then aggregates and re-sorts them. Because more candidates are considered, the actual recall rate is typically higher than in a single-partition scenario. Therefore, in a multi-partition table, you can reduce the nprobes parameter appropriately and still achieve a recall rate comparable to that of a single-partition table.
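
The aggregate-and-re-sort step can be illustrated with a small sketch. This is not seekdb's implementation; the function name, the (distance, row_id) tuple layout, and the sample data are assumptions for illustration, with a smaller distance meaning a closer match.

```python
import heapq
from typing import Iterable, List, Tuple

Hit = Tuple[float, str]  # (distance, row_id); smaller distance is closer


def merge_partition_results(per_partition_hits: Iterable[List[Hit]],
                            top_n: int) -> List[Hit]:
    """Merge per-partition TopN lists and keep the global TopN by distance."""
    all_hits = (hit for hits in per_partition_hits for hit in hits)
    return heapq.nsmallest(top_n, all_hits)


# Hypothetical Top2 results returned by two partitions.
partition_0 = [(0.10, "a"), (0.40, "b")]
partition_1 = [(0.05, "c"), (0.30, "d")]
print(merge_partition_results([partition_0, partition_1], top_n=2))
```

Each partition contributes its own TopN candidates, so the merged pool is larger than what a single index would return, which is why the final TopN tends to be more accurate.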

tip

Recall rate is directly related to data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where the recall rate reaches around 0.95.

TopN	nprobes
Top10	1
Top100	10

References

For more information about how to create an index, see Create an index.
