Vector index selection recommendations
seekdb provides vector indexes based on several algorithms. You can select an appropriate index type based on your use case.
This topic provides recommendations for selecting and optimizing dense vector indexes.
HNSW or IVF?
The dense vector indexes of seekdb fall into two major categories:
- Graph-based HNSW series indexes: HNSW, HNSW_SQ, and HNSW_BQ.
- Disk-based IVF series indexes: IVF and IVF_PQ.
Each index type has its own strengths. The HNSW series indexes usually provide higher query performance but require more resident memory. The IVF series indexes perform well when cache is sufficient and can run without relying on resident memory. However, the choice between HNSW and IVF is not solely based on memory considerations. You must evaluate multiple factors such as business scenarios, data scale, performance metrics, and resource constraints. The following sections compare the core differences between the two index types and provide recommendations.
Core differences
| Dimension | HNSW series | IVF series |
|---|---|---|
| Storage method | In-memory graph-based index | Disk-based index |
| Memory usage | Requires complete loading into memory, high memory usage | Does not require resident memory, low memory usage |
| Query performance (QPS) | Very high, with millisecond-level response | Relatively high, close to HNSW when cache is sufficient |
| Recall rate | High, up to 99% | Relatively high, slightly lower than HNSW, can be optimized by parameters |
| Build speed | Slower, requires building the graph structure | Faster, based on clustering algorithms |
| Applicable data volume | Millions to billions | Millions to tens of billions |
| Cost | High memory cost | Low storage cost, suitable for large-scale data |
| Real-time capability | Supports real-time DML operations | Supports real-time DML operations |
Decision flowchart
- Before selecting an index type, you need to estimate the memory usage based on Vector index memory management.
- The recommendations in the decision flowchart are based on 1024-dimensional vectors. If the actual dimension is different, you can approximately calculate the required resources based on the proportion.
- The decision flowchart primarily considers memory cost to help you decide on the index type.
- Even if the tenant has sufficient memory, it is not always optimal to choose the highest specification index such as HNSW. If you require extremely high performance, you can consider more cost-effective options such as HNSW_SQ.

The following sections only list some common scenarios. If you cannot determine the index type through the above flowchart or have other requirements, please contact seekdb technical support.
Should I use a partitioned table?
The primary purpose of using a partitioned table is to handle large-scale data scenarios. Additionally, if the query conditions can be used as a partition key, partition pruning can enhance query performance. It is recommended to use a partitioned table in the following two scenarios:
- When the data volume reaches tens of millions or even hundreds of millions: When the data volume is very large, using a partitioned table can distribute the data across multiple partitions, each with its own independent index, thereby reducing the load of a single query and improving overall query performance.
- When the query conditions include a scalar column that can be used for partition pruning: For example, if the `label` field is always present in the `WHERE` clause, you can consider creating a partitioned table with `label` as the partition key. This way, partition pruning can reduce the number of partitions that need to be queried.
The specific recommendations are as follows:
Partitioning
When using vector indexes, the number of partitions should not be excessive. Unlike with scalar indexes, the computational overhead of a top-K query on a vector index (such as HNSW) does not increase significantly when the index grows from 1 million to 2 million vectors under the same configuration. Therefore, if partition pruning cannot be used, having too many partitions may actually reduce performance, because every partition must be searched. Conversely, a single partition that is too large not only increases the time required for index rebuilding but also affects the efficiency of joint queries with scalar conditions.
In summary, it is recommended to keep the data volume in each partition below 20 million and prioritize selecting a field that supports partition pruning as the partition key.
Algorithm Selection
For large-scale data, it is recommended to use the HNSW_BQ or IVF_PQ index. If you need to use other indexes, refer to the [Memory Usage](#Memory Usage) section for estimation.
Memory Usage
HNSW_BQ
For the HNSW_BQ index, it is recommended that tenant memory > total memory required for HNSW_BQ queries + memory required for a single partition's HNSW_SQ index. The memory estimation process is as follows:
- Use the `INDEX_VECTOR_MEMORY_ADVISOR` function to calculate the recommended memory values for HNSW_BQ and HNSW_SQ indexes during construction and querying.
- Based on the recommended memory values obtained from the above tool, calculate and determine the total memory required for the tenant.
For example, assuming 100 million 1024-dimensional vector data with 10 partitions (approximately 10 million vectors per partition), the calculation process is as follows:
```sql
-- Specify REFINE_TYPE=SQ8 to directly and accurately obtain the recommended memory values for the HNSW_BQ index during construction and querying.
-- This is because the HNSW_BQ index is constructed using the SQ8 quantization algorithm by default.
-- After specifying refine_type=sq8, the function automatically includes the memory required for SQ8 quantized vectors in the calculation.
-- The recommended memory value for the HNSW_BQ index is 74.6 GB, and the memory consumption during querying is 57.4 GB. We will use the recommended memory value.
SELECT DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR('HNSW_BQ',100000000,1024,'FLOAT32','M=32,DISTANCE=COSINE,REFINE_TYPE=SQ8', 10000000);
+------------------------------------------------------------------------------------------------------------------------------+
| DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR('HNSW_BQ',100000000,1024,'FLOAT32','M=32,DISTANCE=COSINE,REFINE_TYPE=SQ8', 10000000) |
+------------------------------------------------------------------------------------------------------------------------------+
| Suggested minimum vector memory is 74.6 GB, memory consumption when providing search service is 57.4 GB |
+------------------------------------------------------------------------------------------------------------------------------+
1 row in set

-- By default, the vector memory occupies 50% of the tenant memory. Therefore, the total tenant memory is
SELECT 74.6/0.5;
+--------------------------------------------------------------------------------------------------------------+
| 74.6/0.5 = 149.2 GB |
+--------------------------------------------------------------------------------------------------------------+
1 row in set

-- Considering that new data may not be compressed in actual environments, it is recommended to reserve some redundancy.
SELECT 149.2 * 1.2;
+--------------------------------------------------------------------------------------------------------------+
| 149.2 * 1.2 = 179.04 GB |
+--------------------------------------------------------------------------------------------------------------+
1 row in set

-- Therefore, it is recommended to configure the tenant memory to 179 GB.
```
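The same arithmetic can be wrapped in a small helper so it can be repeated for other advisor results. The following Python sketch is illustrative only: the function name and its parameters are hypothetical, and it assumes the default 50% vector memory share and the 20% redundancy used in the example above.

```python
def tenant_memory_gb(advisor_gb, vector_memory_share=0.5, redundancy=1.2):
    """Estimate tenant memory (GB) from the advisor's suggested vector memory.

    advisor_gb: "Suggested minimum vector memory" reported by the advisor.
    vector_memory_share: fraction of tenant memory usable by vector indexes
        (assumed to be the default of 50%).
    redundancy: extra headroom for data that is not yet compressed
        (assumed to be 20%, as in the example above).
    """
    base_gb = advisor_gb / vector_memory_share
    return base_gb * redundancy


# 100 million 1024-dimensional vectors in 10 partitions (advisor value: 74.6 GB)
print(round(tenant_memory_gb(74.6), 2))  # 179.04
```

If the tenant uses a different vector memory share or needs more headroom, pass those values explicitly instead of relying on the defaults.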
IVF Series
For IVF and IVF_PQ indexes, it is recommended that tenant memory > memory required for constructing a single partition + total memory required for all partitions. For example, with 100 million data records and 10 partitions (approximately 10 million vectors per partition), constructing a single partition requires about 2.7 GB of memory, and the total memory required for querying the 10 partitions' IVF_PQ indexes is approximately 1.1 GB (110 MB per partition). Therefore, a minimum of about 3 GB of memory is required. It is recommended to reserve some extra memory, and 6 GB is suggested. Other data volume scenarios can be estimated using the same method.
Construction and Query Parameters
The index construction and query parameters should be set based on the maximum data volume of a single partition. For more details, see the [Index Parameter Recommendations](#Index parameter recommendations) section.
Performance and Recall Rate
- If the query conditions can be pruned to a single partition: The performance and recall rate will be consistent with the single-partition scenario. For more details, see the section on different data volume scales.
- If the query conditions cannot be pruned to a single partition: The QPS can be estimated proportionally based on the number of partitions on a single OBServer node (e.g., if a single node has 3 partitions, the QPS will be approximately 1/3 of the single-partition performance). Since cross-partition queries merge more candidate results, the actual recall rate is usually higher than in the single-partition scenario.
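The proportional QPS estimate for queries that cannot be pruned can be sketched as follows. This is a rough Python illustration, not a benchmark: the function name is hypothetical, and the 900 QPS figure is a made-up input used only to show the calculation.

```python
def estimated_cross_partition_qps(single_partition_qps, partitions_per_node):
    """Approximate QPS when a query must scan every partition on a node."""
    if partitions_per_node < 1:
        raise ValueError("partitions_per_node must be at least 1")
    return single_partition_qps / partitions_per_node


# Hypothetical single-partition QPS of 900 with 3 partitions on one node
print(estimated_cross_partition_qps(900, 3))  # 300.0
```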
Index parameter recommendations
HNSW series
The recommended index construction and query parameters vary depending on the data volume for indexes in the same series. This section provides suggested configurations for HNSW/HNSW_SQ/HNSW_BQ indexes with 768-dimensional vectors for 1 million and 10 million data points. For vector data with billions of data points, refer to the subsequent sections for IVF_PQ indexes or HNSW_BQ indexes under partitioned tables.
For scenarios where data volume is expected to grow, it is recommended to set parameters based on the final data volume.
HNSW_BQ is a high-compression quantization algorithm. Its recall rate may be relatively low for low-dimensional vectors. To avoid performance loss, it is recommended to use HNSW_BQ for vectors with 512 dimensions or higher.
| Scenario | Index Type | Parameter Recommendation |
|---|---|---|
| Highest Recall (Maximum Memory Usage) | HNSW | For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, other parameters default |
| Highest Performance (Minimum Memory Usage) | HNSW_SQ | For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, other parameters default. For 10 million data points: m = 32, ef_construction = 400, ef_search = 350, other parameters default. |
| Best Cost-Performance (Low Memory Usage, Good Performance) | HNSW_BQ | For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, other parameters default. For 10 million data points: m = 32, ef_construction = 400, ef_search = 1000, refine_k = 10, other parameters default. For 1 billion data points: Use partitioned tables, m = 32, ef_construction = 400, ef_search = 1000, refine_k = 10, other parameters default. |
Detailed Information for 1 Million Data Points
Example: 1 million 768-dimensional vectors (using the construction parameters in the table above). Additional information on memory usage and recall rate optimization:
Memory Usage:
| Index Type | Recommended Tenant Memory | Description |
|---|---|---|
| HNSW | 15 GB | The recommended size of the vector index memory is 7.3 GB. If the tenant memory exceeds 8 GB, the vector index can use up to 50% of the tenant memory by default. If the tenant memory is 8 GB or less, the vector index can use up to 40% of the tenant memory by default. Therefore, approximately 15 GB of tenant memory is required. |
| HNSW_SQ | 6 GB | The recommended size of the vector index memory is 2.1 GB. |
| HNSW_BQ | 6 GB | During the construction of the HNSW_BQ index, high-precision vectors are required, so the memory requirements are the same as those for HNSW_SQ, which is 6 GB. The HNSW_BQ index uses the SQ8 quantization method by default. After the index is built, the memory usage decreases significantly to about 405 MB. The above information applies to non-partitioned tables. If partitioned tables are used, seekdb dynamically adjusts the number of partitions that can be built simultaneously based on the tenant's allocated memory. When configuring, it is recommended to ensure that the tenant memory includes the memory required for HNSW_BQ queries and the memory required for HNSW_SQ during the construction of a partition, with some redundancy. For specific calculation methods, refer to the Memory Usage section above. |
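The tenant memory figures in the table follow from the default memory shares described in the HNSW row (50% above 8 GB of tenant memory, 40% at 8 GB or less). The Python sketch below shows one way to derive a lower bound from a recommended vector index memory size. The function name is hypothetical, and the result is a lower bound that should be rounded up, as the table does.

```python
def min_tenant_memory_gb(vector_index_gb):
    """Lower bound on tenant memory (GB) for a given vector index memory size.

    Assumes the default shares: vector indexes may use up to 40% of tenant
    memory when the tenant has 8 GB or less, and up to 50% above 8 GB.
    """
    # Case 1: the index fits within 40% of a tenant of 8 GB or less.
    small_tenant_gb = vector_index_gb / 0.40
    if small_tenant_gb <= 8.0:
        return small_tenant_gb
    # Case 2: the tenant must exceed 8 GB, where the 50% share applies.
    # For index sizes between 3.2 GB and 4 GB the bound is the 8 GB
    # threshold itself: the tenant needs slightly more than 8 GB.
    return max(vector_index_gb / 0.50, 8.0)


for index_type, vector_gb in [("HNSW", 7.3), ("HNSW_SQ", 2.1)]:
    print(index_type, round(min_tenant_memory_gb(vector_gb), 2))
```

For the 1 million vector example this gives about 14.6 GB for HNSW and about 5.25 GB for HNSW_SQ, which round up to the 15 GB and 6 GB recommended in the table.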
Recall Rate Optimization:
By adjusting the ef_search and refine_k (only for HNSW_BQ) parameters, you can increase the number of vector calculations to improve the recall rate, but this will reduce query performance. For different TopN values, you can set the parameters to the recommended values in the table below. If you need to further improve the recall rate, you can set the parameter values to higher values.
The recall rate is directly related to the data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where the recall rate reaches approximately 0.95.
| TopN | ef_search | refine_k (only for HNSW_BQ) |
|---|---|---|
| Top10 | 64 | 4 |
| Top100 | 240 | 4 |
Maximum Recall Rate:
The maximum recall rates vary among different index algorithms. Under the recommended construction parameters in this section, setting ef_search to 1000 may improve the query recall rate, but the QPS will drop to one-third of the value at a 0.95 recall rate. Increasing ef_search does not raise the recall rate indefinitely: each index type has its own ceiling, and only the HNSW index can achieve a recall rate of 0.99 or higher. The HNSW_BQ index can further increase the recall rate by increasing refine_k, but this will also lead to a performance decrease.
Maximum Recall Rate (ef_search = 1000):
- HNSW Index: Recall Rate 0.991
- HNSW_SQ Index: Recall Rate 0.9786
- HNSW_BQ Index: Recall Rate 0.9897 (ef_search = 1000, refine_k = 10)
Detailed Information for 10 Million Data Points
Example: 10 million 768-dimensional vectors (using the construction parameters in the table above). Additional information on memory usage and recall rate optimization:
Memory Usage:
| Index Type | Recommended Tenant Memory | Description |
|---|---|---|
| HNSW | 160 GB | The recommended size of the vector index memory is 76.3 GB. |
| HNSW_SQ | 48 GB | The recommended size of the vector index memory is 22.6 GB. |
| HNSW_BQ | 48 GB | During the construction of the HNSW_BQ index, high-precision vectors are required. The HNSW_BQ index uses HNSW_SQ as the cache during index construction by default. Therefore, for non-partitioned tables, the memory required for HNSW_BQ is the same as that for HNSW_SQ. After construction, the HNSW_BQ index only occupies about 5.4 GB of memory. |
Recall Rate Optimization:
By adjusting the ef_search and refine_k (only for HNSW_BQ) parameters, you can increase the number of vector calculations to improve the recall rate, but this will reduce query performance. For different TopN values, you can set the parameters to the recommended values in the table below. If you need to further improve the recall rate, you can set the parameter values to higher values.
The recall rate is directly related to the data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where the recall rate reaches approximately 0.95.
| TopN | ef_search | refine_k (only for HNSW_BQ) |
|---|---|---|
| Top10 | 100 | 4 |
| Top100 (HNSW/HNSW_SQ) | 350 | - |
| Top100 (HNSW_BQ) | 1000 | 10 |
IVF series
| Scenario | Index Type | Parameter Recommendations |
|---|---|---|
| Low-dimensional (384 dimensions or fewer) | IVF | For 10 million data points: Use a partitioned table with nlist=3000. For 100 million data points: Use a partitioned table with nlist=3000. |
| Low-cost (Minimal memory usage) | IVF_PQ | For 1 million data points: nlist=1000, m=vector dimension/2. For 10 million data points: nlist=3000, m=vector dimension/2. For 100 million data points: Use a partitioned table with nlist=3000, m=vector dimension/2. |
To balance the number of cluster centers and the amount of data per center, we recommend setting nlist to the square root of the data volume. For example, for 10 million data points, we recommend setting nlist to around 3000. When using IVF_PQ, we recommend setting m to half of the vector dimension (dim).
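As a quick illustration of this rule of thumb, the following Python sketch derives starting values for nlist and m. The function name and its parameters are hypothetical. For partitioned tables it assumes that nlist is computed from the average number of rows per partition, in line with the partitioned-table guidance in this topic.

```python
import math


def suggest_ivf_params(total_rows, dim, partitions=1, use_pq=True):
    """Suggest starting IVF parameters from data volume and dimension."""
    rows_per_partition = total_rows / partitions
    # nlist: roughly the square root of the rows indexed by one IVF index
    params = {"nlist": round(math.sqrt(rows_per_partition))}
    if use_pq:
        # m: half of the vector dimension for IVF_PQ
        params["m"] = dim // 2
    return params


print(suggest_ivf_params(10_000_000, 768))  # {'nlist': 3162, 'm': 384}
print(suggest_ivf_params(1_000_000, 768))   # {'nlist': 1000, 'm': 384}
```

The table above rounds the 10 million result to nlist=3000. Treat the computed value as a starting point rather than an exact requirement.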
IVF_PQ is a high-compression quantization algorithm. Its recall rate may be relatively low for low-dimensional vectors. To avoid performance loss, we recommend using IVF_PQ for vectors with 128 dimensions or more.
For scenarios where the data volume is expected to grow, we recommend setting parameters based on the final data volume.
10 million data points (with partitioned tables)
For 10 million 768-dimensional vector data points (using the parameters in the table above), we provide additional details on memory usage and recall rate optimization:
Memory Usage:
| Index Type | Index Parameters | Memory Usage (Build Time / Resident Time) |
|---|---|---|
| IVF | distance=l2, nlist=3000 | 2.7 GB / 10.5 MB |
| IVF_PQ | distance=l2, nlist=3000, m=384 | 4.0 GB / 1.3 GB |
| IVF_PQ | distance=cosine, nlist=3000, m=384 | 2.7 GB / 11.4 MB |
In the table, Build Time indicates the temporary memory occupied during index creation, which is released after the index is built. Resident Time refers to the memory continuously occupied by the IVF vector index after the index is built.
For IVF_PQ indexes, if you choose distance=l2, the resident memory will be higher due to the need to store additional precomputed results. In contrast, using distance=inner_product or cosine consumes less resident memory. Therefore, in practical applications, we recommend prioritizing inner_product or cosine as the distance type to optimize memory resources.
Recall Rate Optimization:
By adjusting the nprobes parameter, you can increase the recall rate through more vector calculations, but this will decrease query performance. For different TopN values, you can set the parameter to the recommended values in the table below. If you need to further improve the recall rate, you can set the parameter to a larger value.
Recall rate is directly related to data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where the recall rate reaches around 0.95.
| TopN | nprobes |
|---|---|
| Top10 | 1 |
| Top100 | 20 |
100 million data points (with partitioned tables)
When the vector data volume reaches 100 million or more, we strongly recommend using partitioned tables in combination with IVF-based indexes. As the data scale and nlist parameter increase, the query overhead of a single IVF index will significantly increase. By splitting the data into multiple partitions and building smaller-scale IVF indexes for each partition, we can effectively reduce the query load and further improve overall query performance and recall rate through parallel queries across partitions.
In a multi-partitioned table scenario, since IVF indexes are local, each partition will independently build an IVF index. Therefore, we recommend calculating the nlist value based on the average data volume of each partition. For example, for 100 million 768-dimensional vectors split into 10 partitions, each partition contains approximately 10 million vectors, and nlist is recommended to be set to sqrt(10 million) = 3162.
For 100 million 768-dimensional vector data points (using the parameters in the table above), we provide additional details on memory usage and recall rate optimization:
Memory Usage:
In a multi-partitioned scenario, since IVF indexes are local, each partition will independently build and maintain an IVF index. Therefore, the total resident memory usage needs to be summed up based on the number of partitions. For example, if a single IVF index has a resident memory usage of 10.5 MB, and there are 10 partitions, the total resident memory usage would be approximately 10.5 × 10 = 105 MB.
| Index Type | Index Parameters | Memory Usage (Build Time / Resident Time) |
|---|---|---|
| IVF | distance=l2, nlist=3000 | 2.7 GB / 10.5 * 10 MB |
| IVF_PQ | distance=l2, nlist=3000, m=384 | 4.0 GB / 1.3 * 10 GB |
| IVF_PQ | distance=cosine, nlist=3000, m=384 | 2.7 GB / 11.4 * 10 MB |
In the table, Build Time indicates the temporary memory occupied during index creation, which is released after the index is built. Resident Time refers to the memory continuously occupied by the IVF vector index after the index is built.
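Combining the table with the sizing rule from the IVF Series memory section (tenant memory greater than the build memory of one partition plus the resident memory of all partitions), a lower bound can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that 1 GB equals 1024 MB and that only one partition is built at a time.

```python
def ivf_memory_lower_bound_gb(build_gb, resident_mb_per_partition, partitions):
    """Lower bound (GB): one partition's build memory plus all resident memory."""
    total_resident_gb = resident_mb_per_partition * partitions / 1024
    return build_gb + total_resident_gb


# IVF, distance=l2: 2.7 GB build, 10.5 MB resident per partition, 10 partitions
print(round(ivf_memory_lower_bound_gb(2.7, 10.5, 10), 2))       # 2.8
# IVF_PQ, distance=cosine: 2.7 GB build, 11.4 MB resident per partition
print(round(ivf_memory_lower_bound_gb(2.7, 11.4, 10), 2))       # 2.81
# IVF_PQ, distance=l2: 4.0 GB build, 1.3 GB resident per partition
print(round(ivf_memory_lower_bound_gb(4.0, 1.3 * 1024, 10), 2)) # 17.0
```

These values are lower bounds only. Reserve additional memory on top of them, as recommended for the IVF series earlier in this topic.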
Recall Rate Optimization:
By adjusting the nprobes parameter, you can increase the recall rate through more vector calculations, but this will decrease query performance. For different TopN values, you can set the parameter to the recommended values in the table below. If you need to further improve the recall rate, you can set the parameter to a larger value.
In a partitioned table scenario, since each partition will independently execute IVF index queries, when you query across multiple partitions, the system will separately retrieve the TopN results from each partition and then aggregate and re-sort all the results. This not only improves the overall search accuracy but also typically results in a higher actual recall rate compared to a single-partition scenario. Therefore, in a multi-partitioned table, you can appropriately reduce the nprobes parameter to achieve a recall rate comparable to that of a single-partitioned table.
Recall rate is directly related to data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where the recall rate reaches around 0.95.
| TopN | nprobes |
|---|---|
| Top10 | 1 |
| Top100 | 10 |
References
- For more information about how to create an index, see Create an index.