seekdb VectorDBBench test
VectorDBBench is a tool that provides benchmark results for major vector databases and cloud services. This topic describes how to use VectorDBBench to test the performance of seekdb. VectorDBBench is designed for ease of use, so you can reproduce test results or test new systems.
Prerequisites
-
A server-mode seekdb instance. For more information, see Deploy seekdb by using yum.
-
The deployed seekdb instance is configured with three disks for logs, clogs, and data, respectively, with the performance level set to PL1.
port=2881
base-dir=/data/1/seekdb
data-dir=/data/2/seekdb
redo-dir=/data/3/seekdb
Tip
If the tested seekdb instance has 1 CPU core and 2 GB of memory, set the memory_limit parameter to 2G.
-
Python 3.11 or later. We recommend that you use Conda to install Python.
# Download and install Conda
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
# Open a new terminal window and run the following commands to initialize Conda:
source ~/miniconda3/bin/activate
conda init --all
# Create and activate a Python environment for VectorDBBench
conda create -n vdb python=3.11
conda activate vdb
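Before you continue, you may want to confirm that the active interpreter meets the Python 3.11 requirement. The following is a minimal sketch, not part of VectorDBBench: the helper name version_at_least is hypothetical, and the script assumes that the python command on your PATH is the one provided by the vdb environment.

```shell
# Hypothetical helper: return 0 if version $1 (for example "3.12.1")
# is at least the major.minor version $2 (for example "3.11").
version_at_least() {
  have_major=${1%%.*}
  have_rest=${1#*.}
  have_minor=${have_rest%%.*}
  want_major=${2%%.*}
  want_minor=${2#*.}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# "python --version" prints a line such as "Python 3.11.9"; keep the second field.
current=$(python --version 2>/dev/null | awk '{ print $2 }' || true)

if version_at_least "${current:-0.0}" 3.11; then
  echo "Python $current meets the requirement"
else
  echo "Python 3.11 or later is required (found: ${current:-none})" >&2
fi
```

If the check fails, run conda activate vdb again in the current terminal and repeat the check.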
Test scenario
-
Two servers are required for the test. Deploy VectorDBBench on one server and seekdb on the other. The seekdb instance must have either 4 CPU cores and 8 GB of memory (4C8G) or 1 CPU core and 2 GB of memory (1C2G), and must be configured with three disks for logs, clogs, and data, respectively, with the performance level set to PL1.
-
The test uses the Performance1536D50K dataset and the HNSW index type. Use VectorDBBench to measure the QPS, p99 latency, and recall of seekdb.
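VectorDBBench calculates these metrics for you. To make the two less familiar ones concrete, the following sketch shows one common way to define them. It is an illustration only and is not how VectorDBBench is confirmed to compute them: the function names p99 and recall are hypothetical, the percentile uses the nearest-rank method, and recall is taken as the fraction of ground-truth neighbor IDs that appear in the returned IDs.

```shell
# p99 latency by the nearest-rank method: the value at position
# ceil(0.99 * n) in the sorted list. Reads one latency per line from stdin.
p99() {
  sort -n | awk '{ v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      # Integer form of ceil(99 * NR / 100), which avoids floating-point rounding.
      rank = int((99 * NR + 99) / 100)
      print v[rank]
    }'
}

# Recall: fraction of ground-truth IDs that the search returned.
# $1 = file of ground-truth IDs, $2 = file of returned IDs (one ID per line).
recall() {
  awk 'NR == FNR { truth[$1] = 1; n++; next }
       ($1 in truth) { hit++ }
       END { printf "%.2f\n", hit / n }' "$1" "$2"
}

# Five sample latencies in milliseconds: the slowest one is the p99.
printf '12\n15\n11\n90\n14\n' | p99   # prints 90
```

With so few samples, the p99 equals the maximum. In a real run with many queries, it is the latency that 99% of the queries stay at or below.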
Procedure
Clone the code of VectorDBBench
Notice
We recommend that you deploy VectorDBBench and seekdb on different servers to avoid CPU resource competition and ensure the reliability of the test results.
Clone the code of the test tool, VectorDBBench, to the local server.
git clone https://github.com/zilliztech/VectorDBBench.git
Install dependencies
Run the following command to install dependencies in the VectorDBBench directory:
cd VectorDBBench
pip install .
Run the test
Run VectorDBBench. This example uses the HNSW index.
HNSW index
# Replace $host and $port with the actual connection information of seekdb. This example connects as the root user to the test database.
vectordbbench oceanbasehnsw --host $host --port $port --user root --database test --m 16 --ef-construction 200 --case-type Performance1536D50K --index-type HNSW --ef-search 180 --k 100
You can run the following command to view the parameter description.
vectordbbench oceanbasehnsw --help
Here are some common parameters:
-
--num-concurrency: specifies the number of concurrent requests. VectorDBBench executes vector queries in parallel based on the specified concurrency level and selects the maximum QPS (queries per second) as the result.
-
--skip-drop-old/--skip-load: skip the steps of deleting data and loading data. When you add these options to the command, the command only executes vector queries, without deleting the data or reloading data.
-
--k: specifies the number of nearest neighbor results to return.
-
--ef-search: specifies the size of the candidate set in HNSW queries.
-
--index-type: specifies the index type. Valid values: HNSW, HNSW_SQ, and HNSW_BQ.
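The parameters above can be combined to rerun only the query phase with different settings, for example to see how --ef-search affects QPS and recall. The following is a minimal sketch that assumes a full run has already loaded the data and built the index. The function name run_query_only, the variables HOST, PORT, and DRY_RUN, the placeholder address 192.0.2.10, and the chosen ef-search values are all hypothetical. Only flags described in this topic are used.

```shell
# Hypothetical defaults; set HOST and PORT to your seekdb connection information.
HOST=${HOST:-192.0.2.10}
PORT=${PORT:-2881}
# With DRY_RUN=1 the commands are printed only. Set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}

# Run (or print) a query-only benchmark for one --ef-search value.
run_query_only() {
  ef_search=$1
  set -- vectordbbench oceanbasehnsw \
    --host "$HOST" --port "$PORT" --user root --database test \
    --m 16 --ef-construction 200 \
    --case-type Performance1536D50K --index-type HNSW \
    --ef-search "$ef_search" --k 100 \
    --skip-drop-old --skip-load
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example values, each at least as large as --k.
for ef in 100 180 300; do
  run_query_only "$ef"
done
```

Review the printed commands first, and then run the script again with DRY_RUN=0 to execute them against the seekdb instance.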
Results
For the test results, see seekdb VectorDBBench performance test report.