Compatibility
This topic describes the data model mappings, SDK interface compatibility, and concept mappings between seekdb's vector search feature and Milvus.
Concept mappings
To help users familiar with Milvus quickly get started with seekdb's vector storage capabilities, we analyze the similarities and differences between the two systems and provide a mapping of related concepts.
Data models
| Data model layer | Milvus | seekdb | Description |
|---|---|---|---|
| First layer | Shards | Partition | Milvus specifies partition rules by setting some columns as partition_key in the schema definition. seekdb supports range/range columns, list/list columns, hash, key, and subpartitioning strategies. |
| Second layer | Partitions | ≈Tablet | Milvus enhances read performance by chunking the same shard (shards are usually partitioned by primary key) based on other columns. seekdb implements this by sorting keys within a partition. |
| Third layer | Segments | MemTable+SSTable | Both have a minor compaction mechanism. |
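To make the second-layer row more concrete, the following is a conceptual sketch only, not seekdb's storage code. It routes a row to a partition and keeps keys sorted inside that partition, so a range read becomes a binary search. The partition count, the modulo routing rule, and all names here are assumptions chosen for illustration.

```python
import bisect
from typing import Dict, List

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_of(key: int) -> int:
    """Hash-style routing; a plain modulo stands in for a real hash function."""
    return key % NUM_PARTITIONS


# partition id -> primary keys kept in sorted order
partitions: Dict[int, List[int]] = {p: [] for p in range(NUM_PARTITIONS)}


def insert_key(key: int) -> None:
    """Insert a key into its partition while preserving sort order."""
    bisect.insort(partitions[partition_of(key)], key)


def range_scan(partition_id: int, low: int, high: int) -> List[int]:
    """Return keys in [low, high] from one partition using binary search."""
    keys = partitions[partition_id]
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return keys[start:end]


for k in [17, 5, 9, 1, 13, 8, 21]:
    insert_key(k)
```

Because the keys inside a partition are ordered, `range_scan` touches only the slice it needs instead of every key in the partition.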
SDKs
This section introduces the conceptual differences between seekdb's vector storage SDK (pyobvector) and Milvus's SDK (pymilvus).
pyobvector supports two usage modes:
-
pymilvus MilvusClient lightweight compatible mode: This mode is compatible with common interfaces of Milvus clients. Users familiar with Milvus can easily use this mode without concept mapping.
-
SQLAlchemy extension mode: This mode can be used as a vector feature extension of Python SQLAlchemy, retaining the operation mode of a relational database. Concept mapping is required.
For more information about pyobvector's APIs, see pyobvector Python SDK API reference.
The following table describes the concept mappings between pyobvector's SQLAlchemy extension mode and pymilvus:
| pymilvus | pyobvector | Description |
|---|---|---|
| Database | Database | Database |
| Collection | Table | Table |
| Field | Column | Column |
| Primary Key | Primary Key | Primary key |
| Vector Field | Vector Column | Vector column |
| Index | Index | Index |
| Partition | Partition | Partition |
| DataType | DataType | Data type |
| Metric Type | Distance Function | Distance function |
| Search | Query | Query |
| Insert | Insert | Insert |
| Delete | Delete | Delete |
| Update | Update | Update |
| Batch | Batch | Batch operations |
| Transaction | Transaction | Transaction |
| NONE | Not supported | NULL value |
| BOOL | Boolean | Corresponds to the MySQL TINYINT type |
| INT8 | Boolean | Corresponds to the MySQL TINYINT type |
| INT16 | SmallInteger | Corresponds to the MySQL SMALLINT type |
| INT32 | Integer | Corresponds to the MySQL INT type |
| INT64 | BigInteger | Corresponds to the MySQL BIGINT type |
| FLOAT | Float | Corresponds to the MySQL FLOAT type |
| DOUBLE | Double | Corresponds to the MySQL DOUBLE type |
| STRING | LONGTEXT | Corresponds to the MySQL LONGTEXT type |
| VARCHAR | STRING | Corresponds to the MySQL VARCHAR type |
| JSON | JSON | For differences and similarities in JSON operations, see pyobvector Python SDK API reference. |
| FLOAT_VECTOR | VECTOR | Vector type |
| BINARY_VECTOR | Not supported | |
| FLOAT16_VECTOR | Not supported | |
| BFLOAT16_VECTOR | Not supported | |
| SPARSE_FLOAT_VECTOR | Not supported | |
| dynamic_field | Not needed | The hidden $meta metadata column in Milvus. In seekdb, you can explicitly create a JSON-type column. |
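The data type rows of the table can be read as a lookup table. The following minimal sketch only restates those rows as a Python dictionary. `MILVUS_TO_PYOBVECTOR` and `map_type` are hypothetical helper names, not part of pyobvector.

```python
from typing import Optional

# pymilvus DataType name -> pyobvector type name, copied from the table above.
# None marks types that the table lists as "Not supported".
MILVUS_TO_PYOBVECTOR = {
    "NONE": None,
    "BOOL": "Boolean",
    "INT8": "Boolean",
    "INT16": "SmallInteger",
    "INT32": "Integer",
    "INT64": "BigInteger",
    "FLOAT": "Float",
    "DOUBLE": "Double",
    "STRING": "LONGTEXT",
    "VARCHAR": "STRING",
    "JSON": "JSON",
    "FLOAT_VECTOR": "VECTOR",
    "BINARY_VECTOR": None,
    "FLOAT16_VECTOR": None,
    "BFLOAT16_VECTOR": None,
    "SPARSE_FLOAT_VECTOR": None,
}


def map_type(milvus_type: str) -> Optional[str]:
    """Return the mapped type name, or None if the table marks it unsupported."""
    if milvus_type not in MILVUS_TO_PYOBVECTOR:
        raise KeyError(f"type not listed in the mapping table: {milvus_type}")
    return MILVUS_TO_PYOBVECTOR[milvus_type]
```

A migration script could call `map_type` on each field of an existing Milvus schema and stop early when it returns `None`.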
Compatibility with Milvus
Milvus SDK
All operations listed in the following tables are supported. Among them, load_collection(), release_collection(), and close() are supported through SQLAlchemy.
Collection operations
| Interface | Description |
|---|---|
| create_collection() | Creates a vector table based on the given schema. |
| get_collection_stats() | Queries table statistics, such as the number of rows. |
| describe_collection() | Provides detailed metadata of a vector table. |
| has_collection() | Checks whether a table exists. |
| list_collections() | Lists existing tables. |
| drop_collection() | Drops a table. |
Field and schema definition
| Interface | Description |
|---|---|
| create_schema() | Creates a schema in memory and adds column definitions. |
| add_field() | The call sequence is create_schema -> add_field -> ... -> add_field. You can also manually build a FieldSchema list and then use the CollectionSchema constructor to create a schema. |
Vector indexes
| Interface | Description |
|---|---|
| list_indexes() | Lists all indexes. |
| create_index() | Supports creating multiple vector indexes in a single call. First, use prepare_index_params to initialize an index parameter list object, call add_index multiple times to set multiple index parameters, and finally call create_index to create the indexes. |
| drop_index() | Drops a vector index. |
| describe_index() | Gets the metadata (schema) of an index. |
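The create_index() call sequence described in the table can be sketched as follows. `RecordingClient` and `RecordingIndexParams` are stand-ins written for this example so that it runs without a database. The column names, index names, and the `"HNSW"` and `"L2"` values are assumptions, so check the pyobvector API reference for the values your version accepts.

```python
class RecordingIndexParams:
    """Stand-in for the index parameter list object; it only records what was added."""

    def __init__(self):
        self.entries = []

    def add_index(self, field_name, index_type, index_name, **kwargs):
        self.entries.append(
            {"field_name": field_name, "index_type": index_type,
             "index_name": index_name, **kwargs}
        )


class RecordingClient:
    """Stand-in for a MilvusClient-compatible client (hypothetical, for illustration)."""

    def __init__(self):
        self.created = []

    def prepare_index_params(self):
        return RecordingIndexParams()

    def create_index(self, collection_name, index_params, **kwargs):
        self.created.append((collection_name, index_params.entries))


def create_two_indexes(client, collection_name):
    # Step 1: initialize the index parameter list object.
    index_params = client.prepare_index_params()
    # Step 2: call add_index once per vector index.
    index_params.add_index(field_name="title_vec", index_type="HNSW",
                           index_name="idx_title", metric_type="L2")
    index_params.add_index(field_name="body_vec", index_type="HNSW",
                           index_name="idx_body", metric_type="L2")
    # Step 3: a single create_index call receives both index definitions.
    client.create_index(collection_name=collection_name, index_params=index_params)


client = RecordingClient()
create_two_indexes(client, "docs")
```

With a real client, only `create_two_indexes` would be needed; the recording classes exist solely to make the call order visible.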
Vector search and data operations
| Interface | Description |
|---|---|
| search() | ANN query interface. |
| query() | Point query with filter, namely SELECT ... WHERE ids IN (..., ...) AND <filters>. |
| get() | Point query without filter, namely SELECT ... WHERE ids IN (..., ...). |
| delete() | Deletes a group of vectors, DELETE FROM ... WHERE ids IN (..., ...). |
| insert() | Inserts a group of vectors. |
| upsert() | Insert with update on primary key conflict. |
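The SQL shapes named in the table can be tried locally. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so the statements can run; it is not how pyobvector is implemented. The table name `items` and its columns are hypothetical, and the upsert uses SQLite's `ON CONFLICT` syntax, whereas a MySQL-compatible database would typically use `ON DUPLICATE KEY UPDATE`.

```python
import sqlite3

# In-memory SQLite database used only as a local stand-in.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, tag TEXT, embedding TEXT)")
conn.executemany(
    "INSERT INTO items (id, tag, embedding) VALUES (?, ?, ?)",
    [(1, "a", "[0.1,0.2]"), (2, "b", "[0.3,0.4]"), (3, "a", "[0.5,0.6]")],
)


def in_clause(ids):
    """Build a parameterized '(?, ?, ...)' fragment for a list of ids."""
    return "(" + ", ".join("?" for _ in ids) + ")"


def get(ids):
    # get(): point query without filter -> SELECT ... WHERE id IN (...)
    sql = f"SELECT id, tag FROM items WHERE id IN {in_clause(ids)} ORDER BY id"
    return conn.execute(sql, list(ids)).fetchall()


def query(ids, tag):
    # query(): point query with filter -> SELECT ... WHERE id IN (...) AND <filters>
    sql = (f"SELECT id, tag FROM items WHERE id IN {in_clause(ids)} "
           "AND tag = ? ORDER BY id")
    return conn.execute(sql, [*ids, tag]).fetchall()


def delete(ids):
    # delete(): DELETE FROM ... WHERE id IN (...)
    conn.execute(f"DELETE FROM items WHERE id IN {in_clause(ids)}", list(ids))


def upsert(row_id, tag, embedding):
    # upsert(): insert, or update the existing row on primary key conflict.
    conn.execute(
        "INSERT INTO items (id, tag, embedding) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET tag = excluded.tag, embedding = excluded.embedding",
        (row_id, tag, embedding),
    )
```

The ids are always bound as parameters rather than formatted into the statement, which keeps the `IN (...)` list safe for arbitrary input.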
Collection metadata synchronization
| Interface | Description |
|---|---|
| load_collection() | Loads the table structure from the database to the Python application memory, enabling the application to operate the database table in an object-oriented manner. This is a standard feature of an object-relational mapping (ORM) framework. |
| release_collection() | Releases the loaded table structure from the Python application memory and releases related resources. This is a standard feature of an ORM framework for memory management. |
| close() | Closes the database connection and releases related resources. This is a standard feature of an ORM framework. |
pymilvus
Data model
The data model of Milvus comprises three levels: Shards->Partitions->Segments. Compatibility with seekdb is described as follows:
-
Shards correspond to seekdb's Partition concept.
-
Partitions currently have no directly corresponding concept in seekdb. Milvus allows you to split a shard into blocks by other columns to improve read performance (shards are usually partitioned by primary key). seekdb achieves a similar effect by sorting by primary key within a partition.
Milvus Lite API compatibility
Collection operations
-
Milvus create_collection():
create_collection(
collection_name: str,
dimension: int,
primary_field_name: str = "id",
id_type: str = DataType,
vector_field_name: str = "vector",
metric_type: str = "COSINE",
auto_id: bool = False,
timeout: Optional[float] = None,
schema: Optional[CollectionSchema] = None, # Used for custom setup
index_params: Optional[IndexParams] = None, # Used for custom setup
**kwargs,
) -> None
seekdb compatibility is described as follows:
-
collection_name: compatible, corresponds to table_name.
-
dimension: compatible, vector(dim).
-
primary_field_name: compatible, the primary key column name.
-
id_type: compatible, the primary key column type.
-
vector_field_name: compatible, the vector column name.
-
auto_id: compatible, auto increment.
-
timeout: compatible, seekdb supports it through a hint.
-
schema: compatible.
-
index_params: compatible.
-
-
Milvus get_collection_stats():
get_collection_stats(
collection_name: str,
timeout: Optional[float] = None
) -> Dict
seekdb compatibility is described as follows:
-
API is compatible.
-
Return value is compatible:
{ 'row_count': ... }.
-
-
Milvus has_collection():
has_collection(
collection_name: str,
timeout: Optional[float] = None
) -> bool
seekdb is compatible with Milvus has_collection().
-
Milvus drop_collection():
drop_collection(collection_name: str) -> None
seekdb is compatible with Milvus drop_collection().
-
Milvus rename_collection():
rename_collection(
old_name: str,
new_name: str,
timeout: Optional[float] = None
) -> None
seekdb is compatible with Milvus rename_collection().
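The collection operations above map naturally onto table-level SQL. The following is a rough sketch using Python's built-in sqlite3 module as a local stand-in; it is not pyobvector's implementation. The catalog lookup in `has_collection` is SQLite-specific, and the helper `_check_name` is a hypothetical safeguard added for this example.

```python
import sqlite3


def _check_name(name: str) -> str:
    # Table names cannot be bound as SQL parameters, so validate before interpolating.
    if not name.isidentifier():
        raise ValueError(f"unsafe table name: {name!r}")
    return name


def has_collection(conn, name: str) -> bool:
    # SQLite-specific catalog lookup; other databases expose their own catalog.
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def get_collection_stats(conn, name: str) -> dict:
    # Same return shape as described above: {'row_count': ...}
    (count,) = conn.execute(f"SELECT COUNT(*) FROM {_check_name(name)}").fetchone()
    return {"row_count": count}


def rename_collection(conn, old_name: str, new_name: str) -> None:
    conn.execute(
        f"ALTER TABLE {_check_name(old_name)} RENAME TO {_check_name(new_name)}"
    )


def drop_collection(conn, name: str) -> None:
    conn.execute(f"DROP TABLE IF EXISTS {_check_name(name)}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, embedding TEXT)")
conn.executemany(
    "INSERT INTO docs (id, embedding) VALUES (?, ?)", [(1, "[0.1]"), (2, "[0.2]")]
)
```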
Schema-related
-
Milvus create_schema():
create_schema(
auto_id: bool,
enable_dynamic_field: bool,
primary_field: str,
partition_key_field: str,
) -> CollectionSchema
seekdb compatibility is described as follows:
-
auto_id: whether the primary key column is auto-increment, compatible.
-
primary_field & partition_key_field: compatible.
-
-
Milvus add_field():
add_field(
field_name: str,
datatype: DataType,
is_primary: bool,
max_length: int,
element_type: str,
max_capacity: int,
dim: int,
is_partition_key: bool,
)
seekdb is compatible with Milvus add_field().
Insert/Search-related
-
Milvus search():
search(
collection_name: str,
data: Union[List[list], list],
filter: str = "",
limit: int = 10,
output_fields: Optional[List[str]] = None,
search_params: Optional[dict] = None,
timeout: Optional[float] = None,
partition_names: Optional[List[str]] = None,
**kwargs,
) -> List[dict]
seekdb compatibility is described as follows:
-
filter: a string expression that is generally similar to a SQL WHERE expression. For usage examples, see Milvus Filtering Explained.
-
search_params:
-
metric_type: compatible.
-
radius & range_filter: range search (RNN) parameters, currently not supported.
-
group_by_field: groups ANN results, currently not supported.
-
max_empty_result_buckets: used for IVF series indexes, currently not supported.
-
ignore_growing: skips incremental data and directly reads baseline index, currently not supported.
-
-
partition_names: partition read, supported.
-
kwargs:
-
offset: the number of records to skip in search results, currently not supported.
-
round_decimal: rounds results to specified decimal places, currently not supported.
-
-
-
Milvus get():
get(
collection_name: str,
ids: Union[list, str, int],
output_fields: Optional[List[str]] = None,
timeout: Optional[float] = None,
partition_names: Optional[List[str]] = None,
**kwargs,
) -> List[dict]
seekdb is compatible with Milvus get().
-
Milvus delete():
delete(
collection_name: str,
ids: Optional[Union[list, str, int]] = None,
timeout: Optional[float] = None,
filter: Optional[str] = "",
partition_name: Optional[str] = "",
**kwargs,
) -> dict
seekdb is compatible with Milvus delete().
-
Milvus insert():
insert(
collection_name: str,
data: Union[Dict, List[Dict]],
timeout: Optional[float] = None,
partition_name: Optional[str] = "",
) -> List[Union[str, int]]
seekdb is compatible with Milvus insert().
-
Milvus upsert():
upsert(
collection_name: str,
data: Union[Dict, List[Dict]],
timeout: Optional[float] = None,
partition_name: Optional[str] = "",
) -> List[Union[str, int]]
seekdb is compatible with Milvus upsert().
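Because the `offset` and `round_decimal` keyword arguments of search() are listed as not supported, an application that relies on them could emulate both on the client after the results arrive. The sketch below is one possible workaround, not a feature of pyobvector. It assumes each hit is a dict with a numeric `"distance"` key, and `page_and_round` is a hypothetical helper name.

```python
from typing import Dict, List


def page_and_round(
    hits: List[Dict],
    offset: int = 0,
    limit: int = 10,
    round_decimal: int = -1,
) -> List[Dict]:
    """Emulate offset and round_decimal on an already-fetched result list.

    The caller is expected to have requested at least offset + limit rows
    from search(), so that enough rows remain after skipping.
    A negative round_decimal means "do not round".
    """
    page = hits[offset: offset + limit]
    if round_decimal < 0:
        return [dict(h) for h in page]
    return [{**h, "distance": round(h["distance"], round_decimal)} for h in page]


# Sample hits in the assumed shape, ordered by ascending distance.
sample_hits = [
    {"id": 1, "distance": 0.111},
    {"id": 2, "distance": 0.222},
    {"id": 3, "distance": 0.333},
    {"id": 4, "distance": 0.444},
]
second_page = page_and_round(sample_hits, offset=1, limit=2, round_decimal=1)
```

The helper copies each hit before changing it, so the original result list stays untouched.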
Index-related
-
Milvus create_index():
create_index(
collection_name: str,
index_params: IndexParams,
timeout: Optional[float] = None,
**kwargs,
)
seekdb is compatible with Milvus create_index().
-
Milvus drop_index():
drop_index(
collection_name: str,
index_name: str,
timeout: Optional[float] = None,
**kwargs,
)
seekdb is compatible with Milvus drop_index().
Compatibility with MySQL protocol
-
Request initiation: All APIs are implemented as ordinary SQL queries, so there are no compatibility issues.
-
Response result set processing: Only the handling of the new vector data elements needs to be considered. Currently, parsing of string and bytes elements is supported. Even if the transmission mode of vector data elements changes in the future, compatibility can be achieved by updating the SDK.
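As an illustration of what parsing string and bytes vector elements might look like on the client, here is a minimal sketch. The text form is assumed to be a JSON-style array such as `[1.0, 2.5]`, and the binary form is assumed to be packed little-endian float32 values. Both layouts are assumptions made for this example rather than a statement of the actual wire format, and `parse_vector` is a hypothetical helper.

```python
import json
import struct
from typing import List, Union


def parse_vector(value: Union[str, bytes, bytearray]) -> List[float]:
    """Decode a vector element received as text or as raw bytes."""
    if isinstance(value, (bytes, bytearray)):
        # Assumed binary layout: consecutive little-endian float32 values.
        if len(value) % 4 != 0:
            raise ValueError("byte length is not a multiple of 4")
        count = len(value) // 4
        return list(struct.unpack(f"<{count}f", bytes(value)))
    # Assumed text layout: a JSON-style array of numbers.
    return [float(x) for x in json.loads(value)]
```

Keeping the decoding in one function means that a later change in how vectors are transmitted would only require updating this single place in the SDK.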