Compatibility

This topic describes the data model mappings, SDK interface compatibility, and concept mappings between seekdb's vector search feature and Milvus.

Concept mappings

To help users familiar with Milvus quickly get started with seekdb's vector storage capabilities, we analyze the similarities and differences between the two systems and provide a mapping of related concepts.

Data models

| Data model layer | Milvus | seekdb | Description |
|---|---|---|---|
| First layer | Shards | Partition | Milvus specifies partition rules by setting some columns as partition_key in the schema definition. seekdb supports range/range columns, list/list columns, hash, key, and subpartitioning strategies. |
| Second layer | Partitions | ≈Tablet | Milvus enhances read performance by chunking the same shard (shards are usually partitioned by primary key) based on other columns. seekdb implements this by sorting keys within a partition. |
| Third layer | Segments | MemTable+SSTable | Both have a minor compaction mechanism. |

SDKs

This section introduces the conceptual differences between seekdb's vector storage SDK (pyobvector) and Milvus's SDK (pymilvus).

pyobvector supports two usage modes:

  1. pymilvus MilvusClient lightweight compatible mode: This mode is compatible with common interfaces of Milvus clients. Users familiar with Milvus can easily use this mode without concept mapping.

  2. SQLAlchemy extension mode: This mode works as a vector feature extension of Python SQLAlchemy and retains the operation model of a relational database. Concept mapping is required.

For more information about pyobvector's APIs, see pyobvector Python SDK API reference.

The following table describes the concept mappings between pyobvector's SQLAlchemy extension mode and pymilvus:

| pymilvus | pyobvector | Description |
|---|---|---|
| Database | Database | Database |
| Collection | Table | Table |
| Field | Column | Column |
| Primary Key | Primary Key | Primary key |
| Vector Field | Vector Column | Vector column |
| Index | Index | Index |
| Partition | Partition | Partition |
| DataType | DataType | Data type |
| Metric Type | Distance Function | Distance function |
| Search | Query | Query |
| Insert | Insert | Insert |
| Delete | Delete | Delete |
| Update | Update | Update |
| Batch | Batch | Batch operations |
| Transaction | Transaction | Transaction |
| NONE | Not supported | NULL value |
| BOOL | Boolean | Corresponds to the MySQL TINYINT type |
| INT8 | Boolean | Corresponds to the MySQL TINYINT type |
| INT16 | SmallInteger | Corresponds to the MySQL SMALLINT type |
| INT32 | Integer | Corresponds to the MySQL INT type |
| INT64 | BigInteger | Corresponds to the MySQL BIGINT type |
| FLOAT | Float | Corresponds to the MySQL FLOAT type |
| DOUBLE | Double | Corresponds to the MySQL DOUBLE type |
| STRING | LONGTEXT | Corresponds to the MySQL LONGTEXT type |
| VARCHAR | STRING | Corresponds to the MySQL VARCHAR type |
| JSON | JSON | For differences and similarities in JSON operations, see pyobvector Python SDK API reference. |
| FLOAT_VECTOR | VECTOR | Vector type |
| BINARY_VECTOR | Not supported | |
| FLOAT16_VECTOR | Not supported | |
| BFLOAT16_VECTOR | Not supported | |
| SPARSE_FLOAT_VECTOR | Not supported | |
| dynamic_field | Not needed | The hidden $meta metadata column in Milvus. In seekdb, you can explicitly create a JSON-type column. |
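
The data type rows of the mapping above can be captured as a small lookup table. The following is a minimal sketch for illustration only: the dictionary is transcribed from the table, while the helper name `map_type` and the choice of exceptions are assumptions and are not part of pyobvector.

```python
# Milvus DataType name -> pyobvector type name, transcribed from the
# mapping table. None marks entries the table lists as "Not supported".
MILVUS_TO_PYOBVECTOR = {
    "NONE": None,
    "BOOL": "Boolean",
    "INT8": "Boolean",
    "INT16": "SmallInteger",
    "INT32": "Integer",
    "INT64": "BigInteger",
    "FLOAT": "Float",
    "DOUBLE": "Double",
    "STRING": "LONGTEXT",
    "VARCHAR": "STRING",
    "JSON": "JSON",
    "FLOAT_VECTOR": "VECTOR",
    "BINARY_VECTOR": None,
    "FLOAT16_VECTOR": None,
    "BFLOAT16_VECTOR": None,
    "SPARSE_FLOAT_VECTOR": None,
}


def map_type(milvus_type: str) -> str:
    """Hypothetical helper: return the mapped type name or raise."""
    if milvus_type not in MILVUS_TO_PYOBVECTOR:
        raise ValueError(f"unknown Milvus type: {milvus_type}")
    target = MILVUS_TO_PYOBVECTOR[milvus_type]
    if target is None:
        raise NotImplementedError(f"{milvus_type} is not supported")
    return target
```

A lookup like this can be used to check a Milvus schema for unsupported field types before migrating it.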

Compatibility with Milvus

Milvus SDK

All operations listed in the following tables are supported. Of these, load_collection(), release_collection(), and close() are supported through SQLAlchemy.

Collection operations

| Interface | Description |
|---|---|
| create_collection() | Creates a vector table based on the given schema. |
| get_collection_stats() | Queries table statistics, such as the number of rows. |
| describe_collection() | Provides detailed metadata of a vector table. |
| has_collection() | Checks whether a table exists. |
| list_collections() | Lists existing tables. |
| drop_collection() | Drops a table. |

Field and schema definition

| Interface | Description |
|---|---|
| create_schema() | Creates a schema in memory and adds column definitions. |
| add_field() | The call sequence is: create_schema->add_field->...->add_field. You can also manually build a FieldSchema list and then use the CollectionSchema constructor to create a schema. |

Vector indexes

| Interface | Description |
|---|---|
| list_indexes() | Lists all indexes. |
| create_index() | Supports creating multiple vector indexes in a single call. First, use prepare_index_params to initialize an index parameter list object, call add_index multiple times to set multiple index parameters, and finally call create_index to create the indexes. |
| drop_index() | Drops a vector index. |
| describe_index() | Gets the metadata (schema) of an index. |

Search and data operations

| Interface | Description |
|---|---|
| search() | ANN query interface. Parameters: collection_name (the table name); data (the query vectors); filter (filtering operation, equivalent to WHERE); limit (top K); output_fields (projected columns, equivalent to SELECT); partition_names (partition names, not supported in Milvus Lite); anns_field (the index column name); search_params (vector distance function name and index algorithm-related parameters). |
| query() | Point query with filter, namely SELECT ... WHERE ids IN (..., ...) AND <filters>. |
| get() | Point query without filter, namely SELECT ... WHERE ids IN (..., ...). |
| delete() | Deletes a group of vectors, namely DELETE FROM ... WHERE ids IN (..., ...). |
| insert() | Inserts a group of vectors. |
| upsert() | Insert with update on primary key conflict. |
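
The SQL shapes given for query(), get(), and delete() can be illustrated with a small statement builder. This is a sketch under stated assumptions, not pyobvector's implementation: the function names, the `pk` parameter, and the `%s` placeholder style are hypothetical, and identifiers are interpolated without escaping, so they must come from trusted input.

```python
from typing import List, Optional, Sequence, Tuple


def build_point_query(
    table: str,
    ids: Sequence,
    output_fields: Optional[List[str]] = None,
    filter_sql: str = "",
    pk: str = "id",
) -> Tuple[str, list]:
    """Build SELECT ... WHERE pk IN (...) [AND (<filters>)] with bound ids."""
    if not ids:
        raise ValueError("ids must not be empty")
    columns = ", ".join(output_fields) if output_fields else "*"
    placeholders = ", ".join(["%s"] * len(ids))
    sql = f"SELECT {columns} FROM {table} WHERE {pk} IN ({placeholders})"
    if filter_sql:
        # query(): point lookup plus an extra filter; get(): no filter.
        sql += f" AND ({filter_sql})"
    return sql, list(ids)


def build_delete(table: str, ids: Sequence, pk: str = "id") -> Tuple[str, list]:
    """Build DELETE FROM ... WHERE pk IN (...) with bound ids."""
    if not ids:
        raise ValueError("ids must not be empty")
    placeholders = ", ".join(["%s"] * len(ids))
    return f"DELETE FROM {table} WHERE {pk} IN ({placeholders})", list(ids)
```

Calling `build_point_query` without `filter_sql` gives the get() shape, and passing a filter gives the query() shape.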

Collection metadata synchronization

| Interface | Description |
|---|---|
| load_collection() | Loads the table structure from the database to the Python application memory, enabling the application to operate the database table in an object-oriented manner. This is a standard feature of an object-relational mapping (ORM) framework. |
| release_collection() | Releases the loaded table structure from the Python application memory and releases related resources. This is a standard feature of an ORM framework for memory management. |
| close() | Closes the database connection and releases related resources. This is a standard feature of an ORM framework. |

pymilvus

Data model

The data model of Milvus comprises three levels: Shards->Partitions->Segments. Compatibility with seekdb is described as follows:

  • Shards correspond to seekdb's Partition concept.

  • Partitions currently have no direct counterpart in seekdb. Milvus allows you to partition a shard into blocks by other columns to improve read performance (shards are usually partitioned by primary key). seekdb achieves a similar effect by sorting by primary key within a partition.

  • Segments are similar to MemTable + SSTable.

Milvus Lite API compatibility

Collection operations
  1. Milvus create_collection():

    create_collection(
        collection_name: str,
        dimension: int,
        primary_field_name: str = "id",
        id_type: str = DataType,
        vector_field_name: str = "vector",
        metric_type: str = "COSINE",
        auto_id: bool = False,
        timeout: Optional[float] = None,
        schema: Optional[CollectionSchema] = None,  # Used for custom setup
        index_params: Optional[IndexParams] = None,  # Used for custom setup
        **kwargs,
    ) -> None

    seekdb compatibility is described as follows:

    • collection_name: compatible, corresponds to table_name.

    • dimension: compatible, vector(dim).

    • primary_field_name: compatible, the primary key column name.

    • id_type: compatible, the primary key column type.

    • vector_field_name: compatible, the vector column name.

    • auto_id: compatible, auto increment.

    • timeout: compatible, seekdb supports it through a hint.

    • schema: compatible.

    • index_params: compatible.

  2. Milvus get_collection_stats():

    get_collection_stats(
        collection_name: str,
        timeout: Optional[float] = None
    ) -> Dict

    seekdb compatibility is described as follows:

    • API is compatible.

    • Return value is compatible: { 'row_count': ... }.

  3. Milvus has_collection():

    has_collection(
        collection_name: str,
        timeout: Optional[float] = None
    ) -> bool

    seekdb is compatible with Milvus has_collection().

  4. Milvus drop_collection():

    drop_collection(collection_name: str) -> None

    seekdb is compatible with Milvus drop_collection().

  5. Milvus rename_collection():

    rename_collection(
        old_name: str,
        new_name: str,
        timeout: Optional[float] = None
    ) -> None

    seekdb is compatible with Milvus rename_collection().
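
To make the create_collection() parameter notes above concrete (collection_name as the table name, dimension as vector(dim), auto_id as auto increment), the following sketch renders a quick-setup call as a DDL string. It is an illustration only: the helper name `quick_setup_ddl`, the default `BIGINT` key type, and the exact DDL text are assumptions, and the statement that pyobvector actually issues may differ.

```python
def quick_setup_ddl(
    collection_name: str,
    dimension: int,
    primary_field_name: str = "id",
    id_type: str = "BIGINT",
    vector_field_name: str = "vector",
    auto_id: bool = False,
) -> str:
    """Hypothetical helper: map quick-setup arguments to a CREATE TABLE string."""
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    # primary_field_name / id_type -> primary key column name and type
    pk = f"{primary_field_name} {id_type} PRIMARY KEY"
    if auto_id:
        # auto_id -> auto-increment primary key
        pk += " AUTO_INCREMENT"
    # vector_field_name / dimension -> VECTOR(dim) column
    vec = f"{vector_field_name} VECTOR({dimension})"
    return f"CREATE TABLE {collection_name} ({pk}, {vec})"
```

Identifiers are interpolated as-is here, so the sketch is only suitable for trusted input.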

Field and schema definition
  1. Milvus create_schema():

    create_schema(
        auto_id: bool,
        enable_dynamic_field: bool,
        primary_field: str,
        partition_key_field: str,
    ) -> CollectionSchema

    seekdb compatibility is described as follows:

    • auto_id: whether the primary key column is auto-increment, compatible.

    • primary_field & partition_key_field: compatible.

  2. Milvus add_field():

    add_field(
        field_name: str,
        datatype: DataType,
        is_primary: bool,
        max_length: int,
        element_type: str,
        max_capacity: int,
        dim: int,
        is_partition_key: bool,
    )

    seekdb is compatible with Milvus add_field().

Search and data operations
  1. Milvus search():

    search(
        collection_name: str,
        data: Union[List[list], list],
        filter: str = "",
        limit: int = 10,
        output_fields: Optional[List[str]] = None,
        search_params: Optional[dict] = None,
        timeout: Optional[float] = None,
        partition_names: Optional[List[str]] = None,
        **kwargs,
    ) -> List[dict]

    seekdb compatibility is described as follows:

    • filter: string expression. For usage examples, see: Milvus Filtering Explained. It is generally similar to SQL's WHERE expression.

    • search_params:

      • metric_type: compatible.

      • radius & range_filter: used for range search (RNN), currently not supported.

      • group_by_field: groups ANN results, currently not supported.

      • max_empty_result_buckets: used for IVF series indexes, currently not supported.

      • ignore_growing: skips incremental data and reads the baseline index directly, currently not supported.

    • partition_names: partition read, supported.

    • kwargs:

      • offset: the number of records to skip in search results, currently not supported.

      • round_decimal: rounds results to specified decimal places, currently not supported.

  2. Milvus get():

    get(
        collection_name: str,
        ids: Union[list, str, int],
        output_fields: Optional[List[str]] = None,
        timeout: Optional[float] = None,
        partition_names: Optional[List[str]] = None,
        **kwargs,
    ) -> List[dict]

    seekdb is compatible with Milvus get().

  3. Milvus delete():

    delete(
        collection_name: str,
        ids: Optional[Union[list, str, int]] = None,
        timeout: Optional[float] = None,
        filter: Optional[str] = "",
        partition_name: Optional[str] = "",
        **kwargs,
    ) -> dict

    seekdb is compatible with Milvus delete().

  4. Milvus insert():

    insert(
        collection_name: str,
        data: Union[Dict, List[Dict]],
        timeout: Optional[float] = None,
        partition_name: Optional[str] = "",
    ) -> List[Union[str, int]]

    seekdb is compatible with Milvus insert().

  5. Milvus upsert():

    upsert(
        collection_name: str,
        data: Union[Dict, List[Dict]],
        timeout: Optional[float] = None,
        partition_name: Optional[str] = "",
    ) -> List[Union[str, int]]

    seekdb is compatible with Milvus upsert().
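
Because several search() options are listed above as currently not supported, it can help to check a Milvus-style call before porting it. The following is a minimal sketch of such a check. The helper name `find_unsupported` is hypothetical, the option names follow the Milvus parameter names (for example `range_filter`), and looking inside a nested `params` dictionary is an assumption about how callers pass these options.

```python
from typing import List, Optional

# Options the compatibility notes list as "currently not supported".
UNSUPPORTED_SEARCH_PARAMS = {
    "radius",
    "range_filter",
    "group_by_field",
    "max_empty_result_buckets",
    "ignore_growing",
}
UNSUPPORTED_KWARGS = {"offset", "round_decimal"}


def find_unsupported(search_params: Optional[dict] = None, **kwargs) -> List[str]:
    """Return the names of options that seekdb does not currently support."""
    params = search_params or {}
    # Assumption: options may sit at the top level or under a nested "params" dict.
    nested = params.get("params") or {}
    keys = set(params) | set(nested)
    found = sorted(UNSUPPORTED_SEARCH_PARAMS & keys)
    found += sorted(UNSUPPORTED_KWARGS & set(kwargs))
    return found
```

An empty result means the call uses only options that the notes above describe as compatible or do not mention.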

Vector indexes
  1. Milvus create_index():

    create_index(
        collection_name: str,
        index_params: IndexParams,
        timeout: Optional[float] = None,
        **kwargs,
    )

    seekdb is compatible with Milvus create_index().

  2. Milvus drop_index():

    drop_index(
        collection_name: str,
        index_name: str,
        timeout: Optional[float] = None,
        **kwargs,
    )

    seekdb is compatible with Milvus drop_index().

Compatibility with MySQL protocol

  • Request initiation: All APIs are implemented with ordinary SQL queries, so there are no compatibility issues.

  • Response result set processing: Only the new vector data elements need special handling. Currently, parsing of string and bytes elements is supported. Even if the transmission format of vector data elements changes in the future, compatibility can be maintained by updating the SDK.
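
As an illustration of handling vector elements that arrive either as a string or as bytes, the sketch below parses both forms into a list of floats. The encodings are assumptions made for the example and are not confirmed by this topic: the string form is taken to be a bracketed, comma-separated list such as "[1,2,3]", and the bytes form is taken to be packed little-endian 32-bit floats. The function name `parse_vector` is hypothetical.

```python
import json
import struct
from typing import List, Union


def parse_vector(value: Union[str, bytes, bytearray]) -> List[float]:
    """Parse a vector element returned in a result set row."""
    if isinstance(value, str):
        # Assumed text form: "[1.0,2.5,3.0]"
        return [float(x) for x in json.loads(value)]
    if isinstance(value, (bytes, bytearray)):
        # Assumed binary form: consecutive little-endian float32 values
        if len(value) % 4 != 0:
            raise ValueError("byte length is not a multiple of 4")
        count = len(value) // 4
        return list(struct.unpack(f"<{count}f", value))
    raise TypeError(f"unsupported vector element type: {type(value).__name__}")
```

Keeping the parsing in one function like this means that a change in the transmission format only requires updating that function.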