get_or_create_collection - Create or query a collection
The get_or_create_collection() function returns a collection by name, creating it first if necessary. If the collection does not exist in the database, it is created. If it already exists, the existing collection is returned.
This API is only available when using a client. For more information about the client, see Client.
Prerequisites
- You have installed pyseekdb. For more information about how to install pyseekdb, see Quick Start.
- You have connected to the database. For more information about how to connect, see Client.
- If you are using seekdb in server mode or OceanBase Database, ensure that the connected user has the CREATE privilege. For more information about how to check the privileges of the current user, see Check User Privileges. If the user does not have this privilege, contact the administrator to grant it. For more information about how to directly grant privileges, see Directly Grant Privileges.
Define the table name
When creating a table, you must first define its name. The following requirements apply:
- In seekdb, each table name must be unique within the database.
- The table name must be between 64 and 512 characters.
- We recommend that you give the table a meaningful name instead of using generic names such as t1 or table1. For more information about table naming conventions, see Table naming conventions.
Request parameters
get_or_create_collection(name = name, configuration = configuration, embedding_function = embedding_function)
| Parameter | Type | Required | Description | Example value |
|---|---|---|---|---|
| name | string | Yes | The name of the collection to create or retrieve. | my_collection |
| configuration | Configuration or HNSWConfiguration | No | Index configuration (dimension and distance metric). If not provided, defaults to dimension=384, distance='cosine', and analyzer='ik'. If set to None, the dimension is derived from embedding_function; if embedding_function is also None, a ValueError is raised. | HNSWConfiguration(dimension=384, distance='cosine') |
| embedding_function | EmbeddingFunction | No | The function used to convert data into vectors. If not provided, DefaultEmbeddingFunction() (384 dimensions) is used. If set to None, the collection has no embedding function and the dimension is taken from configuration.dimension. | DefaultEmbeddingFunction() |
When configuration is of type Configuration, you can use the following fields:
| Parameter | Type | Required | Description | Example value |
|---|---|---|---|---|
| hnsw | HNSWConfiguration | No | HNSW index configuration. | HNSWConfiguration |
| fulltext_config | FulltextIndexConfig | No | Full-text index configuration. Includes analyzer and properties (both strings). If not provided, the default is analyzer='ik'. For details, see the parser_properties section in CREATE INDEX. | FulltextIndexConfig(analyzer='ik', properties={'ik_mode':'max_words'}) |
When embedding_function is provided, the system will automatically calculate the vector dimension by calling the function. If configuration.dimension is also provided, it must match the dimension of embedding_function, otherwise a ValueError will be raised.
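The interaction between configuration.dimension and embedding_function can be easier to follow as code. The following is a minimal, standalone sketch of the rules described above, not pyseekdb's actual implementation. The helper names resolve_dimension and probe_dimension, the toy embedding function, and the assumption that an embedding function maps a list of strings to a list of vectors are all hypothetical and used only for illustration.

```python
def probe_dimension(embedding_function):
    """Assumed probing strategy: embed one short string and measure the vector length."""
    vectors = embedding_function(["probe"])
    return len(vectors[0])


def resolve_dimension(config_dimension, embedding_function):
    """Return the vector dimension implied by the documented rules.

    config_dimension stands in for configuration.dimension (None if no
    configuration is given); embedding_function is a callable or None.
    """
    if embedding_function is None:
        # No embedding function: the dimension must come from the configuration.
        if config_dimension is None:
            raise ValueError("configuration and embedding_function are both None")
        return config_dimension

    # Embedding function provided: its output decides the dimension.
    fn_dimension = probe_dimension(embedding_function)
    if config_dimension is not None and config_dimension != fn_dimension:
        raise ValueError(
            f"configuration.dimension={config_dimension} does not match "
            f"embedding function dimension={fn_dimension}"
        )
    return fn_dimension


def toy_embedding(texts):
    """Hypothetical 3-dimensional embedding function, for illustration only."""
    return [[0.0, 0.0, 0.0] for _ in texts]


print(resolve_dimension(3, toy_embedding))     # dimensions agree
print(resolve_dimension(None, toy_embedding))  # derived from the function
print(resolve_dimension(128, None))            # taken from the configuration
```

In this sketch, resolve_dimension(384, toy_embedding) and resolve_dimension(None, None) both raise ValueError, mirroring the two error cases described in the parameter tables.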
Request example
import pyseekdb
from pyseekdb import (
    Configuration,
    DefaultEmbeddingFunction,
    FulltextIndexConfig,
    HNSWConfiguration,
)

# Create a client
client = pyseekdb.Client()

# Get or create a collection (created if it doesn't exist)
collection = client.get_or_create_collection(
    name="my_collection4",
    configuration=HNSWConfiguration(dimension=384, distance='cosine'),
    embedding_function=DefaultEmbeddingFunction()
)

# Combine HNSW and full-text index settings; with embedding_function=None,
# the dimension comes from the configuration
config = Configuration(
    hnsw=HNSWConfiguration(dimension=384),
    fulltext_config=FulltextIndexConfig(analyzer='ik', properties={'ik_mode': 'max_words'})
)
collection = client.get_or_create_collection(
    name="my_collection5",
    configuration=config,
    embedding_function=None
)
Response parameters
None