query - Vector query
The query() method performs a vector similarity search and returns the documents that are most similar to the query vector.
This interface is only available when using the Client. For more information about the Client, see Client.
Prerequisites
- You have installed pyseekdb. For more information about how to install pyseekdb, see Get Started.
- You have connected to the database. For more information about how to connect to the database, see Client.
- You have created a collection and inserted data. For more information about how to create a collection and insert data, see create_collection - Create a collection and add - Insert data.
Request parameters
query()
| Parameter | Value type | Required | Description | Example value |
|---|---|---|---|---|
| query_embeddings | List[float] or List[List[float]] | No | A single vector, or a list of vectors for a batch query. If provided, it is used directly and the embedding_function is ignored. If not provided, query_texts must be provided and the collection must have an embedding_function. | [1.0, 2.0, 3.0] |
| query_texts | str or List[str] | No | A single text, or a list of texts for a batch query. The collection's embedding_function converts the texts to query vectors, so the collection must have an embedding_function. If not provided, query_embeddings must be provided. | ["my query text"] |
| n_results | int | No | The number of similar results to return. The default is 10. | 3 |
| where | dict | No | Metadata filter conditions. | {"category": {"$eq": "AI"}} |
| where_document | dict | No | Document filter conditions. | {"$contains": "machine"} |
| include | List[str] | No | The fields to include in the result: "documents", "metadatas", and "embeddings". | ["documents", "metadatas", "embeddings"] |
The embedding_function used is associated with the collection (set during create_collection() or get_collection()). You cannot override it for each operation.
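The rule in the table above, that a query needs either query_embeddings or query_texts, can be checked before the call is made. The following is a minimal sketch, not part of pyseekdb: build_query_kwargs is a hypothetical helper name, and preferring query_embeddings when both are given is an assumption based on the description that provided vectors are used directly.

```python
from typing import Any, Dict, List, Optional, Sequence, Union

Vector = Sequence[float]


def build_query_kwargs(
    query_embeddings: Optional[Union[Vector, List[Vector]]] = None,
    query_texts: Optional[Union[str, List[str]]] = None,
    n_results: int = 10,
    **options: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for collection.query()."""
    if query_embeddings is None and query_texts is None:
        raise ValueError("Provide query_embeddings or query_texts")

    kwargs: Dict[str, Any] = {"n_results": n_results}
    if query_embeddings is not None:
        # Assumption: vectors take precedence, since they are used directly.
        kwargs["query_embeddings"] = query_embeddings
    else:
        # Texts rely on the embedding_function attached to the collection.
        kwargs["query_texts"] = query_texts

    # Pass through optional arguments such as where, where_document, include.
    kwargs.update({k: v for k, v in options.items() if v is not None})
    return kwargs
```

The returned dictionary can then be unpacked into the call, for example collection.query(**build_query_kwargs(query_texts="my query text", n_results=3)).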
Request example
import pyseekdb

# Create a client
client = pyseekdb.Client()
collection = client.get_collection("my_collection")
collection1 = client.get_collection("my_collection1")

# Basic vector similarity query (embedding_function not used)
results = collection.query(
    query_embeddings=[1.0, 2.0, 3.0],
    n_results=3
)

# Iterate over results
for i in range(len(results["ids"][0])):
    print(f"ID: {results['ids'][0][i]}, Distance: {results['distances'][0][i]}")
    if results.get("documents"):
        print(f"Document: {results['documents'][0][i]}")
    if results.get("metadatas"):
        print(f"Metadata: {results['metadatas'][0][i]}")

# Query by texts - vectors auto-generated by embedding_function
# Requires: collection must have embedding_function set
results = collection1.query(
    query_texts=["my query text"],
    n_results=10
)
# The collection's embedding_function will automatically convert query_texts to query_embeddings

# Query by multiple texts (batch query)
results = collection1.query(
    query_texts=["query text 1", "query text 2"],
    n_results=5
)
# Returns dict with lists of lists, one list per query text
for i in range(len(results["ids"])):
    print(f"Query {i}: {len(results['ids'][i])} results")

# Query with metadata filter (using query_texts)
results = collection1.query(
    query_texts=["AI research"],
    where={"category": {"$eq": "AI"}},
    n_results=5
)

# Query with comparison operator (using query_texts)
results = collection1.query(
    query_texts=["machine learning"],
    where={"score": {"$gte": 90}},
    n_results=5
)

# Query with document filter (using query_texts)
results = collection1.query(
    query_texts=["neural networks"],
    where_document={"$contains": "machine learning"},
    n_results=5
)

# Query with combined filters (using query_texts)
results = collection1.query(
    query_texts=["AI research"],
    where={"category": {"$eq": "AI"}, "score": {"$gte": 90}},
    where_document={"$contains": "machine"},
    n_results=5
)

# Query with multiple vectors (batch query)
results = collection.query(
    query_embeddings=[[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]],
    n_results=2
)
# Returns dict with lists of lists, one list per query vector
for i in range(len(results["ids"])):
    print(f"Query {i}: {len(results['ids'][i])} results")

# Query with specific fields
results = collection.query(
    query_embeddings=[1.0, 2.0, 3.0],
    include=["documents", "metadatas", "embeddings"],
    n_results=3
)
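To make the filter dictionaries in the examples above easier to read, the sketch below evaluates them locally against a single record. It is an illustration only and does not reflect how the database applies filters. It assumes that conditions on several keys are combined with AND and that $contains is a case-sensitive substring match, neither of which is stated on this page. The function names matches_where and matches_where_document are hypothetical.

```python
from typing import Any, Dict


def matches_where(metadata: Dict[str, Any], where: Dict[str, Dict[str, Any]]) -> bool:
    """Local illustration: test one metadata dict against a where filter."""
    for key, condition in where.items():
        value = metadata.get(key)
        for op, expected in condition.items():
            if op == "$eq":
                ok = value == expected
            elif op == "$gte":
                ok = value is not None and value >= expected
            else:
                raise ValueError(f"operator not covered by this sketch: {op}")
            if not ok:
                # Assumption: every condition must hold (logical AND).
                return False
    return True


def matches_where_document(document: str, where_document: Dict[str, str]) -> bool:
    """Local illustration: assumed case-sensitive substring match for $contains."""
    return where_document["$contains"] in document


record = {"category": "AI", "score": 95}
print(matches_where(record, {"category": {"$eq": "AI"}, "score": {"$gte": 90}}))
print(matches_where_document("Intro to machine learning", {"$contains": "machine"}))
```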
Return parameters
| Parameter | Value type | Required | Description | Example value |
|---|---|---|---|---|
| ids | List[List[str]] | Yes | The IDs of the matching records, with one inner list per query. | [["vec1", "vec2"]] |
| embeddings | List[List[List[float]]] | No | The vectors of the matching records. | [[[0.1, 0.2, 0.3]]] |
| documents | List[List[str]] | No | The document text of the matching records. | [["Document text"]] |
| metadatas | List[List[Dict]] | No | The metadata of the matching records. | [[{"category": "AI"}]] |
| distances | List[List[float]] | No | The distance between each query vector and each matching record. | [[0.0, 0.025368153802923787]] |
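Because each returned field is a list of lists with one inner list per query, it can be convenient to regroup the result into one record per match. The following is a minimal sketch that works on a plain dictionary shaped like the return value. rows_per_query is a hypothetical helper, not a pyseekdb function, and the sample data mirrors the values in the return example on this page.

```python
from typing import Any, Dict, List

# Map the plural result keys to singular record keys.
FIELDS = {
    "documents": "document",
    "metadatas": "metadata",
    "distances": "distance",
    "embeddings": "embedding",
}


def rows_per_query(results: Dict[str, Any]) -> List[List[Dict[str, Any]]]:
    """Hypothetical helper: regroup a query() result into records per query."""
    grouped = []
    for q, ids in enumerate(results["ids"]):
        records = []
        for i, record_id in enumerate(ids):
            record = {"id": record_id}
            for source_key, record_key in FIELDS.items():
                values = results.get(source_key)
                if values:  # the field may be absent or None
                    record[record_key] = values[q][i]
            records.append(record)
        grouped.append(records)
    return grouped


sample = {
    "ids": [["vec1", "vec2"]],
    "distances": [[0.0, 0.025368153802923787]],
    "documents": [[None, None]],
    "metadatas": [[{}, {}]],
}
for record in rows_per_query(sample)[0]:
    print(record["id"], record["distance"])
```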
Return example
ID: vec1, Distance: 0.0
Document: None
Metadata: {}
ID: vec2, Distance: 0.025368153802923787
Document: None
Metadata: {}
Query 0: 4 results
Query 1: 4 results
Query 0: 2 results
Query 1: 2 results