seekdb Vector integration with Ollama
seekdb can store vector data, build vector indexes, and perform embedding-based vector searches, so you can vectorize your data once, store it in seekdb, and query it in later searches.
Ollama is an open-source tool that allows you to easily run and manage large language models (LLMs) such as gpt-oss, Gemma 3, DeepSeek-R1, and Qwen3 locally.
Prerequisites
- You have deployed seekdb.
- Your environment has a usable database and account, and the account has read and write permissions on that database.
- Python 3.11 or later is installed.
- Dependencies are installed:
python3 -m pip install cffi pyseekdb requests ollama
Step 1: Obtain database connection information
Contact the seekdb deployment team or administrator to obtain the database connection string, for example:
mysql -h$host -P$port -u$user_name -p$password -D$database_name
Parameter description:
- $host: the IP address for connecting to seekdb.
- $port: the port for connecting to seekdb. The default is 2881.
- $database_name: the name of the database to access. Tip: the user needs the CREATE, INSERT, DROP, and SELECT privileges on this database.
- $user_name: the database connection account.
- $password: the account password.
Step 2: Build your AI assistant
Set environment variables
Configure the seekdb connection information obtained in Step 1 as environment variables. The variable names below match what the example code reads via os.getenv:
export SEEKDB_DATABASE_HOST=YOUR_SEEKDB_DATABASE_HOST
export SEEKDB_DATABASE_PORT=YOUR_SEEKDB_DATABASE_PORT
export SEEKDB_DATABASE_USER=YOUR_SEEKDB_DATABASE_USER
export SEEKDB_DATABASE_DB_NAME=YOUR_SEEKDB_DATABASE_DB_NAME
export SEEKDB_DATABASE_PASSWORD=YOUR_SEEKDB_DATABASE_PASSWORD
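Before running the example code, it can help to verify that the configuration is complete. The helper below is illustrative and not part of pyseekdb; the variable names mirror what the example code reads via os.getenv:

```python
import os

# Required seekdb connection variables (SEEKDB_DATABASE_PORT is optional
# because the example code falls back to the default port 2881).
REQUIRED = [
    "SEEKDB_DATABASE_HOST",
    "SEEKDB_DATABASE_USER",
    "SEEKDB_DATABASE_DB_NAME",
    "SEEKDB_DATABASE_PASSWORD",
]

def check_env(env=os.environ):
    """Raise early with a clear message if any required variable is unset."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")

# Demo with a fully populated stand-in environment:
check_env({name: "example" for name in REQUIRED})
```

Calling check_env() with no arguments checks the real process environment, so the script fails fast instead of connecting with a None host or password.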
Install Ollama and pull the Qwen3-embedding model
curl -fsSL https://ollama.ai/install.sh | sh
export PATH="/usr/local/bin:$PATH"
ollama --version
ollama pull qwen3-embedding:latest
Example code snippet
Load data
Here's an example using the Qwen3-embedding model.
import os
import ollama
import pyseekdb
from tqdm import tqdm
from pyseekdb import HNSWConfiguration
docs = [
"OceanBase seekdb (referred to as seekdb) is an AI-native search database. It unifies relational, vector, text, JSON and GIS in a single engine, enabling hybrid search and in-database AI workflows.",
"Key features include automated workload analysis to capture SQL execution frequency, latency, CPU/IO usage; intelligent index recommendation based on real query patterns; performance bottleneck diagnosis for identifying full table scans, lock contention, and execution plan regression; and support for validating optimization suggestions in test environments.",
"seekdb supports two deployment modes: private deployment for enterprises with strict data security requirements, ensuring data never leaves the internal network; and cloud-based SaaS service for quick onboarding and trial use.",
"The standard workflow includes: 1) Configure database connection with read-only access; 2) Start data collection task (time-range or continuous); 3) Run analysis engine to generate workload profiling; 4) Review diagnostic reports via Web Console; 5) Apply and validate optimizations in production after testing.",
"Typical use cases include rapid root cause identification during database performance degradation; capacity assessment before major promotions; SQL and index tuning after system go-live; and routine health checks for proactive database maintenance.",
"Currently, seekdb primarily supports OceanBase Database in MySQL mode. Future versions will extend support to Oracle mode, as well as mainstream databases including MySQL and PostgreSQL.",
"seekdb adopts a lightweight architecture composed of: Agent (for performance data collection), Analyzer Engine (core analysis and rule matching), Recommendation Module (generates index and optimization suggestions), and Web Console (visualization and report export)."
]
ids, embeddings, documents = [], [], []
def emb_text(text):
    response = ollama.embeddings(model="qwen3-embedding", prompt=text)
    return response["embedding"]

for i, text in enumerate(tqdm(docs, desc="Creating embeddings")):
    embedding = emb_text(text)
    ids.append(f"{i+1}")
    embeddings.append(embedding)
    documents.append(text)

print(f"Successfully processed {len(documents)} texts")
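Each document requires a round trip to the local Ollama server, so a transient failure partway through aborts the whole loop. A small retry wrapper can make it more robust; this is a sketch, and the retry count and delay are assumptions, not seekdb or Ollama defaults:

```python
import time

def emb_text_with_retry(text, embed_fn, retries=3, delay=1.0):
    """Retry a flaky embedding backend a few times before giving up.

    embed_fn is any callable that maps text to a vector; in this tutorial
    it would wrap ollama.embeddings(model="qwen3-embedding", prompt=text).
    """
    for attempt in range(retries):
        try:
            return embed_fn(text)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(delay)

# Demo with a stand-in backend (swap in the real Ollama call in practice):
fake_backend = lambda t: [float(len(t))] * 4
print(emb_text_with_retry("hello", fake_backend))  # [5.0, 5.0, 5.0, 5.0]
```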
Define a table and store data in seekdb
Create a table named ollama_seekdb_demo_documents and store the embeddings generated above, together with their IDs and the original texts, in seekdb:
SEEKDB_DATABASE_HOST = os.getenv('SEEKDB_DATABASE_HOST')
SEEKDB_DATABASE_PORT = int(os.getenv('SEEKDB_DATABASE_PORT', 2881))
SEEKDB_DATABASE_USER = os.getenv('SEEKDB_DATABASE_USER')
SEEKDB_DATABASE_DB_NAME = os.getenv('SEEKDB_DATABASE_DB_NAME')
SEEKDB_DATABASE_PASSWORD = os.getenv('SEEKDB_DATABASE_PASSWORD')
client = pyseekdb.Client(
    host=SEEKDB_DATABASE_HOST,
    port=SEEKDB_DATABASE_PORT,
    database=SEEKDB_DATABASE_DB_NAME,
    user=SEEKDB_DATABASE_USER,
    password=SEEKDB_DATABASE_PASSWORD
)
table_name = "ollama_seekdb_demo_documents"
config = HNSWConfiguration(dimension=4096, distance='l2')
collection = client.create_collection(
    name=table_name,
    configuration=config,
    embedding_function=None
)
print('- Inserting Data to seekdb...')
collection.add(
    ids=ids,
    embeddings=embeddings,
    documents=documents
)
Semantic search
Generate an embedding vector for the query text using the Qwen3-embedding model, then search for the most relevant documents based on the L2 distance between the query vector and each vector in the vector table:
query = 'What is seekdb?'
query_embedding = emb_text(query)
res = collection.query(
    query_embeddings=query_embedding,
    n_results=1
)
print('- The Most Relevant Document and Its Distance to the Query:')
for doc_id, document, distance in zip(
    res['ids'][0],
    res['documents'][0],
    res['distances'][0]
):
    print(f'- ID: {doc_id}')
    print(f'  content: {document}')
    print(f'  distance: {distance:.6f}')
Expected result
- ID: 1
content: OceanBase seekdb (referred to as seekdb) is an AI-native search database. It unifies relational, vector, text, JSON and GIS in a single engine, enabling hybrid search and in-database AI workflows.
distance: 0.675768
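As a sanity check, the distance in the result can be recomputed by hand. Assuming the value seekdb reports is the plain (non-squared) Euclidean distance, matching HNSWConfiguration(distance='l2'), a minimal sketch on toy vectors:

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 3-dimensional example; real qwen3-embedding vectors have 4096 dimensions.
print(l2_distance([1.0, 2.0, 2.0], [1.0, 0.0, 0.0]))  # 2.8284271247461903
```

Smaller distances mean the stored document's embedding lies closer to the query embedding, which is why n_results=1 returns the single nearest document.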