Version: V1.1.0

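Integrate seekdb Vector with Ollama

seekdb provides vector storage, vector indexing, and embedding-based vector search. You can store vectorized data in seekdb and use it in subsequent searches.

Ollama is an open-source tool for running and managing large language models locally (for example, gpt-oss, Gemma 3, DeepSeek-R1, and Qwen3).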

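Prerequisites

You have deployed seekdb.
A database and an account are available in your environment, and the account has been granted read and write privileges on the database.
Python 3.11 or later is installed.

Install the dependencies.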

python3 -m pip install cffi pyseekdb requests ollama tqdm

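Step 1: Obtain the database connection information

Contact the seekdb deployment personnel or administrator to obtain the database connection string. For example: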

mysql -h$host -P$port -u$user_name -p$password -D$database_name

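Parameters:

$host: the IP address for connecting to seekdb.
$port: the port for connecting to seekdb. The default is 2881.
$database_name: the name of the database to access.
$user_name: the database account.
$password: the password of the account.

Tip
The connecting user needs the CREATE, INSERT, DROP, and SELECT privileges on the database.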

Step 2: Build your AI assistant

Set environment variables

Obtain the seekdb connection information and set it in environment variables.

export SEEKDB_DATABASE_HOST=YOUR_SEEKDB_DATABASE_HOST
export SEEKDB_DATABASE_PORT=YOUR_SEEKDB_DATABASE_PORT
export SEEKDB_DATABASE_USER=YOUR_SEEKDB_DATABASE_USER
export SEEKDB_DATABASE_DB_NAME=YOUR_SEEKDB_DATABASE_DB_NAME
export SEEKDB_DATABASE_PASSWORD=YOUR_SEEKDB_DATABASE_PASSWORD
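
The sample code later on this page reads its connection settings with os.getenv, so an unset variable only shows up as a connection error. As a minimal sketch, the check below reports which of those variables are missing before anything connects. The helper name missing_seekdb_vars and the REQUIRED_VARS tuple are hypothetical and not part of pyseekdb. The port is treated as optional because the sample code falls back to 2881.

```python
import os

# Variable names read by the sample code on this page.
# SEEKDB_DATABASE_PORT is left out because the sample defaults it to 2881.
REQUIRED_VARS = (
    "SEEKDB_DATABASE_HOST",
    "SEEKDB_DATABASE_USER",
    "SEEKDB_DATABASE_DB_NAME",
    "SEEKDB_DATABASE_PASSWORD",
)


def missing_seekdb_vars(env=None):
    """Return the required names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_seekdb_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All seekdb connection variables are set.")
```

Passing a plain dict instead of os.environ makes the helper easy to exercise without touching the real environment.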

Install Ollama and pull the qwen3-embedding model

curl -fsSL https://ollama.ai/install.sh | sh
export PATH="/usr/local/bin:$PATH"
ollama --version
ollama pull qwen3-embedding:latest
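
If you prefer to confirm the installation from Python before running the sample, the following sketch wraps the same ollama --version command shown above. The function name ollama_version is hypothetical. It only checks that the CLI is on PATH, not that the Ollama server is running or that the model has been pulled.

```python
import shutil
import subprocess


def ollama_version():
    """Return the output of `ollama --version`, or None if the CLI is not on PATH."""
    path = shutil.which("ollama")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some builds print warnings to stderr, so fall back to it if stdout is empty.
    return (result.stdout or result.stderr).strip()


print(ollama_version() or "ollama not found on PATH")
```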

Sample code snippets

Load data

This example uses the qwen3-embedding embedding model.

import requests, os, json, ollama, pyseekdb
from tqdm import tqdm
from pyseekdb import HNSWConfiguration

docs = [
    "OceanBase seekdb (referred to as seekdb) is an AI-native search database.  It unifies relational, vector, text, JSON and GIS in a single engine, enabling hybrid search and in-database AI workflows.",
    
    "Key features include automated workload analysis to capture SQL execution frequency, latency, CPU/IO usage; intelligent index recommendation based on real query patterns; performance bottleneck diagnosis for identifying full table scans, lock contention, and execution plan regression; and support for validating optimization suggestions in test environments.",
    
    "seekdb supports two deployment modes: private deployment for enterprises with strict data security requirements, ensuring data never leaves the internal network; and cloud-based SaaS service for quick onboarding and trial use.",
    
    "The standard workflow includes: 1) Configure database connection with read-only access; 2) Start data collection task (time-range or continuous); 3) Run analysis engine to generate workload profiling; 4) Review diagnostic reports via Web Console; 5) Apply and validate optimizations in production after testing.",
    
    "Typical use cases include rapid root cause identification during database performance degradation; capacity assessment before major promotions; SQL and index tuning after system go-live; and routine health checks for proactive database maintenance.",
    
    "Currently, seekdb primarily supports OceanBase Database in MySQL mode. Future versions will extend support to Oracle mode, as well as mainstream databases including MySQL and PostgreSQL.",
    
    "seekdb adopts a lightweight architecture composed of: Agent (for performance data collection), Analyzer Engine (core analysis and rule matching), Recommendation Module (generates index and optimization suggestions), and Web Console (visualization and report export)."
]

# Parallel lists that are passed to collection.add() in the next step.
ids, embeddings, documents = [], [], []

def emb_text(text):
    """Return the embedding vector for text from the local qwen3-embedding model."""
    response = ollama.embeddings(model="qwen3-embedding", prompt=text)
    return response["embedding"]

for i, text in enumerate(tqdm(docs, desc="Creating embeddings")):
    embedding = emb_text(text)
    ids.append(str(i + 1))  # IDs are 1-based strings
    embeddings.append(embedding)
    documents.append(text)

print(f"Successfully processed {len(documents)} texts")
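
The vector index created in the next step is configured with dimension=4096, so it can help to confirm that every embedding has that length before inserting. The sketch below is one way to do this. The helper name check_embedding_dims is hypothetical, and the toy 3-dimensional vectors stand in for real model output.

```python
def check_embedding_dims(embeddings, expected_dim):
    """Raise ValueError if any vector's length differs from expected_dim.

    Hypothetical helper; returns the number of vectors checked.
    """
    for index, vector in enumerate(embeddings):
        if len(vector) != expected_dim:
            raise ValueError(
                f"embedding {index} has {len(vector)} dimensions, "
                f"expected {expected_dim}"
            )
    return len(embeddings)


# Toy 3-dimensional vectors stand in for real embeddings.
toy = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
print(check_embedding_dims(toy, expected_dim=3))
```

In the sample on this page, the equivalent call would be check_embedding_dims(embeddings, 4096), placed before the data is inserted.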

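Define the table and store the data in seekdb

Create a table named ollama_seekdb_demo_documents, and store each text in seekdb together with the embedding vector that qwen3-embedding generated for it: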

# Connection settings are read from the environment; the port defaults to 2881.
SEEKDB_DATABASE_HOST = os.getenv('SEEKDB_DATABASE_HOST')
SEEKDB_DATABASE_PORT = int(os.getenv('SEEKDB_DATABASE_PORT', 2881))
SEEKDB_DATABASE_USER = os.getenv('SEEKDB_DATABASE_USER')
SEEKDB_DATABASE_DB_NAME = os.getenv('SEEKDB_DATABASE_DB_NAME')
SEEKDB_DATABASE_PASSWORD = os.getenv('SEEKDB_DATABASE_PASSWORD')

client = pyseekdb.Client(
    host=SEEKDB_DATABASE_HOST,
    port=SEEKDB_DATABASE_PORT,
    database=SEEKDB_DATABASE_DB_NAME,
    user=SEEKDB_DATABASE_USER,
    password=SEEKDB_DATABASE_PASSWORD,
)
table_name = "ollama_seekdb_demo_documents"
# dimension should match the length of the vectors returned by emb_text().
config = HNSWConfiguration(dimension=4096, distance='l2')
collection = client.create_collection(
    name=table_name,
    configuration=config,
    embedding_function=None  # embeddings are passed explicitly to collection.add()
)

print('- Inserting Data to seekdb...')
collection.add(
    ids=ids,
    embeddings=embeddings,
    documents=documents
)

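Semantic search

Use the qwen3-embedding embedding model to generate an embedding vector for the query text, then find the most relevant document by the l2 distance between the query embedding and each embedding vector in the vector table: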

query = 'What is seekdb?'
query_embedding = emb_text(query)

res = collection.query(
    query_embeddings=query_embedding,
    n_results=1
)

print('- The Most Relevant Document and Its Distance to the Query:')
for doc_id, document, distance in zip(
    res['ids'][0],
    res['documents'][0],
    res['distances'][0]
):
    print(f'- ID: {doc_id}')
    print(f'    content: {document}')
    print(f'    distance: {distance:.6f}')

Expected result

- ID: 1
    content: OceanBase seekdb (referred to as seekdb) is an AI-native search database.  It unifies relational, vector, text, JSON and GIS in a single engine, enabling hybrid search and in-database AI workflows.
    distance: 0.675768
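
To make the ranking criterion concrete, the sketch below computes the l2 (Euclidean) distance in plain Python and returns the closest vectors by exhaustive comparison. It illustrates the metric only. It is not how an HNSW index searches internally, since HNSW is an approximate method, and it makes no claim about how seekdb scales or reports the distance value. The function names l2_distance and nearest and the toy 2-dimensional vectors are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, vectors, n_results=1):
    """Return (index, distance) pairs for the n_results closest vectors."""
    scored = sorted((l2_distance(query, v), i) for i, v in enumerate(vectors))
    return [(i, d) for d, i in scored[:n_results]]


# Toy 2-dimensional vectors; real embeddings in this page have 4096 dimensions.
vectors = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
print(nearest([0.9, 1.2], vectors))
```

A smaller distance means a closer match, which is why the sample query with n_results=1 returns the single document whose embedding lies nearest to the query embedding.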
