seekdb Vector integration with Ollama
seekdb provides capabilities for storing vector data, indexing vectors, and performing embedding-based vector searches. Vectorized data stored in seekdb can then be queried in subsequent searches.
Ollama is an open-source tool that allows you to easily run and manage large language models (LLMs) such as gpt-oss, Gemma 3, DeepSeek-R1, and Qwen3 locally.
Prerequisites
- You have deployed seekdb.
- Your environment has a usable database and account, and the account has read and write permissions on the database.
- Python 3.11 or later is installed.
- Dependencies are installed:
python3 -m pip install cffi pyseekdb requests ollama
Step 1: Obtain database connection information
Contact the seekdb deployment team or administrator to obtain the database connection string, for example:
mysql -h$host -P$port -u$user_name -p$password -D$database_name
Parameter description:
- $host: the IP address for connecting to seekdb.
- $port: the port for connecting to seekdb. The default is 2881.
- $database_name: the name of the database to access. Note that the user needs the CREATE, INSERT, DROP, and SELECT privileges on this database.
- $user_name: the database connection account.
- $password: the account password.
Step 2: Build your AI assistant
Set environment variables
Obtain the seekdb connection information and configure it in the environment variables.
export SEEKDB_DATABASE_HOST=YOUR_SEEKDB_DATABASE_HOST
export SEEKDB_DATABASE_PORT=YOUR_SEEKDB_DATABASE_PORT
export SEEKDB_DATABASE_USER=YOUR_SEEKDB_DATABASE_USER
export SEEKDB_DATABASE_DB_NAME=YOUR_SEEKDB_DATABASE_DB_NAME
export SEEKDB_DATABASE_PASSWORD=YOUR_SEEKDB_DATABASE_PASSWORD
Install Ollama and pull the Qwen3-embedding model
curl -fsSL https://ollama.ai/install.sh | sh
export PATH="/usr/local/bin:$PATH"
ollama --version
ollama pull qwen3-embedding:latest
Example code snippet
Load data
Here's an example using the Qwen3-embedding model. We'll use the vector search FAQ from seekdb Vector as private knowledge; each section in the file is separated by "# ".
import os

import ollama
import pyseekdb
import requests
from tqdm import tqdm
from pyseekdb import HNSWConfiguration
# text URL
url = "https://raw.githubusercontent.com/oceanbase/oceanbase-doc/refs/heads/V4.3.5/en-US/640.ob-vector-search/800.ob-vector-search-faq.md"
response = requests.get(url)
text_lines = []
file_text = response.text
text_lines += file_text.split("# ")
def emb_text(text):
response = ollama.embeddings(model="qwen3-embedding", prompt=text)
return response["embedding"]
ids = []
embeddings = []
documents = []
for i, text in enumerate(tqdm(text_lines, desc="Creating embeddings")):
# Generate the embedding for the text via Ollama API
embedding = emb_text(text)
ids.append(f"{i+1}")
embeddings.append(embedding)
documents.append(text)
print(f"Successfully processed {len(documents)} texts")
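The chunking step above simply splits the downloaded Markdown text on "# " headings. A minimal self-contained sketch of that step, using an inline hypothetical sample instead of the downloaded FAQ file:

```python
# Sketch of the chunking step with an inline sample (hypothetical content).
sample = "# First question\nAnswer one.\n# Second question\nAnswer two.\n"

# Splitting on "# " yields one chunk per heading; filtering drops the
# empty piece before the first heading.
chunks = [c for c in sample.split("# ") if c.strip()]

print(len(chunks))                      # → 2
print(chunks[0].splitlines()[0])        # → First question
```

Filtering out empty chunks (as above) also avoids generating embeddings for blank strings, which the loop in the example would otherwise do.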
Define a table and store data in seekdb
Create a table named ollama_seekdb_demo_documents and store the embeddings generated in the previous step in seekdb:
SEEKDB_DATABASE_HOST = os.getenv('SEEKDB_DATABASE_HOST')
SEEKDB_DATABASE_PORT = int(os.getenv('SEEKDB_DATABASE_PORT', 2881))
SEEKDB_DATABASE_USER = os.getenv('SEEKDB_DATABASE_USER')
SEEKDB_DATABASE_DB_NAME = os.getenv('SEEKDB_DATABASE_DB_NAME')
SEEKDB_DATABASE_PASSWORD = os.getenv('SEEKDB_DATABASE_PASSWORD')
client = pyseekdb.Client(
    host=SEEKDB_DATABASE_HOST,
    port=SEEKDB_DATABASE_PORT,
    database=SEEKDB_DATABASE_DB_NAME,
    user=SEEKDB_DATABASE_USER,
    password=SEEKDB_DATABASE_PASSWORD
)
table_name = "ollama_seekdb_demo_documents"
config = HNSWConfiguration(dimension=4096, distance='l2')
collection = client.create_collection(
name=table_name,
configuration=config,
embedding_function=None
)
print('- Inserting Data to seekdb...')
collection.add(
ids=ids,
embeddings=embeddings,
documents=documents
)
Semantic search
Generate an embedding vector for the query text using the Qwen3-embedding model, then search for the most relevant documents based on the L2 distance between the query vector and each vector in the vector table:
query = 'Which mode supports vector search?'
query_embedding = emb_text(query)
res = collection.query(
query_embeddings=query_embedding,
n_results=1
)
print('- The Most Relevant Document and Its Distance to the Query:')
for doc_id, document, distance in zip(
    res['ids'][0],
    res['documents'][0],
    res['distances'][0]
):
print(f'- ID: {doc_id}')
print(f' content: {document}')
print(f' distance: {distance:.6f}')
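The ranking is based on L2 (Euclidean) distance, matching the distance='l2' setting used when the table was created: a smaller distance means a more relevant document. For reference, the distance between two vectors can be computed as:

```python
import math

def l2_distance(a, b):
    # Euclidean (L2) distance between two equal-length vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Small 3-dimensional example (the real embeddings here are 4096-dimensional)
print(l2_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # → 1.4142135623730951
```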
Expected result
- ID: 3
content: In which mode is vector search supported?
Vector search is supported in the MySQL-compatible mode of OceanBase Database. It is not supported in the Oracle-compatible mode.
distance: 0.547024