Build a knowledge base desktop application based on seekdb
This tutorial guides you through building MineKB (Mine Knowledge Base), a personal local knowledge base desktop application built on seekdb, and demonstrates how to implement intelligent Q&A with vector search and large language models.
Overview
Core features of the application:
- Multi-project management: Supports creating multiple independent knowledge base projects.
- Document processing: Supports multiple formats such as TXT, MD, PDF, DOC, DOCX, and RTF, with automatic text extraction and vectorization.
- Intelligent search: Based on seekdb's vector indexes (HNSW), enabling efficient semantic search.
- Conversational Q&A: Query the knowledge base through AI conversations to obtain accurate answers based on document content.
- Local storage: All data is stored locally to protect privacy and security.
Reasons for choosing seekdb:
- Embedded deployment: Embedded as a library in the application, no independent service required.
- Native vector support: Built-in vector type and HNSW indexes, improving vector search performance by 10-100x.
- All-in-One: Supports transactions, analytics, and vector search simultaneously, meeting all requirements with one database.
- SQL interface: Standard SQL syntax, developer-friendly.
Prerequisites
Environment requirements
The following environment is required for developing and running the knowledge base desktop application:
- Operating system: Linux; Ubuntu 20.04+ recommended.
- Node.js: 16.x or later (used for frontend development); 18.x LTS recommended.
- Rust: 1.70 or later (required by Tauri); 1.75+ recommended.
- Python: 3.x; 3.9+ recommended.
Technology stack and dependencies
- Frontend technology stack (see package.json for details):
  - @tauri-apps/api: Tauri frontend API for calling Rust commands
  - @radix-ui/*: Accessible UI component library
  - react-markdown: Markdown rendering
  - react-syntax-highlighter: Code highlighting
  - lucide-react: Icon library
- Backend technology stack (see Cargo.toml for details):
  - tauri: Tauri framework core
  - tokio: Async runtime
  - reqwest: HTTP client (for calling AI APIs)
  - pdf-extract, docx-rs: Document parsing
  - nalgebra: Vector computation
- Python dependencies (see requirements.txt for details):
  - seekdb==0.0.1.dev4
Install seekdb
Ensure seekdb is installed and verify the installation:
pip install seekdb -i https://pypi.tuna.tsinghua.edu.cn/simple/
# Verify installation:
python3 -c "import seekdb; print(seekdb.__version__)"
API key configuration
MineKB uses the Alibaba Cloud Bailian API for embedding and LLM services. Register an Alibaba Cloud Bailian account, enable model services, and obtain an API key.
After obtaining the API key, fill it into the configuration file src-tauri/config.json:
{
"api": {
"dashscope": {
"api_key": "<sk-your-api-key-here>",
"base_url": "https://dashscope.aliyuncs.com/api/v1",
"embedding_model": "text-embedding-v1",
"chat_model": "qwen-plus"
}
},
"database": {
"path": "./mine_kb.db",
"name": "mine_kb"
}
}
- Qwen LLM provides a certain amount of free usage quota. Monitor your free quota usage, because exceeding it will incur charges.
- This tutorial uses Qwen LLM as an example of how to build a Q&A bot. You can also use other LLMs; if you do, update the apiKey, model, and baseUrl parameters in the src-tauri/config.example.json file.
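For reference, here is a minimal sketch of reading the configuration shown above from Python. The key names match the config.json example in this section; the load_config helper is hypothetical and is not part of the MineKB codebase, which reads this file from the Rust backend.

```python
import json

# Hypothetical helper: MineKB's Rust backend reads this file itself; this sketch only
# shows how the same configuration could be read from Python.
def load_config(path: str = "src-tauri/config.json") -> dict:
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

config = load_config()
dashscope = config["api"]["dashscope"]

api_key = dashscope["api_key"]                  # e.g. "sk-..."
base_url = dashscope["base_url"]                # "https://dashscope.aliyuncs.com/api/v1"
embedding_model = dashscope["embedding_model"]  # "text-embedding-v1"
chat_model = dashscope["chat_model"]            # "qwen-plus"
db_path = config["database"]["path"]            # "./mine_kb.db"
```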
Run the application locally
Step 1: Build and start
- Clone the project and install dependencies.
  # Clone the project
  git clone https://github.com/ob-labs/mine-kb.git
  cd mine-kb
  # Install frontend dependencies
  npm install
  # Install Python dependencies
  pip install seekdb==0.0.1.dev4 -i https://pypi.tuna.tsinghua.edu.cn/simple/
- Configure the API key.
  cp src-tauri/config.example.json src-tauri/config.json
  # Edit the configuration file and fill in your API key
  nano src-tauri/config.json
- Start the application.
  npm run tauri:dev
When a user starts the MineKB application, the system executes the following initialization flow in sequence:
- Application initialization (see src-tauri/src/main.rs for code details):
  - Initialize the logging system
  - Determine the application data directory
  - Load the configuration file
  - Initialize the Python environment
  - Initialize the seekdb database
  - Initialize the database schema
  - Create application state
  - Start the Tauri application
- Frontend initialization (see src/main.tsx for code details):
  - Mount the React application
  - Call the list_projects command to get the project list
  - Render the project panel and conversation panel
  - Wait for user operations
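To make the seekdb initialization steps above concrete, here is a minimal Python sketch of creating the schema. The projects columns match the INSERT shown later in this tutorial, and the chunks table carries a 1536-dimensional embedding with an HNSW index as described; however, the seekdb.connect() call, the VECTOR type, and the vector-index DDL are assumptions about seekdb's API and SQL syntax, so check the seekdb documentation for the exact form.

```python
import seekdb  # the embedded database used throughout this tutorial

# Assumption: seekdb exposes a DB-API-style connect(); the real entry point may differ.
conn = seekdb.connect("./mine_kb.db")
cursor = conn.cursor()

# Projects table, matching the columns used in the INSERT shown later in this tutorial.
cursor.execute("""
    CREATE TABLE IF NOT EXISTS projects (
        id             VARCHAR(36) PRIMARY KEY,
        name           VARCHAR(255) NOT NULL,
        description    TEXT,
        status         VARCHAR(16),
        document_count INT,
        created_at     VARCHAR(32),
        updated_at     VARCHAR(32)
    )
""")

# Document chunks with a 1536-dimensional embedding column.
# Assumption: the VECTOR type and the HNSW vector-index DDL follow OceanBase-style
# syntax; check the seekdb documentation for the exact form your version supports.
cursor.execute("""
    CREATE TABLE IF NOT EXISTS chunks (
        id         VARCHAR(36) PRIMARY KEY,
        project_id VARCHAR(36),
        content    TEXT,
        embedding  VECTOR(1536),
        VECTOR INDEX idx_chunk_embedding (embedding) WITH (distance=cosine, type=hnsw)
    )
""")

conn.commit()
```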
Step 2: Create a knowledge base
We recommend using the seekdb documentation as test content for your first knowledge base.
After the user clicks the Create Project button, the system executes the following flow:
- Frontend interaction: see ProjectPanel.tsx for code details.
- Backend processing: see commands/projects.rs for code details.
- Database operations: see services/project_service.rs for code details.
- Database layer (seekdb_adapter.rs → Python Bridge → seekdb), code as follows:
# Python bridge receives command
{
"command": "execute",
"params": {
"sql": "INSERT INTO projects (...) VALUES (?, ?, ?, ?, ?, ?, ?)",
"values": ["uuid-here", "My Project", "Description", "active", 0, "2025-11-05T...", "2025-11-05T..."]
}
}
# Convert to seekdb SQL
cursor.execute("""
INSERT INTO projects (id, name, description, status, document_count, created_at, updated_at)
VALUES ('uuid-here', 'My Project', 'Description', 'active', 0, '2025-11-05T...', '2025-11-05T...')
""")
conn.commit()
# Return success response
{
"status": "success",
"data": null
}
In summary, creating a knowledge base performs the following tasks:
- Generate a unique project ID (UUID v4).
- Validate the project name (non-empty, no duplicates).
- Initialize the project status as Active.
- Record creation time and update time.
- Write project information to the projects table in seekdb.
- Commit the transaction to ensure data persistence.
- Return project information to the frontend.
- Frontend updates the project list and displays the new project.
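Pulling the list above together, here is a minimal Python sketch of the create-project path: generate a UUID, validate the name, insert into the projects table, and commit. It assumes a DB-API-style connection like the one used in the bridge code above and skips the Rust/Tauri layers; it is illustrative, not the actual MineKB implementation.

```python
import uuid
from datetime import datetime, timezone

def create_project(conn, name: str, description: str = "") -> dict:
    """Sketch of the create-project flow (illustrative, not the actual MineKB code)."""
    if not name.strip():
        raise ValueError("Project name must not be empty")

    cursor = conn.cursor()

    # Reject duplicate project names.
    cursor.execute("SELECT COUNT(*) FROM projects WHERE name = ?", (name,))
    if cursor.fetchone()[0] > 0:
        raise ValueError(f"Project '{name}' already exists")

    now = datetime.now(timezone.utc).isoformat()
    project = {
        "id": str(uuid.uuid4()),       # unique project ID (UUID v4)
        "name": name,
        "description": description,
        "status": "active",
        "document_count": 0,
        "created_at": now,
        "updated_at": now,
    }

    cursor.execute(
        """
        INSERT INTO projects (id, name, description, status, document_count, created_at, updated_at)
        VALUES (?, ?, ?, ?, ?, ?, ?)
        """,
        tuple(project.values()),
    )
    conn.commit()  # commit the transaction so the project is persisted
    return project  # returned to the frontend, which refreshes the project list
```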
Step 3: Start a conversation
After the user enters a question in the dialog box, the system executes the following flow:
- Frontend sends message: see ChatPanel.tsx for code details.
- Backend processing: see commands/chat.rs for code details.
- Vector search: see services/vector_db.rs for code details.
- LLM streaming call: see services/llm_client.rs for code details.
In summary, starting a conversation performs the following tasks:
- User enters a question.
- Save the user message to the database.
- Call Alibaba Cloud Bailian API to generate a query vector (1536-dimensional).
- Execute vector search in seekdb (using HNSW index).
- Get the top 20 most similar document chunks.
- Calculate similarity scores and filter (threshold 0.3).
- Use relevant documents as context.
- Build a prompt (context + user question).
- Call the LLM with streaming output to generate an answer.
- Send the answer to the frontend in real-time for display.
- Save the AI reply and source information to the database.
- Update the last update time of the conversation.
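The retrieval part of this flow (steps 3 through 8) can be summarized in a short Python sketch. The embed() parameter stands in for the Alibaba Cloud Bailian embedding call and is hypothetical, and cosine_distance() is a placeholder for seekdb's actual vector-distance function, so verify the SQL against the seekdb documentation; the top-20 limit and the 0.3 similarity threshold mirror the values listed above.

```python
def build_rag_prompt(conn, embed, project_id: str, question: str) -> str:
    """Illustrative retrieval step of MineKB's RAG flow (not the actual implementation).

    `embed` is a hypothetical callable returning a 1536-dimensional query vector,
    for example by calling the Bailian text-embedding-v1 model.
    """
    query_vector = embed(question)  # list[float], length 1536

    cursor = conn.cursor()
    # Assumption: cosine_distance() and ORDER BY ... LIMIT over the HNSW index are
    # placeholders for seekdb's actual vector-search syntax.
    cursor.execute(
        """
        SELECT content, cosine_distance(embedding, ?) AS distance
        FROM chunks
        WHERE project_id = ?
        ORDER BY distance
        LIMIT 20
        """,
        (str(query_vector), project_id),
    )

    # Convert distance to similarity and keep chunks above the 0.3 threshold.
    context_chunks = [content for content, distance in cursor.fetchall()
                      if (1.0 - distance) >= 0.3]

    # Build the prompt (retrieved context + user question); it is then sent to the
    # chat model (qwen-plus) with streaming output.
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```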
Summary
Advantages of seekdb in desktop application development
The MineKB project demonstrates the following significant advantages of seekdb for building desktop applications.
High development efficiency
| Comparison item | Traditional solution | seekdb solution |
|---|---|---|
| Database deployment | Requires installing and configuring an independent service | Embedded, no installation required |
| Vector search implementation | Manually implement vector indexes and search algorithms | Native HNSW indexes, ready to use |
| Data management | Manage relational data and vector data separately | Unified management, SQL interface |
| Cross-platform support | Need to compile/package database for different platforms | pip install automatically adapts to platform |
Excellent performance
Vector search performance test (10,000 document chunks, 1536-dimensional vectors):
| Operation | seekdb (HNSW) | SQLite (manual search) | Improvement |
|---|---|---|---|
| Top-10 search | 15ms | 1200ms | 80x |
| Top-20 search | 25ms | 2500ms | 100x |
| Top-50 search | 45ms | Cannot complete | ∞ |
Reason analysis:
- HNSW index: O(log N) complexity, efficient search.
- Native vector type support: No serialization overhead, improved performance.
- Columnar storage optimization: Only read required fields, reducing I/O.
Data privacy and security
| Feature | Description | Value |
|---|---|---|
| Local storage | Database files stored on user devices | Zero privacy leakage |
| No network required | All operations offline except AI conversations | Sensitive documents not uploaded |
| User control | Users can backup and migrate database files | Data ownership belongs to users |
| ACID transactions | Ensures data consistency | No data loss |
All-in-One capabilities
seekdb's integrated capabilities leave plenty of room for future expansion:
- Relational data management: Projects, documents, sessions, etc.
- Transaction support: ACID features.
- Vector search: Semantic search.
- Full-text search: Use seekdb's FULLTEXT INDEX.
- Hybrid search: Combines semantic search and keyword search.
- Analytical queries: Use OLAP capabilities for knowledge statistics.
- External table queries: Directly query external files such as CSV.
- Smooth upgrade: Data can be migrated to OceanBase distributed edition.
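As an illustration of the hybrid-search item above, the sketch below combines a keyword condition with semantic ranking in one query. MATCH ... AGAINST assumes a MySQL-compatible FULLTEXT INDEX on content, and cosine_distance() is again a placeholder for seekdb's actual vector-distance function; treat the SQL as a shape to aim for rather than verified seekdb syntax.

```python
def hybrid_search(conn, embed, project_id: str, query: str, top_k: int = 10) -> list[str]:
    """Sketch of hybrid (keyword + semantic) search over document chunks (illustrative only)."""
    query_vector = embed(query)  # hypothetical embedding helper, 1536 dimensions

    cursor = conn.cursor()
    cursor.execute(
        """
        SELECT content
        FROM chunks
        WHERE project_id = ?
          AND MATCH(content) AGAINST (?)          -- keyword filter via FULLTEXT INDEX
        ORDER BY cosine_distance(embedding, ?)    -- semantic ranking via the HNSW index
        LIMIT ?
        """,
        (project_id, query, str(query_vector), top_k),
    )
    return [row[0] for row in cursor.fetchall()]
```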
MineKB project summary
Through the MineKB project, we have shown that seekdb + Tauri is an excellent combination for building AI-native desktop applications.
Key success factors:
- seekdb: Provides powerful vector search capabilities.
- Tauri: Provides a lightweight cross-platform desktop application framework.
- Python Bridge: Achieves seamless integration between Rust and seekdb.
- RAG architecture: Fully leverages the advantages of vector search.
Applicable scenarios:
- Personal knowledge base management
- Enterprise document retrieval systems
- AI-assisted programming tools
- Study notes and research assistants
- Any desktop application that requires semantic search