Build a knowledge base desktop application based on seekdb

This tutorial guides you through building a MineKB (Mine Knowledge Base, personal local knowledge base) desktop application using seekdb, demonstrating how to implement intelligent Q&A through vector search and large language models.

Overview

Core features of the application:

  • Multi-project management: Supports creating multiple independent knowledge base projects.
  • Document processing: Supports multiple formats such as TXT, MD, PDF, DOC, DOCX, and RTF, with automatic text extraction and vectorization.
  • Intelligent search: Based on seekdb's vector indexes (HNSW), enabling efficient semantic search.
  • Conversational Q&A: Query the knowledge base through AI conversations to obtain accurate answers based on document content.
  • Local storage: All data is stored locally to protect privacy and security.

Reasons for choosing seekdb:

  • Embedded deployment: Embedded as a library in the application, no independent service required.
  • Native vector support: Built-in vector type and HNSW indexes, improving vector search performance by 10-100x over a manual scan (see the performance test in the summary).
  • All-in-One: Supports transactions, analytics, and vector search simultaneously, meeting all requirements with one database.
  • SQL interface: Standard SQL syntax, developer-friendly.

Prerequisites

Environment requirements

The following environment is required for developing and running the knowledge base desktop application:

  • Operating system: Linux; Ubuntu 20.04 or later recommended.
  • Node.js (for frontend development): 16.x or later; 18.x LTS recommended.
  • Rust (required by Tauri): 1.70 or later; 1.75 or later recommended.
  • Python: 3.x; 3.9 or later recommended.

Technology stack and dependencies

  • Frontend technology stack (see package.json for details)
    • @tauri-apps/api: Tauri frontend API for calling Rust commands
    • @radix-ui/*: Accessible UI component library
    • react-markdown: Markdown rendering
    • react-syntax-highlighter: Code highlighting
    • lucide-react: Icon library
  • Backend technology stack (see Cargo.toml for details)
    • tauri: Tauri framework core
    • tokio: Async runtime
    • reqwest: HTTP client (for calling AI APIs)
    • pdf-extract, docx-rs: Document parsing
    • nalgebra: Vector computation

Python dependencies (see requirements.txt for details)

seekdb==0.0.1.dev4

Install seekdb

Install seekdb and verify the installation:

pip install seekdb -i https://pypi.tuna.tsinghua.edu.cn/simple/
# Verify installation:
python3 -c "import seekdb; print(seekdb.__version__)"

API key configuration

MineKB uses the Alibaba Cloud Bailian API for embedding and LLM services. Register an Alibaba Cloud Bailian account, enable model services, and obtain an API key.

After obtaining the API key, add it to the configuration file src-tauri/config.json:

{
  "api": {
    "dashscope": {
      "api_key": "<sk-your-api-key-here>",
      "base_url": "https://dashscope.aliyuncs.com/api/v1",
      "embedding_model": "text-embedding-v1",
      "chat_model": "qwen-plus"
    }
  },
  "database": {
    "path": "./mine_kb.db",
    "name": "mine_kb"
  }
}
Tip
  • Qwen LLM includes a limited free usage quota. Monitor your quota usage, because usage beyond the free quota incurs charges.
  • This tutorial uses Qwen LLM as the example for building a Q&A bot. You can use another LLM instead. In that case, update the api_key, chat_model, and base_url values in src-tauri/config.json.
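
A missing key or a leftover placeholder in config.json would otherwise only surface at the first API call, so it can help to check the file up front. The following is a minimal sketch, not MineKB's actual loader (the application reads its configuration on the Rust side). The function names validate_config and load_config are hypothetical; the key names are the ones shown in the sample file above.

```python
import json
from pathlib import Path

# Key names taken from the sample config.json shown above.
REQUIRED_DASHSCOPE_KEYS = ("api_key", "base_url", "embedding_model", "chat_model")
REQUIRED_DATABASE_KEYS = ("path", "name")


def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    dashscope = config.get("api", {}).get("dashscope", {})
    database = config.get("database", {})
    for key in REQUIRED_DASHSCOPE_KEYS:
        if not dashscope.get(key):
            problems.append(f"api.dashscope.{key} is missing or empty")
    for key in REQUIRED_DATABASE_KEYS:
        if not database.get(key):
            problems.append(f"database.{key} is missing or empty")
    # The sample file ships with a placeholder wrapped in angle brackets.
    api_key = str(dashscope.get("api_key") or "")
    if api_key.startswith("<") and api_key.endswith(">"):
        problems.append("api.dashscope.api_key still contains the placeholder value")
    return problems


def load_config(path: str) -> dict:
    """Read the JSON file and raise ValueError if validation finds problems."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    problems = validate_config(config)
    if problems:
        raise ValueError("; ".join(problems))
    return config
```

Calling load_config("src-tauri/config.json") before starting the application would then fail early with a readable message if, for example, the placeholder API key was never replaced.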

Run the application locally

Step 1: Build and start

  1. Clone the project and install dependencies.

    # Clone the project
    git clone https://github.com/ob-labs/mine-kb.git
    cd mine-kb

    # Install frontend dependencies
    npm install
    # Install Python dependencies
    pip install seekdb==0.0.1.dev4 -i https://pypi.tuna.tsinghua.edu.cn/simple/
  2. Configure the API key.

    cp src-tauri/config.example.json src-tauri/config.json
    # Edit the configuration file and fill in your API key
    nano src-tauri/config.json
  3. Start the application.

    npm run tauri:dev

When a user starts the MineKB application, the system executes the following initialization flow in sequence:

  • Application initialization (see src-tauri/src/main.rs for code details)
    • Initialize the logging system
    • Determine the application data directory
    • Load the configuration file
    • Initialize the Python environment
    • Initialize the seekdb database
    • Initialize the database schema
    • Create application state
    • Start the Tauri application
  • Frontend initialization (see src/main.tsx for code details)
    • Mount the React application
    • Call the list_projects command to get the project list
    • Render the project panel and conversation panel
    • Wait for user operations

Step 2: Create a knowledge base

We recommend using the seekdb documentation as test material.

After the user clicks the Create Project button, the system executes the following flow:

  • Frontend interaction implementation
    • See ProjectPanel.tsx for code details
  • Backend processing implementation
    • See commands/projects.rs for code details
  • Database operations
    • See services/project_service.rs for code details
  • Database layer (seekdb_adapter.rs → Python Bridge → seekdb), code as follows:
    # Python bridge receives command
    {
      "command": "execute",
      "params": {
        "sql": "INSERT INTO projects (...) VALUES (?, ?, ?, ?, ?, ?, ?)",
        "values": ["uuid-here", "My Project", "Description", "active", 0, "2025-11-05T...", "2025-11-05T..."]
      }
    }

    # Convert to seekdb SQL
    cursor.execute("""
        INSERT INTO projects (id, name, description, status, document_count, created_at, updated_at)
        VALUES ('uuid-here', 'My Project', 'Description', 'active', 0, '2025-11-05T...', '2025-11-05T...')
    """)
    conn.commit()

    # Return success response
    {
      "status": "success",
      "data": null
    }
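
The request and response shapes above suggest a small dispatcher on the Python side. The sketch below shows one way such a handler could look; it is an illustration under stated assumptions, not the bridge shipped with MineKB. The standard-library sqlite3 module stands in for the seekdb connection so that the example runs anywhere, and it is only assumed that the seekdb connection offers a comparable cursor/execute/commit interface. The function name handle_command and the shape of the error response are hypothetical. Unlike the snippet above, which shows the values inlined into the SQL text, this sketch lets the driver bind them to the ? placeholders; whether seekdb's Python client accepts that placeholder style is not confirmed here.

```python
import json
import sqlite3  # stand-in only; MineKB talks to a seekdb connection here


def handle_command(conn, request_json: str) -> str:
    """Execute one JSON command and return a JSON response string."""
    try:
        request = json.loads(request_json)
        if request.get("command") != "execute":
            raise ValueError(f"unsupported command: {request.get('command')!r}")
        params = request["params"]
        cursor = conn.cursor()
        # Pass the values separately so the driver binds them to the placeholders.
        cursor.execute(params["sql"], params.get("values", []))
        conn.commit()
        return json.dumps({"status": "success", "data": None})
    except Exception as exc:
        # Hypothetical error shape; the source only documents the success case.
        conn.rollback()
        return json.dumps({"status": "error", "message": str(exc)})
```

Binding values rather than concatenating them into the statement avoids quoting problems with user-supplied text such as project names and descriptions.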

In summary, creating a knowledge base performs the following tasks:

  1. Generate a unique project ID (UUID v4).
  2. Validate the project name (non-empty, no duplicates).
  3. Initialize the project status as Active.
  4. Record creation time and update time.
  5. Write project information to the projects table in seekdb.
  6. Commit the transaction to ensure data persistence.
  7. Return project information to the frontend.
  8. Frontend updates the project list and displays the new project.
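
Steps 1 through 4 of this list can be sketched as a single function that validates the name and assembles the row to be written. In MineKB this logic lives in the Rust service layer; the Python version below is only an illustration. The function name new_project_row is hypothetical, and passing existing names as a set is a simplification, since the real check would presumably query the projects table.

```python
import uuid
from datetime import datetime, timezone


def new_project_row(name: str, description: str, existing_names: set[str]) -> dict:
    """Validate the project name and build the row for the projects table."""
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("project name must not be empty")
    if cleaned in existing_names:
        raise ValueError(f"project name already exists: {cleaned}")
    # Creation time and update time start out identical.
    now = datetime.now(timezone.utc).isoformat()
    return {
        "id": str(uuid.uuid4()),   # step 1: unique project ID (UUID v4)
        "name": cleaned,           # step 2: validated name
        "description": description,
        "status": "active",        # step 3: initial status
        "document_count": 0,
        "created_at": now,         # step 4: timestamps
        "updated_at": now,
    }
```

The returned dictionary has the same columns as the INSERT statement shown earlier, so its values can be passed to the database layer in that order.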

Step 3: Start a conversation

After the user enters a question in the dialog box, the system executes the following flow:

  • Frontend sends message
    • See ChatPanel.tsx for code details
  • Backend processing
    • See commands/chat.rs for code details
  • Vector search
    • See services/vector_db.rs for code details
  • LLM streaming call
    • See services/llm_client.rs for code details

In summary, starting a conversation performs the following tasks:

  1. User enters a question.
  2. Save the user message to the database.
  3. Call the Alibaba Cloud Bailian API to generate a 1536-dimensional query vector.
  4. Execute vector search in seekdb (using HNSW index).
  5. Get the top 20 most similar document chunks.
  6. Calculate similarity scores and filter (threshold 0.3).
  7. Use relevant documents as context.
  8. Build a prompt (context + user question).
  9. Call the LLM in streaming mode to generate an answer.
  10. Send the answer to the frontend in real-time for display.
  11. Save the AI reply and source information to the database.
  12. Update the last update time of the conversation.
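
Steps 5 through 8 (keep the top 20 chunks, drop those below the 0.3 threshold, and assemble the prompt) can be sketched in a few lines. In MineKB the top-k search itself is done by seekdb through the HNSW index; the sketch scores chunks in plain Python only to make the filtering logic visible. Cosine similarity is an assumption, since the source does not name the metric, and the function names and prompt wording are hypothetical.

```python
import math

TOP_K = 20            # step 5: number of chunks to retrieve
MIN_SIMILARITY = 0.3  # step 6: similarity threshold


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_context(query_vec, chunks, top_k=TOP_K, min_similarity=MIN_SIMILARITY):
    """chunks is a list of (text, vector); returns [(score, text)], best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Take the top_k candidates first, then discard weak matches.
    return [(score, text) for score, text in scored[:top_k] if score >= min_similarity]


def build_prompt(question, context):
    """Combine the retrieved chunks and the user question into one prompt."""
    sources = "\n\n".join(f"[{i}] {text}" for i, (_, text) in enumerate(context, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{sources}\n\n"
        f"Question: {question}"
    )
```

Numbering the chunks in the prompt makes it possible to map the answer back to its sources, which is the information step 11 stores alongside the AI reply.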

Summary

Advantages of seekdb in desktop application development

The MineKB project shows the following advantages of seekdb for building desktop applications.

High development efficiency

| Comparison item | Traditional solution | seekdb solution |
|---|---|---|
| Database deployment | Requires installing and configuring an independent service | Embedded, no installation required |
| Vector search implementation | Manually implement vector indexes and search algorithms | Native HNSW indexes, ready to use |
| Data management | Manage relational data and vector data separately | Unified management, SQL interface |
| Cross-platform support | Need to compile/package database for different platforms | pip install automatically adapts to platform |

Excellent performance

Vector search performance test (10,000 document chunks, 1536-dimensional vectors):

| Operation | seekdb (HNSW) | SQLite (manual search) | Improvement |
|---|---|---|---|
| Top-10 search | 15 ms | 1200 ms | 80x |
| Top-20 search | 25 ms | 2500 ms | 100x |
| Top-50 search | 45 ms | Cannot complete | - |

Reason analysis:

  • HNSW index: Approximately O(log N) search complexity, so a query does not have to scan every vector.
  • Native vector type support: No serialization overhead, improved performance.
  • Columnar storage optimization: Only read required fields, reducing I/O.
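
For comparison, the sketch below shows what a manual search over a database without vector support typically involves: vectors are stored as serialized BLOBs, and every query decodes and scores every row. This is an assumption about what the "SQLite (manual search)" column measures, since the source does not describe the test setup. The table name chunks, the column name embedding, and the function names are hypothetical.

```python
import heapq
import math
import sqlite3
import struct


def pack_vector(vec):
    """Serialize a list of floats into a little-endian float32 BLOB."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob):
    """Decode a float32 BLOB back into a tuple of floats."""
    return struct.unpack(f"<{len(blob) // 4}f", blob)


def brute_force_top_k(conn, query_vec, k):
    """Scan every row, decode its vector, and keep the k best cosine scores."""
    q_norm = math.sqrt(sum(x * x for x in query_vec))
    scored = []
    for chunk_id, blob in conn.execute("SELECT id, embedding FROM chunks"):
        vec = unpack_vector(blob)  # deserialization cost paid on every query
        v_norm = math.sqrt(sum(x * x for x in vec))
        if q_norm == 0.0 or v_norm == 0.0:
            continue
        dot = sum(a * b for a, b in zip(query_vec, vec))
        scored.append((dot / (q_norm * v_norm), chunk_id))
    # Returns (score, id) pairs sorted from best to worst.
    return heapq.nlargest(k, scored)
```

With 10,000 chunks of 1536 dimensions, each query in this scheme performs 10,000 × 1536 = 15,360,000 multiply-add operations for the dot products alone, plus the decoding of every BLOB. An index-based search visits only a small fraction of the vectors, which is consistent with the gap reported in the table.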

Data privacy and security

| Feature | Description | Value |
|---|---|---|
| Local storage | Database files stored on user devices | Zero privacy leakage |
| No network required | All operations offline except AI conversations | Sensitive documents not uploaded |
| User control | Users can backup and migrate database files | Data ownership belongs to users |
| ACID transactions | Ensures data consistency | No data loss |

All-in-One capabilities

seekdb's integrated capabilities leave room for future expansion within a single database:

  • Relational data management: Projects, documents, sessions, etc.
  • Transaction support: ACID features.
  • Vector search: Semantic search.
  • Full-text search: Use seekdb's FULLTEXT INDEX.
  • Hybrid search: Combines semantic search and keyword search.
  • Analytical queries: Use OLAP capabilities for knowledge statistics.
  • External table queries: Directly query external files such as CSV.
  • Smooth upgrade: Data can be migrated to OceanBase distributed edition.

MineKB project summary

The MineKB project demonstrates that seekdb + Tauri is an effective combination for building AI-native desktop applications.

Key success factors:

  1. seekdb: Provides powerful vector search capabilities.
  2. Tauri: Provides a lightweight cross-platform desktop application framework.
  3. Python Bridge: Achieves seamless integration between Rust and seekdb.
  4. RAG architecture: Fully leverages the advantages of vector search.

Applicable scenarios:

  • Personal knowledge base management
  • Enterprise document retrieval systems
  • AI-assisted programming tools
  • Study notes and research assistants
  • Any desktop application that requires semantic search