Fork a table
FORK TABLE is a table-level feature in seekdb that lets you create an isolated copy of a source table at a consistent snapshot. The snapshot timestamp is selected by the system. The forked (destination) table is logically equivalent to the source table as of that snapshot, and can be read and written independently.
This topic introduces what FORK TABLE is, when to use it, and a quick example. For statement syntax, parameters, and the complete set of restrictions, see the FORK TABLE SQL reference.
In seekdb V1.1.0, FORK TABLE is provided as an experimental feature and is not recommended for production use.
Overview
During development and iteration, you may need to test changes against the same data baseline, for example, data fixes, feature engineering, or changes to indexing and metadata strategies. Traditional approaches often rely on a full data copy (such as CREATE TABLE ... AS SELECT ... or export/import), which can be time-consuming and resource-intensive for large tables.
FORK TABLE is designed to reduce the cost of creating a data branch while preserving clear consistency and isolation semantics.
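To make the contrast concrete, the sketch below places the two approaches side by side. The table name `orders` and both destination names are hypothetical. The first statement is the traditional full copy mentioned above; the second follows the FORK TABLE form used in the example later in this topic.

```sql
-- Traditional approach: materializes a full copy of every row,
-- so the cost grows with the size of the source table.
CREATE TABLE orders_copy AS SELECT * FROM orders;

-- Fork: creates an isolated copy of `orders` at a consistent,
-- system-selected snapshot. The destination must not already exist.
FORK TABLE orders TO orders_fork;
```

Both statements produce a table that can be read and written independently of `orders`. The difference lies in how much work is needed to create it.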
Use cases
FORK TABLE is commonly used in the following scenarios:
- Data versioning and branching: Create multiple branches from the same baseline to validate data changes, feature engineering, and index/metadata strategy changes in parallel. You can roll back quickly by abandoning a branch and returning to the original table, or promote a branch to become the new default (the exact switch-over procedure depends on your operational workflow).
- A/B testing and sandbox validation: Create an isolated copy from a production snapshot to safely validate different prompt strategies, AI-generated code, or LLM inference behavior without impacting the online table.
- Vibe coding and synthetic data iteration: Use the forked copy to generate or modify schema and data, quickly iterate on synthetic datasets, and validate whether the generated data is reasonable.
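The branching scenario can be sketched as follows. This is only an illustration: the table `features`, its `score` column, and the branch names are hypothetical, and the final RENAME TABLE step assumes MySQL-style rename syntax is available in your deployment. It is one possible switch-over, not a prescribed procedure.

```sql
-- Create two branches from the same baseline table.
-- If `features` is not modified between the two statements,
-- both branches start from the same data.
FORK TABLE features TO features_exp_a;
FORK TABLE features TO features_exp_b;

-- Iterate on each branch independently; `features` is not affected.
UPDATE features_exp_a SET score = score * 2 WHERE score IS NOT NULL;
DELETE FROM features_exp_b WHERE score IS NULL;

-- Roll back: discard a branch that did not work out.
DROP TABLE features_exp_b;

-- Promote (assumed switch-over): swap the chosen branch in as the default.
-- Run this only after the fork has fully completed, because DDL on the
-- involved tables is mutually exclusive with an in-progress fork.
RENAME TABLE features TO features_old, features_exp_a TO features;
```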
How it works
FORK TABLE provides the following semantics:
- Snapshot consistency: The destination table reflects a consistent snapshot of the source table at the fork time. Subsequent changes to the source table do not affect the destination table.
- Read/write isolation: The destination table is independent of the source table. Writes to either table do not affect the other.
- Progressive availability: The destination table can become available before the system finishes background work. This does not change the user-visible consistency or isolation semantics.
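The snapshot-consistency guarantee can be checked from the source side with a short script. The `accounts` table and its values are hypothetical, and the results noted in the comments are what the semantics above imply rather than captured output.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE accounts (
  id INT PRIMARY KEY,
  balance INT
);
INSERT INTO accounts VALUES (1, 100), (2, 200);

-- Take the fork; the snapshot contains both rows.
FORK TABLE accounts TO accounts_fork;

-- Change the source after the fork.
UPDATE accounts SET balance = 0 WHERE id = 1;
DELETE FROM accounts WHERE id = 2;

-- Per the snapshot semantics, the fork should still hold (1, 100) and (2, 200).
SELECT * FROM accounts_fork ORDER BY id;

-- The source should reflect only its own changes: (1, 0).
SELECT * FROM accounts ORDER BY id;
```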
Limitations and considerations
In seekdb V1.1.0, FORK TABLE has the following limitations and considerations:
- Mutual exclusion with DDL: During a fork operation, DDL on the source or destination table is mutually exclusive with the fork. Avoid running CREATE/ALTER/DROP/RENAME concurrently on the involved tables.
- The destination table must not exist: If the destination table already exists, the statement returns an error. Drop the existing table first.
- Unsupported object types: Fork is not supported for internal tables, temporary tables, materialized views, tables that have triggers or foreign keys, and tables in the recycle bin.
- Index-related limitations:
  - Semantic indexes, IVF indexes, and spatial indexes are not supported.
  - Fork is not supported for partitioned tables that have global indexes.
  - Indexes cannot be built as part of the fork build process. If you need to change indexing strategy, create or adjust indexes on the destination table after the fork completes.
- If the source table contains an HNSW vector index, the fork may increase memory usage. Run a small-scale validation first and plan capacity accordingly.
- Storage format limitation: Fork is not supported for column-store tables.
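Two of these constraints shape how a fork script is usually written: the destination must not exist, and index changes happen afterwards. The sketch below shows one way to respect both. The table `docs`, its `title` column, and the index name are hypothetical, and the CREATE INDEX statement assumes MySQL-style syntax.

```sql
-- The destination must not exist, so remove any leftover table first.
DROP TABLE IF EXISTS docs_fork;

-- Fork the source table.
FORK TABLE docs TO docs_fork;

-- Indexes cannot be built as part of the fork itself. Add or adjust them on
-- the destination afterwards, once the fork has completed, because DDL on
-- the involved tables is mutually exclusive with an in-progress fork.
CREATE INDEX idx_docs_fork_title ON docs_fork (title);
```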
Example: Create a forked table and verify isolation
This example shows a minimal end-to-end workflow: create a table, fork it, verify schema and row counts, and validate write isolation.
- Create the source table and insert one row.
    DROP TABLE IF EXISTS t1;
    DROP TABLE IF EXISTS t1_fork;
    CREATE TABLE t1 (
      c1 INT PRIMARY KEY,
      c2 INT
    );
    INSERT INTO t1 VALUES (1, 10);
- Fork the table.
    FORK TABLE t1 TO t1_fork;
- Verify that the forked table exists and matches the baseline.
    SHOW TABLES LIKE 't1%';
    SHOW CREATE TABLE t1_fork;
    SELECT COUNT(*) FROM t1;
    SELECT COUNT(*) FROM t1_fork;
- Validate write isolation: writing to the destination table does not affect the source table.
    INSERT INTO t1_fork VALUES (2, 200);
    SELECT * FROM t1_fork;
    SELECT * FROM t1;
FAQ
Q1: After the fork completes, do future changes to the source table affect the destination table?
No. The destination table is created from a consistent snapshot at fork time. Subsequent changes to the source table do not affect the destination table.
Q2: Do writes to the destination table affect the source table?
No. The destination table is independent of the source table. Writes to either table do not affect the other.
Q3: Why can the destination table be used before the fork fully finishes?
The system prioritizes making the destination table available. Background work may continue, but the user-visible snapshot and isolation semantics remain consistent. In addition, FORK TABLE tries to reuse existing data organization and storage structures when creating the destination table, so creation usually stays fast even for large tables.
Q4: How is FORK TABLE different from a traditional table copy?
FORK TABLE is designed to avoid a full data copy by reusing existing data organization and storage structures whenever possible, while preserving snapshot consistency and isolation semantics.