Skip to main content

Partial JSON data updates

seekdb supports partial JSON data updates (JSON Partial Update). When only specific fields in a JSON document need to be modified, this feature allows you to update only the changed portions without having to update the entire JSON document.

Limitations

Enable or disable JSON Partial Update

The JSON Partial Update feature in seekdb is disabled by default. It is controlled by the system variable log_row_value_options. For more information, see log_row_value_options.

Here are some examples:

  • Enable the JSON Partial Update feature.

    • Session level:
    SET log_row_value_options="partial_json";
    • Global level:
    SET GLOBAL log_row_value_options="partial_json";
  • Disable the JSON Partial Update feature.

    • Session level:
    SET log_row_value_options="";
    • Global level:
    SET GLOBAL log_row_value_options="";
  • Query the value of log_row_value_options.

    SHOW VARIABLES LIKE 'log_row_value_options';

    The result is as follows:

    +-----------------------+-------+
    | Variable_name | Value |
    +-----------------------+-------+
    | log_row_value_options | |
    +-----------------------+-------+
    1 row in set

JSON expressions for partial updates

In addition to the JSON Partial Update feature switch log_row_value_options, you must use specific expressions to update JSON documents to trigger JSON Partial Update.

The following JSON expressions in seekdb currently support partial updates:

  • json_set or json_replace: updates the value of a JSON field.
  • json_remove: deletes a JSON field.
tip
  1. Ensure that the left operand of the SET assignment clause and the first parameter of the JSON expression are the same and both are JSON columns in the table. For example, in j = json_replace(j, '$.name', 'ab'), the parameter on the left side of the equals sign and the first parameter of the JSON expression json_replace on the right side are both j.
  2. JSON Partial Update is only triggered when the current JSON column data is stored as outrow. Whether data is stored as outrow or inrow is controlled by the lob_inrow_threshold parameter when creating the table. lob_inrow_threshold is used to configure the INROW threshold. When the LOB data size exceeds this threshold, it is stored as OUTROW in the LOB Meta table. The default value is 4 KB.

Examples:

  1. Create a table named json_test.

    CREATE TABLE json_test(pk INT PRIMARY KEY, j JSON);
  2. Insert data.

    INSERT INTO json_test VALUES(1, CONCAT('{"name": "John", "content": "', repeat('x',8), '"}'));

    The result is as follows:

    Query OK, 1 row affected
  3. Query the data in the JSON column j.

    SELECT j FROM json_test;

    The result is as follows:

    +-----------------------------------------+
    | j |
    +-----------------------------------------+
    | {"name": "John", "content": "xxxxxxxx"} |
    +-----------------------------------------+
    1 row in set
  4. Use json_repalce to update the value of the name field in the JSON column.

    UPDATE json_test SET j = json_replace(j, '$.name', 'ab') WHERE pk = 1;

    Result:

    Query OK, 1 row affected
    Rows matched: 1 Changed: 1 Warnings: 0
  5. Query the modified data in JSON column j.

    SELECT j FROM json_test;

    Result:

    +---------------------------------------+
    | j |
    +---------------------------------------+
    | {"name": "ab", "content": "xxxxxxxx"} |
    +---------------------------------------+
    1 row in set
  6. Use json_set to update the value of the name field in the JSON column.

    UPDATE json_test SET j = json_set(j, '$.name', 'cd') WHERE pk = 1;

    Result:

    Query OK, 1 row affected
    Rows matched: 1 Changed: 1 Warnings: 0
  7. Query the modified data in JSON column j.

    SELECT j FROM json_test;

    Result:

    +---------------------------------------+
    | j |
    +---------------------------------------+
    | {"name": "cd", "content": "xxxxxxxx"} |
    +---------------------------------------+
    1 row in set
  8. Use json_remove to delete the name field value in the JSON column.

    UPDATE json_test SET j = json_remove(j, '$.name') WHERE pk = 1;

    Result:

    Query OK, 1 row affected
    Rows matched: 1 Changed: 1 Warnings: 0
  9. Query the modified data in JSON column j.

    SELECT j FROM json_test;

    Result:

    +-------------------------+
    | j |
    +-------------------------+
    | {"content": "xxxxxxxx"} |
    +-------------------------+
    1 row in set

Granularity of updates

JSON data in seekdb is stored based on LOB storage, and LOBs in seekdb are stored in chunks at the underlying level. Therefore, the minimum data amount for each partial update is one LOB chunk. The smaller the LOB chunk, the smaller the amount of data written. A DDL syntax is provided to set the LOB chunk size, which can be specified when creating a column.

Example:

CREATE TABLE json_test(pk INT PRIMARY KEY, j JSON CHUNK '4k');

The chunk size cannot be infinitely small, as too small a size will affect the performance of SELECT, INSERT, and DELETE operations. It is generally recommended to set it based on the average field size of JSON documents. If most fields are very small, you can set it to 1K. To optimize LOB type reads, seekdb stores data smaller than 4K directly as INROW, in which case partial update will not be performed. Partial Update is mainly intended to improve the performance of updating large documents; for small documents, full updates actually perform better.

Rebuild

JSON Partial Update does not impose restrictions on the data length before and after updating a JSON column. When the length of the new value is less than or equal to the length of the old value, the data at the original location is directly replaced with the new data. When the length of the new value is greater than the length of the old value, the new data is appended at the end. seekdb sets a threshold: when the length of the appended data exceeds 30% of the original data length, a rebuild is triggered. In this case, Partial Update is not performed; instead, a full overwrite is performed.

You can use the JSON_STORAGE_SIZE expression to get the actual storage length of JSON data, and JSON_STORAGE_FREE to get the additional storage overhead.

Example:

  1. Enable JSON Partial Update.

    SET log_row_value_options = "partial_json";
  2. Create a test table named json_test.

    CREATE TABLE json_test(pk INT PRIMARY KEY, j JSON CHUNK '1K');
  3. Insert a row of data into the json_test table.

    INSERT INTO json_test VALUES(10 , json_object('name', 'zero', 'age', 100, 'position', 'software engineer', 'profile', repeat('x', 4096), 'like', json_array('a', 'b', 'c'), 'tags', json_array('sql boy', 'football', 'summer', 1), 'money' , json_object('RMB', 10000, 'Dollers', 20000, 'BTC', 100), 'nickname', 'noone'));

    Result:

    Query OK, 1 row affected
  4. Use JSON_STORAGE_SIZE to query the storage size of the JSON column (actual occupied storage space) and JSON_STORAGE_FREE to estimate the storage space that can be freed from the JSON column.

    SELECT JSON_STORAGE_SIZE(j), JSON_STORAGE_FREE(j) FROM json_test WHERE pk = 10;

    Result:

    +----------------------+----------------------+
    | JSON_STORAGE_SIZE(j) | JSON_STORAGE_FREE(j) |
    +----------------------+----------------------+
    | 4335 | 0 |
    +----------------------+----------------------+
    1 row in set

    Since no partial update has been performed, the value of JSON_STORAGE_FREE is 0.

  5. Use json_replace to update the value of the position field in the JSON column, where the length of the new value is less than the length of the old value.

    UPDATE json_test SET j = json_replace(j, '$.position', 'software enginee') WHERE pk = 10;

    Result:

    Query OK, 1 row affected
    Rows matched: 1 Changed: 1 Warnings: 0
  6. Again, use JSON_STORAGE_SIZE to query the storage size of the JSON column and JSON_STORAGE_FREE to estimate the storage space that can be freed from the JSON column.

    SELECT JSON_STORAGE_SIZE(j), JSON_STORAGE_FREE(j) FROM json_test WHERE pk = 10;

    Result:

    +----------------------+----------------------+
    | JSON_STORAGE_SIZE(j) | JSON_STORAGE_FREE(j) |
    +----------------------+----------------------+
    | 4335 | 1 |
    +----------------------+----------------------+
    1 row in set

    After the JSON column data is updated, since the new data is one byte less than the old data, the JSON_STORAGE_FREE result is 1.

  7. Use json_replace to update the value of the position field in the JSON column, where the length of the new value is greater than the length of the old value.

    UPDATE json_test SET j = json_replace(j, '$.position', 'software engineer') WHERE pk = 10;

    Result:

    Query OK, 1 row affected
    Rows matched: 1 Changed: 1 Warnings: 0
  8. Use JSON_STORAGE_SIZE again to query the JSON column storage size, and JSON_STORAGE_FREE to estimate the storage space that can be freed from the JSON column.

    SELECT JSON_STORAGE_SIZE(j), JSON_STORAGE_FREE(j) FROM json_test WHERE pk = 10;

    Result:

    +----------------------+----------------------+
    | JSON_STORAGE_SIZE(j) | JSON_STORAGE_FREE(j) |
    +----------------------+----------------------+
    | 4355 | 19 |
    +----------------------+----------------------+
    1 row in set

    After appending new data to the JSON column, the length of JSON_STORAGE_FREE is 19, indicating that 19 bytes can be freed after a rebuild.