WINDOW clause
Description
In a SELECT query, the WINDOW clause is used to define named windows with specific window specifications. A "window" is also referred to as a Frame. SeekDB supports both ROWS and RANGE Frame semantics. The ROWS Frame is based on physical row offsets, while the RANGE Frame is based on logical value offsets.
To reference a window specification, use an analytic function followed by OVER window_name. Functions used with the OVER clause are called analytic functions. For more information about analytic functions, see the Analytic Functions section in Overview of Functions.
Syntax
window_clause:
WINDOW window_definition_list
window_definition_list:
window_definition [, window_definition ...]
window_definition:
window_name AS ([PARTITION BY expression_list] [ORDER BY order_by_condition_list] [win_window])
expression_list:
expression [, expression ...]
order_by_condition_list:
order_by_condition [, order_by_condition ...]
order_by_condition:
expression [ASC | DESC]
win_window:
{ROWS | RANGE} BETWEEN win_bounding AND win_bounding
| {ROWS | RANGE} win_bounding
win_bounding:
CURRENT ROW
| win_interval {PRECEDING | FOLLOWING}
win_interval:
expression
| INTERVAL expression date_unit
date_unit:
DAY
| DAY_HOUR
| DAY_MICROSECOND
| DAY_MINUTE
| DAY_SECOND
| HOUR
| HOUR_MICROSECOND
| HOUR_MINUTE
| HOUR_SECOND
| MICROSECOND
| MINUTE
| MINUTE_MICROSECOND
| MINUTE_SECOND
| MONTH
| QUARTER
| SECOND
| SECOND_MICROSECOND
| WEEK
| YEAR
| YEAR_MONTH
Parameters
| Parameter | Description |
|---|---|
| WINDOW window_definition_list | Specifies the window definitions for analytic functions. window_definition_list is a list of window definitions. |
| window_definition | Defines the window, including the window name, grouping method, sorting method, and window range. For more information, see window_definition below. |
window_definition
-
window_name: Specifies the name of the window, used to identify the defined window. -
PARTITION BY expression_list: Optional. Specifies the grouping keys for the analytic function. It groups the input data based on the specified expressions and applies the analytic function to each group. This allows for independent calculations within each group instead of across the entire dataset.expression: Represents a column or expression.
-
ORDER BY order_by_condition_list: Optional. Specifies the sorting rules for the grouped result set, defining the sorting method within the window.expression [ASC | DESC]: Represents the sorting expression for the window definition.ASC | DESCare optional parameters that specify the sorting order.ASCindicates ascending (default), andDESCindicates descending.
-
win_window: Optional. Specifies the window range definition, including the boundary conditions forROWSorRANGE. For more information, see win_window below.
win_window
-
{ROWS | RANGE} BETWEEN win_bounding AND win_bounding: Specifies the window boundaries determined by two boundary values. Details are as follows:-
ROWS | RANGE: Specifies the Frame semantics.ROWS: Represents a physical window. After sorting by theORDER BYclause, it calculates the data from the precedingNrows and the followingNrows.RANGE: Represents a logical window. It specifies the range of values corresponding to the current row.
-
win_bounding: Specifies the boundary conditions of the window range, which can be the current row or a relative offset from the current row. For more information, see win_bounding below.
-
-
{ROWS | RANGE} win_bounding: Specifies the window boundaries determined by a single boundary value.
win_bounding
-
CURRENT ROW: Specifies the current row as the window boundary. -
win_interval {PRECEDING | FOLLOWING}: Specifies the relative position of the current row when calculating. It defines the start and end positions of the window. Details are as follows:-
win_interval: Defines the window interval size, which can be a fixed value, unspecified value, decimal, unlimited, or a date unit.-
expression: Specifies the starting position of the window. The value can be an integer representing the row offset or a specific keyword, as follows:UNBOUNDED PRECEDING: Specifies the window start position as unbounded, meaning it begins from the first row.UNBOUNDED FOLLOWING: Specifies the window end position as unbounded, meaning it ends at the last row.CURRENT ROW: Specifies the window start position as the current row.
-
INTERVAL expression date_unit: Defines the time interval, including an expression and a time unit.tip-
In the current version of SeekDB, the
INTERVALclause can only be used withRANGE. -
When using the
ORDER BYclause to define the sorting method for data in the window, ensure that the sorting expression is in date format.
expression: Specifies the interval expression. The value can be any integer representing the number of intervals.date_unit: Specifies the unit of the interval. The value can beYEAR,MONTH,DAY,HOUR, etc.
-
-
-
PRECEDING: Uses the precedingNrows. -
FOLLOWING: Uses the followingNrows.
-
Example
Query the data in the test_tbl1 table and calculate the sum of col5 in each row under different window conditions.
-
Create the
test_tbl1table.CREATE TABLE test_tbl1 (col1 INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
col2 INT,
col3 VARCHAR(50),
col4 DATE,
col5 INT); -
Insert test data.
INSERT INTO test_tbl1(col2, col3, col4, col5)
VALUES(100, 'A1', '2017-01-01', 120),
(100, 'A1', '2018-01-01', 20),
(100, 'A1', '2019-01-01', 100),
(100, 'A1', '2020-01-01', 40),
(100, 'B1', '2021-01-01', 80),
(100, 'B1', '2022-01-01', 60),
(200, 'B1', '2017-01-01', 70),
(200, 'B1', '2018-01-01', 50),
(200, 'C1', '2019-01-01', 90),
(200, 'C1', '2020-01-01', 30),
(200, 'C1', '2021-01-01', 110),
(200, 'C1', '2022-01-01', 10); -
Select the
col1,col3, andcol5columns from thetest_tbl1table and use the analytic function to calculate the sum ofcol5under different window conditions. Finally, sort the result by the values in thecol1column. The window conditions include differentPARTITION BYandORDER BYclauses, as well asROWSandRANGEclauses, as follows:my_window_1has no specific condition. Therefore, the entire table is considered a window.sum1calculates the sum ofcol5in the entire table.my_window_2partitions the data by the values in thecol3column.sum2calculates the sum ofcol5in each partition.my_window_3sorts the data by the values in thecol5column.sum3calculates the sum ofcol5in the current row and all preceding rows.my_window_4sorts the data by the values in thecol5column in thecol3partition.sum4calculates the sum ofcol5in each partition sorted by the values in thecol5column.my_window_5sorts the data by the values in thecol5column in thecol3partition and does not specify the window boundary.sum5calculates the sum ofcol5in each partition sorted by the values in thecol5column.my_window_6sorts the data by the values in thecol5column in thecol3partition and sets the current row as the window boundary.sum6calculates the sum ofcol5in each partition sorted by the values in thecol5column.my_window_7sorts the data by the values in thecol5column in thecol3partition and sets the current row and the preceding row as the window boundary.sum7calculates the sum ofcol5in each partition sorted by the values in thecol5column.my_window_8sorts the data by the values in thecol5column in thecol3partition and sets the preceding row and the following row as the window boundary.sum9calculates the sum ofcol5in each partition sorted by the values in thecol5column.my_window_9sorts the data by the values in thecol4column in thecol3partition and sets the preceding year and the following year as the window boundary.sum10calculates the sum ofcol5in each partition sorted by the values in thecol4column.
SELECT col1, col3, col4, col5,
SUM(col5) OVER my_window_1 AS sum1,
SUM(col5) OVER my_window_2 AS sum2,
SUM(col5) OVER my_window_3 AS sum3,
SUM(col5) OVER my_window_4 AS sum4,
SUM(col5) OVER my_window_5 AS sum5,
SUM(col5) OVER my_window_6 AS sum6,
SUM(col5) OVER my_window_7 AS sum7,
SUM(col5) OVER my_window_8 AS sum8,
SUM(col5) OVER my_window_9 AS sum9
FROM test_tbl1
WINDOW
my_window_1 AS (),
my_window_2 AS (PARTITION BY col3),
my_window_3 AS (ORDER BY col5),
my_window_4 AS (PARTITION BY col3 ORDER BY col5),
my_window_5 AS (PARTITION BY col3 ORDER BY col5 ROWS UNBOUNDED PRECEDING),
my_window_6 AS (PARTITION BY col3 ORDER BY col5 ROWS CURRENT ROW),
my_window_7 AS (PARTITION BY col3 ORDER BY col5 ROWS 1 PRECEDING),
my_window_8 AS (PARTITION BY col3 ORDER BY col5 ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING),
my_window_9 AS (PARTITION BY col3 ORDER BY col4 RANGE BETWEEN INTERVAL 1 YEAR PRECEDING AND INTERVAL 1 YEAR FOLLOWING)
ORDER BY col1;The result set is as follows:
+------+------+------------+------+------+------+------+------+------+------+------+------+------+
| col1 | col3 | col4 | col5 | sum1 | sum2 | sum3 | sum4 | sum5 | sum6 | sum7 | sum8 | sum9 |
+------+------+------------+------+------+------+------+------+------+------+------+------+------+
| 1 | A1 | 2017-01-01 | 10 | 780 | 100 | 10 | 10 | 10 | 10 | 10 | 30 | 30 |
| 2 | A1 | 2018-01-01 | 20 | 780 | 100 | 30 | 30 | 30 | 20 | 30 | 60 | 60 |
| 3 | A1 | 2019-01-01 | 30 | 780 | 100 | 60 | 60 | 60 | 30 | 50 | 90 | 90 |
| 4 | A1 | 2020-01-01 | 40 | 780 | 100 | 100 | 100 | 100 | 40 | 70 | 70 | 70 |
| 5 | B1 | 2021-01-01 | 50 | 780 | 260 | 150 | 50 | 50 | 50 | 50 | 110 | 110 |
| 6 | B1 | 2022-01-01 | 60 | 780 | 260 | 210 | 110 | 110 | 60 | 110 | 180 | 110 |
| 7 | B1 | 2017-01-01 | 70 | 780 | 260 | 280 | 180 | 180 | 70 | 130 | 210 | 150 |
| 8 | B1 | 2018-01-01 | 80 | 780 | 260 | 360 | 260 | 260 | 80 | 150 | 150 | 150 |
| 9 | C1 | 2019-01-01 | 90 | 780 | 420 | 450 | 90 | 90 | 90 | 90 | 190 | 190 |
| 10 | C1 | 2020-01-01 | 100 | 780 | 420 | 550 | 190 | 190 | 100 | 190 | 300 | 300 |
| 11 | C1 | 2021-01-01 | 110 | 780 | 420 | 660 | 300 | 300 | 110 | 210 | 330 | 330 |
| 12 | C1 | 2022-01-01 | 120 | 780 | 420 | 780 | 420 | 420 | 120 | 230 | 230 | 230 |
+------+------+------------+------+------+------+------+------+------+------+------+------+------+
12 rows in set (0.212 sec)