Skip to main content

JOIN clause

Description

The JOIN clause is used to join rows from two or more tables based on matching values in related columns between the tables.

Permissions

When you execute a JOIN statement, the current user must have at least the SELECT permission on the tables involved in the JOIN. For more information about seekdb permissions, see Permission classification.

Syntax

table_references:
table_reference [, table_reference ...]

table_reference:
table_factor
| joined_table

table_factor:
table_name [PARTITION (partition_name_list)] [sample_clause [SEED(integer)]] [[AS] table_alias_name][index_hint_list]
| table_subquery [AS] table_alias_name
| (table_references)

partition_name_list:
partition_name [, partition_name ...]

sample_clause:
SAMPLE [BLOCK] [ALL | BASE | INCR] (sample_percent)

index_hint_list:
index_hint [, index_hint ...]

index_hint:
{USE | FORCE | IGNORE} {KEY | INDEX} [FOR {JOIN | ORDER BY | GROUP BY}] (index_name_list)

index_list:
index_name [, index_name ...]

table_subquery:
(select_stmt [sample_clause [SEED(integer)]])

joined_table:
table_reference [NATURAL] [INNER | CROSS] JOIN table_factor [join_condition]
| table_reference outer_join_type JOIN table_factor join_condition

join_condition:
ON expression
| USING (join_column_list)

join_column_list:
column_name [, column_name ...]

outer_join_type:
[NATURAL] {LEFT | RIGHT | FULL} [OUTER]

Parameter description

ParameterDescription
table_referenceThe table reference, which can be a single table, a joined table, or a subquery.
table_factorThe data source table for the query, which can also be a table subquery or a table reference in parentheses. For more information, see table_factor.
joined_tableThe joined table. For more information, see joined_table.

table_factor

  • table_name [PARTITION (partition_name_list)] [sample_clause [SEED(integer)]] [[AS] table_alias_name] [index_hint_list]:specifies the data source table for the query. The syntax is described as follows:

    • table_name: specifies the name of the table or view from which to retrieve data.

    • PARTITION (partition_name_list): an optional clause that specifies the partitions to query.

      • partition_name_list: a list of partition names. partition_name specifies the name of a partition.

      Here is an example:

      Read data from partitions p0 and p1 in the tbl1 table.

      SELECT * FROM tbl1 PARTITION(p0, p1);

      The result is as follows:

      +------+------+------+
      | col1 | col2 | col3 |
      +------+------+------+
      | 1 | A1 | 1 |
      +------+------+------+
      1 row in set (0.011 sec)
    • sample_clause [SEED(integer)]: specifies the sampling rules for the table, that is, the records to scan. The syntax is described as follows:

      • BLOCK: an optional clause that specifies to scan random data blocks. By default, random rows are scanned.

      • ALL | BASE | INCR: an optional clause that specifies the scan method. The syntax is described as follows:

        • ALL: specifies to scan all data. The default value is ALL.
        • BASE: specifies to scan baseline data.
        • INCR: specifies to scan incremental data.
      • sample_percent: specifies the sampling ratio, in percentage (%). You can use the INTEGER or DECIMAL type.

      • SEED(integer): an optional clause that specifies the seed value for random sampling. The value range is [0,4294967295]. If the same sampling seed value is specified, the same result is returned.

    • [AS] table_alias_name: an optional clause that specifies the table alias.

    • index_hint_list: an optional clause that specifies the index hints for the query. index_hint specifies the index used for the query. The syntax is described as follows:

      • USE | FORCE | IGNORE:

        • USE: specifies to use an index for an operation.
        • FORCE: specifies to force an index for an operation.
        • IGNORE: specifies not to use an index for an operation.
      • KEY | INDEX: specifies the index used for searching and sorting in the query.

      • FOR {JOIN | ORDER BY | GROUP BY}: an optional clause that specifies the operations for which the index is used. The syntax is described as follows:

        • FOR JOIN: specifies to use the index for joining.
        • FOR ORDER BY: specifies to use the index for sorting.
        • FOR GROUP BY: specifies to use the index for grouping.
      • index_name_list: specifies the list of index names to use. You can specify one or more index names. Separate multiple index names with commas (,). index_name specifies the name of an index.

  • table_subquery [AS] table_alias_name: specifies a subquery. For more information about subqueries, see Subqueries.

    Here is an example:

    Select the col1 column from the tbl1 table, name the result as a subquery t1, and finally select all columns from t1.

    SELECT t1.* FROM (SELECT col1 FROM tbl1) t1;

    The result is as follows:

    +------+
    | col1 |
    +------+
    | 1 |
    | 2 |
    | 3 |
    +------+
    3 rows in set (0.020 sec)

joined_table

  • table_reference [NATURAL] [INNER | CROSS] JOIN table_factor [join_condition]:represents an inner join. It returns the intersection of matching rows from both tables. In this join, no results are returned if there are no matching rows. Specifically:

    • table_reference: refers to a table reference, which can be a single table, a joined table, or a subquery.

    • NATURAL: an optional keyword indicating a natural join. NATURAL JOIN automatically joins on columns with the same name.

    • INNER | CROSS: optional keywords. Specifically:

      • INNER JOIN: represents an inner join. When no join condition is specified, INNER JOIN is equivalent to using a comma (,), both of which generate a Cartesian product of the specified tables. For more information about inner joins, see INNER JOIN.

        tip

        The comma (,) as an operator has a lower precedence than INNER JOIN, CROSS JOIN, and LEFT JOIN. Mixing commas with other join keywords can lead to incorrect conditions in the ON clause.

      • CROSS JOIN: in SeekDB, CROSS JOIN is equivalent to JOIN and INNER JOIN.

    • table_factor: refers to the table reference to be joined. For more details, see table_factor.

    • join_condition: optional, representing the join condition. Specifically:

      • ON expression: specifies the join condition for duplicate columns, applicable when columns with different names are used as the join condition. Any join condition can be specified, not limited to equality conditions. For example, conditions like greater than or less than can be used.

        Example:

        SELECT *
        FROM tbl1 JOIN tbl2
        ON tbl1.col1 = tbl2.col1;

        The result is as follows:

        +------+------+------+------+------+------+
        | col1 | col2 | col3 | col1 | col2 | col3 |
        +------+------+------+------+------+------+
        | 1 | A1 | 1 | 1 | A1 | 1 |
        | 2 | A2 | 2 | 2 | A2 | 22 |
        | 3 | A3 | 3 | 3 | A3 | 33 |
        +------+------+------+------+------+------+
        3 rows in set (0.023 sec)
      • USING (join_column_list): specifies the join condition without duplicate columns, applicable only when the joined tables use columns with the same name as the join condition.

        Example:

        SELECT *
        FROM tbl1 JOIN tbl2
        USING (col1);

        The result is as follows:

        +------+------+------+------+------+
        | col1 | col2 | col3 | col2 | col3 |
        +------+------+------+------+------+
        | 1 | A1 | 1 | A1 | 1 |
        | 2 | A2 | 2 | A2 | 22 |
        | 3 | A3 | 3 | A3 | 33 |
        +------+------+------+------+------+
        3 rows in set (0.025 sec)
  • table_reference outer_join_type JOIN table_factor join_condition: represents an outer join. It returns all rows from both tables, regardless of whether there are matches. In this join, unmatched rows are filled with NULL.

    • outer_join_type: specifies the type of outer join. Specifically:

      • {LEFT | RIGHT | FULL} [OUTER] JOIN:

        • LEFT [OUTER] JOIN: represents a left outer join. When a row in the left table does not find a match in the right table, NULL is automatically filled in the right table. For more information about left outer joins, see LEFT JOIN.
        • RIGHT [OUTER] JOIN: represents a right outer join. When a row in the right table does not find a match in the left table, NULL is automatically filled in the left table. For more information about right outer joins, see RIGHT JOIN.
        • FULL [OUTER] JOIN: represents a full outer join. When a row in the left or right table does not find a match, NULL is automatically filled in the corresponding table. For more information about full outer joins, see FULL JOIN.
    • join_condition: represents the join condition.

    Example:

    Select all columns from table tbl1 and left join it with table tbl2 on the condition tbl1.col1 = tbl2.col1, and filter the results using the condition tbl1.col1 > 2.

    SELECT *
    FROM tbl1 LEFT JOIN tbl2
    ON tbl1.col1 = tbl2.col1
    AND tbl1.col1 > 2;

    The result is as follows:

    +------+------+------+------+------+------+
    | col1 | col2 | col3 | col1 | col2 | col3 |
    +------+------+------+------+------+------+
    | 1 | A1 | 1 | NULL | NULL | NULL |
    | 2 | A2 | 2 | NULL | NULL | NULL |
    | 3 | A3 | 3 | 3 | A3 | 33 |
    +------+------+------+------+------+------+
    3 rows in set (0.028 sec)

References