Character Set and Collation of Character Expressions

Each character expression has a character set and collation.

Specify the character set and collation of a character expression

For a plain string literal, as in the statement SELECT 'string', the character set and collation are taken from the character_set_connection and collation_connection system variables.
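To see which defaults apply in the current session, the two variables can be queried directly. The following is a minimal sketch; it assumes that the MySQL-compatible SHOW VARIABLES statement and the CHARSET() and COLLATION() functions are available in seekdb, and the column aliases cs and coll are illustrative only.

```sql
-- Show the connection-level defaults that apply to plain string literals.
SHOW VARIABLES LIKE 'character_set_connection';
SHOW VARIABLES LIKE 'collation_connection';

-- Inspect the character set and collation assigned to a plain literal.
-- Assumption: CHARSET() and COLLATION() behave as they do in MySQL.
SELECT CHARSET('abc') AS cs, COLLATION('abc') AS coll;
```

If the assumption holds, the values returned by the last query should match the two variables shown by the first two statements.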

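To override these defaults, prefix the string with a character set introducer (_charset_name) to specify its character set, and append a COLLATE clause to specify its collation. Both parts are optional. The syntax is as follows: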

[_charset_name]'string' [COLLATE collation_name]

For example:

SELECT _utf8mb4'abc' COLLATE utf8mb4_unicode_ci;
+------------------------------------------+
| _utf8mb4'abc' COLLATE utf8mb4_unicode_ci |
+------------------------------------------+
| abc |
+------------------------------------------+
1 row in set (0.001 sec)

When you write a constant in an SQL statement, you can prefix a hexadecimal value with _gb18030_2022 so that its bytes are interpreted as characters in the gb18030_2022 character set. You can also prefix a string literal with _gb18030_2022 to set the character set of that string to gb18030_2022.

SELECT _gb18030_2022 0xCDE5 AS c FROM DUAL;
+-----+
| c |
+-----+
| 湾 |
+-----+
1 row in set (0.001 sec)

SELECT _gb18030_2022 'Bay' AS c FROM DUAL;
+-----+
| c |
+-----+
| |
+-----+
1 row in set, 1 warning (0.000 sec)

Character set and collation for the character expression

seekdb determines the character set and collation of a character expression according to the following rules:

  • If you specify both _charset_name and COLLATE collation_name, the character set charset_name and collation collation_name are used.

  • If _charset_name is specified but COLLATE is not, the character set charset_name and its default collation are used. Use the SHOW CHARACTER SET statement to view the default collation for each character set.

  • If _charset_name is not specified but COLLATE collation_name is, the character set given by the character_set_connection system variable and the collation collation_name are used. collation_name must be a collation supported by that character set.

  • If neither _charset_name nor COLLATE collation_name is specified, the character set given by the character_set_connection system variable and the collation given by the collation_connection system variable are used.
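The four cases above can be checked one at a time. This is a sketch under stated assumptions, not verified seekdb output: it assumes the MySQL-compatible CHARSET() and COLLATION() functions and the SET NAMES statement are available, and that utf8mb4_bin is a supported collation of utf8mb4. The aliases cs and coll are illustrative.

```sql
-- List each character set together with its default collation.
SHOW CHARACTER SET;

-- Case 1: introducer and COLLATE are both given.
SELECT CHARSET(_utf8mb4'abc' COLLATE utf8mb4_unicode_ci) AS cs,
       COLLATION(_utf8mb4'abc' COLLATE utf8mb4_unicode_ci) AS coll;

-- Case 2: introducer only; the default collation of utf8mb4 applies.
SELECT CHARSET(_utf8mb4'abc') AS cs,
       COLLATION(_utf8mb4'abc') AS coll;

-- Case 3: COLLATE only; the collation must belong to the connection
-- character set, so set that character set to utf8mb4 first.
SET NAMES utf8mb4;
SELECT CHARSET('abc' COLLATE utf8mb4_bin) AS cs,
       COLLATION('abc' COLLATE utf8mb4_bin) AS coll;

-- Case 4: neither is given; the connection-level defaults apply.
SELECT CHARSET('abc') AS cs, COLLATION('abc') AS coll;
```

Comparing the result of case 2 with the default collation listed for utf8mb4 by SHOW CHARACTER SET is a quick way to confirm which collation an introducer alone selects.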