Data type conversion rules
Type conversion during expression evaluation
When an operator is used with operands of different types in an expression, type conversion is performed to make the operands compatible.
Conversion methods
seekdb supports explicit and implicit data type conversion.
Implicit data type conversion occurs when an operation requires a parameter of a specified data type, but the actual parameter value provided is not of that data type. In such cases, seekdb automatically converts the actual parameter value to the specified data type before proceeding with the operation.
For example, seekdb can automatically convert a string to a number and vice versa when needed.
SELECT 1+'1';
+-------+
| 1+'1' |
+-------+
| 2 |
+-------+
1 row in set (0.001 sec)
Explicit data type conversion in seekdb is achieved using the CAST function. Here's an example:
SELECT 31.4, CAST(31.4 AS TIME);
+------+--------------------+
| 31.4 | CAST(31.4 AS TIME) |
+------+--------------------+
| 31.4 | 00:00:31 |
+------+--------------------+
1 row in set (0.001 sec)
For more information about the CAST function, see CAST.
Conversion rules
Conversion rules for comparison operators
-
If one or both parameters are
NULL, the comparison result isNULL, except for the<=>comparison operator. ForNULL <=> NULL, the result is true, and no conversion is required. -
If both parameters are strings, they are compared as strings.
-
If both parameters are integers, they are compared as integers.
-
If not compared with a number, hexadecimal values are treated as binary strings.
-
Single-row subqueries from one or more tables are not considered constants. For example, if a subquery returns an integer to be compared with a
DATETIMEvalue, both operands are treated as integers. Since integers cannot be converted to time values, to compare the operands asDATETIMEvalues, useCAST()to explicitly convert the subquery value toDATETIME.If one parameter is a
TIMESTAMPorDATETIMEcolumn and the other is a constant, the constant is converted to a timestamp before the comparison. For safety, always use complete date-time, date, or time strings when performing comparisons. For example, when usingBETWEENwith date or time values, useCAST()to explicitly convert the values to the required data type for optimal results. -
If one parameter is a decimal value, the comparison type depends on the other parameter. If the other parameter is a decimal or integer value, the parameters are compared as decimal values. If the other parameter is a floating-point value, the parameters are compared as floating-point values.
-
In all other cases, the parameters are compared as floating-point (real) numbers. For example, string and numeric operands are compared as floating-point numbers.
Comparisons between floating-point numbers and large values of the
INTEGERtype are approximate, because integers are converted to double-precision floating-point numbers before the comparison, and not all 64-bit integers can be accurately represented.
Type downgrading
To achieve efficient comparisons, constants are aligned with column types as much as possible while maintaining the semantics of the comparison. Currently, most comparisons between columns and constants are supported under standard semantics. Remaining comparisons that require casting are listed separately:
- For
intoruintcolumns, constants of character type,decimaltype, and floating-point types (double/float) can be downgraded to integer columns. - For
decimalcolumns, floating-point types (double/float) can be downgraded todecimalcolumns. - For
yearcolumns, numeric types (including integers, floating-point types, anddecimaltype), character types, and time types (includingdate,datetime,timestamp, andtime) can be downgraded toyearcolumns. - For
datecolumns, character types,datetime,timestamp, andtimetypes can be downgraded todatecolumns. - For
timestampcolumns,datetimecan be downgraded totimestamp.
The following scenarios involve casting on columns (excluding complex types such as arrays, XML, and JSON):
- For
floatcolumns, comparisons are always performed usingdouble, so casting is added when comparing with any constant. - For
fixed doublecolumns, such as those defined withdouble(10, 2), if the constant has more decimal places, the precision is aligned during the comparison, and casting is added to the column. - For character columns, comparisons between character and numeric types are performed based on numeric order, and the column is cast to a numeric type. Additionally, casting may be added to the column based on the character set comparison rules.
Non-standard comparisons
The logic and type downgrading for non-standard comparisons are similar. They only handle comparisons between constants and columns. If two columns are being compared, no action is taken. Currently, only comparisons between integer constants and character columns are supported. Note that if range is set, the sorting results may not conform to standard semantics. Therefore, the setting should be chosen based on the data characteristics.
Here are some examples:
-
Create a table named
t1with abiz_daycolumn of typevarchar(10).CREATE TABLE t1 (biz_day varchar(10));
Query OK, 0 rows affected (0.216 sec) -
View the execution plan under default settings.
EXPLAIN SELECT * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------- |
| |0 |TABLE FULL SCAN|t1 |1 |3 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t1.biz_day]), filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= cast(20200101, DECIMAL(20, 0))], [cast(t1.biz_day, DECIMAL(-1, -1)) <= cast(20200201, |
| DECIMAL(20, 0))]), rowset=16 |
| access([t1.biz_day]), partitions(p0) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false,false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
12 rows in set (0.003 sec)- Query plan: By default, the query optimizer converts the
biz_daycolumn from a string type to a numeric type before the comparison. - Filter condition:
filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= 20200101], [cast(t1.biz_day, DECIMAL(-1, -1)) <= 20200201])indicates that thebiz_daycolumn is converted to a numeric type for range comparison.
- Query plan: By default, the query optimizer converts the
-
Set
non_standard_comparison_leveltoequaland view the execution plan.EXPLAIN SELECT /*+opt_param('non_standard_comparison_level', 'equal')*/ * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ====================================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ------------------------------------------------------ |
| |0 |COLUMN TABLE FULL SCAN|t1 |1 |3 | |
| ====================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t1.biz_day]), filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= cast(20200101, DECIMAL(20, 0))], [cast(t1.biz_day, DECIMAL(-1, -1)) <= cast(20200201, |
| DECIMAL(20, 0))]), rowset=16 |
| access([t1.biz_day]), partitions(p0) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false,false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
12 rows in set (0.003 sec)- Query plan: After setting
non_standard_comparison_leveltoequal, the query optimizer converts thebiz_daycolumn from a string type to a decimal type before the comparison. - Filter condition:
filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= cast(20200101, DECIMAL(20, 0))], [cast(t1.biz_day, DECIMAL(-1, -1)) <= cast(20200201, DECIMAL(20, 0))])indicates that thebiz_daycolumn is converted to a decimal type for range comparison. Sincebetweenis a range query, the equal value comparison rule forequaldoes not convert the constant type.
- Query plan: After setting
-
Set
non_standard_comparison_leveltorangeand view the execution plan.EXPLAIN SELECT /*+opt_param('non_standard_comparison_level', 'range')*/ * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------- |
| |0 |TABLE FULL SCAN|t1 |1 |3 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t1.biz_day]), filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= cast(20200101, DECIMAL(20, 0))], [cast(t1.biz_day, DECIMAL(-1, -1)) <= cast(20200201, |
| DECIMAL(20, 0))]), rowset=16 |
| access([t1.biz_day]), partitions(p0) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false,false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
12 rows in set (0.002 sec)- Query plan: After setting
non_standard_comparison_leveltorange, the query optimizer converts integer constants to a string type before the comparison. - Filter condition:
filter([t1.biz_day >= demote_cast(20200101, VARCHAR(10))], [t1.biz_day <= demote_cast(20200201, VARCHAR(10))])indicates that integer constants are converted to a string type for range comparison. After non-standard comparison, the comparison is performed based on character order. For example,'202002010'belongs to the character order['20200101', '20200201'], but it does not fall within the numeric order range[20200101, 20200201].
- Query plan: After setting
Other related conversion rules
-
The conversion methods from strings to floating-point numbers and from integers to floating-point numbers may not be the same. The results may be affected by factors such as the compiler version. One way to avoid such issues is to use
CAST()to prevent the value from being implicitly converted to a floating-point number. -
When numeric or time values are implicitly converted to strings, the resulting value has a character set and collation determined by the
character_set_connectionandcollation_connectionsystem variables.This means that such conversions generate non-binary strings (
CHAR,VARCHAR, orLONGTEXTvalues). If the connection character set is set to binary, the conversion result is a binary string (BINARY,VARBINARY, orLONGBLOBvalue). -
The type conversion during integer expression evaluation is slightly different. For example, when an integer expression is used as part of a CREATE TABLE statement, the new table created will have a column of type
INTorBIGINTbased on the result type of the expression:CREATE TABLE t SELECT integer_expr;If the expression's maximum length is not suitable for
INT,BIGINTis used. However, you can force the use ofBIGINTinstead ofINTby using an expression with sufficient length:CREATE TABLE t SELECT 000000000000000000000;
Query OK, 1 row affected (0.145 sec)
Implicit data type conversion rules
seekdb automatically converts a value from one data type to another when the conversion makes sense.
The following table shows the implicit conversion matrix for all data types. It does not consider the direction of conversion or the context in which the conversion occurs.
| Data type | BOOL | INT | SMALLINT | MEDIUMINT | BIGINT | SERIAL | DECIMAL | NUMERIC | FLOAT | DOUBLE | BIT | DATETIME | TIMESTAMP | DATE | TIME |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YEAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CHAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| VARCHAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| BINARY | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| VARBINARY | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TINYBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| MEDIUMBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| LONGBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TINYTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| MEDIUMTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| LONGTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| ENUM | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| SET | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| BLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Implicit data type conversion examples
SELECT CAST(BOOL_COLUMN AS YEAR) FROM YOUR_TABLE;
+---------------------------+
| CAST(BOOL_COLUMN AS YEAR) |
+---------------------------+
| 2001 |
+---------------------------+
2 rows in set (0.001 sec)