MySQL Table Engine
The MySQL engine allows you to perform SELECT
and INSERT
queries on data that is stored on a remote MySQL server.
Creating a Table
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
...
) ENGINE = MySQL({host:port, database, table, user, password[, replace_query, on_duplicate_clause] | named_collection[, option=value [,..]]})
SETTINGS
[ connection_pool_size=16, ]
[ connection_max_tries=3, ]
[ connection_wait_timeout=5, ]
[ connection_auto_close=true, ]
[ connect_timeout=10, ]
[ read_write_timeout=300 ]
;
See a detailed description of the CREATE TABLE query.
The table structure can differ from the original MySQL table structure:
- Column names should be the same as in the original MySQL table, but you can use just some of these columns and in any order.
- Column types may differ from those in the original MySQL table. ClickHouse tries to cast values to the ClickHouse data types.
- The external_table_functions_use_nulls setting defines how to handle Nullable columns. Default value: 1. If 0, the table function does not make Nullable columns and inserts default values instead of nulls. This is also applicable for NULL values inside arrays.
The MySQL Table Engine is currently not available on the ClickHouse builds for MacOS (issue)
Engine Parameters
host:port
— MySQL server address.database
— Remote database name.table
— Remote table name.user
— MySQL user.password
— User password.replace_query
— Flag that convertsINSERT INTO
queries toREPLACE INTO
. Ifreplace_query=1
, the query is substituted.on_duplicate_clause
— TheON DUPLICATE KEY on_duplicate_clause
expression that is added to theINSERT
query. Example:INSERT INTO t (c1,c2) VALUES ('a', 2) ON DUPLICATE KEY UPDATE c2 = c2 + 1
, whereon_duplicate_clause
isUPDATE c2 = c2 + 1
. See the MySQL documentation to find whichon_duplicate_clause
you can use with theON DUPLICATE KEY
clause. To specifyon_duplicate_clause
you need to pass0
to thereplace_query
parameter. If you simultaneously passreplace_query = 1
andon_duplicate_clause
, ClickHouse generates an exception.
Arguments also can be passed using named collections. In this case host
and port
should be specified separately. This approach is recommended for production environment.
Simple WHERE
clauses such as =, !=, >, >=, <, <=
are executed on the MySQL server.
The rest of the conditions and the LIMIT
sampling constraint are executed in ClickHouse only after the query to MySQL finishes.
Supports multiple replicas that must be listed by |
. For example:
CREATE TABLE test_replicas (id UInt32, name String, age UInt32, money UInt32) ENGINE = MySQL(`mysql{2|3|4}:3306`, 'clickhouse', 'test_replicas', 'root', 'clickhouse');
Usage Example
Create table in MySQL:
mysql> CREATE TABLE `test`.`test` (
-> `int_id` INT NOT NULL AUTO_INCREMENT,
-> `int_nullable` INT NULL DEFAULT NULL,
-> `float` FLOAT NOT NULL,
-> `float_nullable` FLOAT NULL DEFAULT NULL,
-> PRIMARY KEY (`int_id`));
Query OK, 0 rows affected (0,09 sec)
mysql> insert into test (`int_id`, `float`) VALUES (1,2);
Query OK, 1 row affected (0,00 sec)
mysql> select * from test;
+------+----------+-----+----------+
| int_id | int_nullable | float | float_nullable |
+------+----------+-----+----------+
| 1 | NULL | 2 | NULL |
+------+----------+-----+----------+
1 row in set (0,00 sec)
Create table in ClickHouse using plain arguments:
CREATE TABLE mysql_table
(
`float_nullable` Nullable(Float32),
`int_id` Int32
)
ENGINE = MySQL('localhost:3306', 'test', 'test', 'bayonet', '123')
Or using named collections:
CREATE NAMED COLLECTION creds AS
host = 'localhost',
port = 3306,
database = 'test',
user = 'bayonet',
password = '123';
CREATE TABLE mysql_table
(
`float_nullable` Nullable(Float32),
`int_id` Int32
)
ENGINE = MySQL(creds, table='test')
Retrieving data from MySQL table:
SELECT * FROM mysql_table
┌─float_nullable─┬─int_id─┐
│ ᴺᵁᴸᴸ │ 1 │
└────────────────┴────────┘
Settings
Default settings are not very efficient, since they do not even reuse connections. These settings allow you to increase the number of queries run by the server per second.
connection_auto_close
Allows to automatically close the connection after query execution, i.e. disable connection reuse.
Possible values:
- 1 — Auto-close connection is allowed, so the connection reuse is disabled
- 0 — Auto-close connection is not allowed, so the connection reuse is enabled
Default value: 1
.
connection_max_tries
Sets the number of retries for pool with failover.
Possible values:
- Positive integer.
- 0 — There are no retries for pool with failover.
Default value: 3
.
connection_pool_size
Size of connection pool (if all connections are in use, the query will wait until some connection will be freed).
Possible values:
- Positive integer.
Default value: 16
.
connection_wait_timeout
Timeout (in seconds) for waiting for free connection (in case of there is already connection_pool_size active connections), 0 - do not wait.
Possible values:
- Positive integer.
Default value: 5
.
connect_timeout
Connect timeout (in seconds).
Possible values:
- Positive integer.
Default value: 10
.
read_write_timeout
Read/write timeout (in seconds).
Possible values:
- Positive integer.
Default value: 300
.