go-mysql-elasticsearch: A Handy Tool for Syncing a MySQL Database to Elasticsearch

白禄

2023-12-01

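go-mysql-elasticsearch is a plugin developed by an author in China. In testing, its main strength is that insert, delete, and update operations in MySQL are synced to Elasticsearch, where the data can then be queried. Shortcomings (areas for improvement):
1. Logging is not very detailed, though it meets basic needs.
2. On initialization, data that already existed in MySQL was not synced automatically, so the initial import has to be handled separately (for example, by rebuilding the index with a bulk import).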
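For the initial import mentioned in point 2, one option is to push the existing rows through the Elasticsearch _bulk API. The following is a minimal sketch, not part of go-mysql-elasticsearch: the function name build_bulk_body, the id_field parameter, and the sample rows are hypothetical, and reading the rows from MySQL is left out. It only shows how rows could be turned into the newline-delimited JSON body that _bulk expects.

```python
import json


def build_bulk_body(index, rows, id_field="id"):
    """Return an NDJSON request body for the Elasticsearch _bulk API."""
    lines = []
    for row in rows:
        # Action line: index the document under the row's primary key.
        action = {"index": {"_index": index, "_id": str(row[id_field])}}
        lines.append(json.dumps(action))
        # Source line: the row itself becomes the document.
        lines.append(json.dumps(row, ensure_ascii=False))
    if not lines:
        return ""
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows, as they might be read from the sys_user table.
rows = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
print(build_bulk_body("user", rows), end="")
```

The resulting body would be sent with POST to the _bulk endpoint of the address configured as es_addr, using the Content-Type application/x-ndjson. Older Elasticsearch versions that still use mapping types may also need a _type entry in each action line.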

Installing go-mysql-elasticsearch
Step 1: Install Go
yum install go
Step 2: Install godep
go get github.com/tools/godep
Step 3: Fetch the go-mysql-elasticsearch plugin
go get github.com/siddontang/go-mysql-elasticsearch
Step 4: Build the go-mysql-elasticsearch plugin
cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch
make
Using go-mysql-elasticsearch
1. Edit the configuration file: vi river.toml

# MySQL address, user and password
# user must have replication privilege in MySQL.
# MySQL connection settings for the sync source
my_addr = "127.0.0.1:3306"
my_user = "root"
my_pass = "123456"
my_charset = "utf8"

# Set to true when Elasticsearch uses HTTPS
#es_https = false

# Elasticsearch address
es_addr = "192.168.100.90:9200"

# Elasticsearch user and password, maybe set by shield, nginx, or x-pack
es_user = ""
es_pass = ""

# Path to store data, like master.info, if not set or empty,
# we must use this to support breakpoint resume syncing.
# TODO: support other storage, like etcd.
data_dir = "./var"

# Inner HTTP status address
stat_addr = "127.0.0.1:12800"

# pseudo server id like a slave
server_id = 1001

# mysql or mariadb
flavor = "mysql"

# mysqldump execution path
# if not set or empty, ignore mysqldump.
#mysqldump = "mysqldump"

# if we have no privilege to use mysqldump with --master-data,
# we must skip it.
#skip_master_data = false

# minimal items to be inserted in one bulk
bulk_size = 128

# force flush the pending requests if we don't have enough items >= bulk_size
flush_bulk_time = "200ms"

# Ignore table without primary key
skip_no_pk_table = false

# MySQL data source
[[source]]
schema = "zkbh_nbjd"

# Only below tables will be synced into Elasticsearch.
# "t_[0-9]{4}" is a wildcard table format, you can use it if you have many sub tables, like table_0000 - table_1023
# I don't think it is necessary to sync all tables in a database.
# Tables to sync; separate multiple tables with commas
tables = ["sys_user", "sys_log"]

# Below is for special rule mapping

# Very simple example
#
# desc t;
# +-------+--------------+------+-----+---------+-------+
# | Field | Type         | Null | Key | Default | Extra |
# +-------+--------------+------+-----+---------+-------+
# | id    | int(11)      | NO   | PRI | NULL    |       |
# | name  | varchar(256) | YES  |     | NULL    |       |
# +-------+--------------+------+-----+---------+-------+
#
# The table `t` will be synced to ES index `test` and type `t`
# Sync the sys_user table in the zkbh_nbjd database to the index user
[[rule]]
schema = "zkbh_nbjd"
table = "sys_user"
index = "user"
type = "novel"

# Wildcard table rule, the wildcard table must be in source tables
# All tables which match the wildcard format will be synced to ES index `test` and type `t`.
# In this example, all tables must have same schema with above table `t`;
# Sync the sys_log table in the zkbh_nbjd database to the index log
[[rule]]
schema = "zkbh_nbjd"
table = "sys_log"
index = "log"
type = "log"
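The configuration comments mention a wildcard table format such as "t_[0-9]{4}". As a rough illustration of which table names such a pattern selects, here is a small sketch. How the tool itself evaluates the pattern is not shown in this article; the sketch assumes the pattern is an ordinary regular expression that must match the whole table name. The function name match_tables and the sample table names are hypothetical.

```python
import re


def match_tables(pattern, table_names):
    """Return the table names that the wildcard pattern matches in full."""
    regex = re.compile(pattern)
    # fullmatch is an assumption of this sketch: the whole name must match.
    return [name for name in table_names if regex.fullmatch(name)]


# Hypothetical table names in one schema.
tables = ["t_0000", "t_1023", "t_12", "sys_user"]
print(match_tables(r"t_[0-9]{4}", tables))  # prints ['t_0000', 't_1023']
```

Under this assumption only names made of t_ followed by exactly four digits are selected, so a pattern should be checked against the real table names before relying on it in a [[rule]] section.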

2. Start go-mysql-elasticsearch

cd /root/go/src/github.com/siddontang/go-mysql-elasticsearch
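nohup ./bin/go-mysql-elasticsearch -config=./etc/river.toml &
The trailing & starts the process in the background, and nohup keeps it from being shut down when the logged-in Linux user exits. A window manager such as Screen can also be used to make sure the go-mysql-elasticsearch service is not closed; see its documentation for details.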
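Once the service is running, the bulk_size and flush_bulk_time settings in river.toml decide when pending changes are sent to Elasticsearch: a batch goes out when enough items have accumulated, or when it has waited long enough. The sketch below illustrates that kind of size-or-time flush policy in general terms. It is not the tool's actual implementation; the class BulkBuffer, its parameters, and the send callback are all hypothetical names.

```python
import time


class BulkBuffer:
    """Collect items and flush when the batch is full or has waited too long."""

    def __init__(self, send, bulk_size=128, flush_interval=0.2, clock=time.monotonic):
        self.send = send                      # callback that receives each batch
        self.bulk_size = bulk_size            # plays the role of bulk_size
        self.flush_interval = flush_interval  # seconds, like flush_bulk_time
        self.clock = clock
        self.items = []
        self.last_flush = clock()

    def add(self, item):
        self.items.append(item)
        # Size trigger: flush as soon as the batch is full.
        if len(self.items) >= self.bulk_size:
            self._flush()

    def tick(self):
        """Call periodically: flush a partial batch once the interval has elapsed."""
        if self.items and self.clock() - self.last_flush >= self.flush_interval:
            self._flush()

    def _flush(self):
        batch, self.items = self.items, []
        self.last_flush = self.clock()
        self.send(batch)


sent = []
buf = BulkBuffer(sent.append, bulk_size=3, flush_interval=0.2)
for i in range(7):
    buf.add(i)
print(sent)       # prints [[0, 1, 2], [3, 4, 5]]
print(buf.items)  # prints [6]
```

In the usage above, the seventh item stays pending until tick() is called after the interval has passed. This is the situation a time-based flush setting exists to handle: low-traffic tables would otherwise wait indefinitely for a full batch.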
