
MySQL → TiDB Data Migration

Posted by TiDBer_1bJAfgfv on 2026-02-09
Original article

MySQL version: Server version 5.6.28

TiDB version: v7.1.1

Dumpling version: v7.4.0

Lightning version: v7.4.0

DM version: v7.4.0

Preparation before migrating:

  1. The source database's binlog format must be ROW.
  2. Ensure every table has a primary key or a unique key.
  3. Compare the target TiDB cluster against the list of databases to be replicated and confirm there are no conflicting database names; any conflicting database can be renamed.

1) Checking the binlog format

mysql> show variables like '%binlog_format%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | MIXED |
+---------------+-------+
1 row in set (0.00 sec)
-- If the value is not ROW, change it as follows. The change does not apply to the current session; log in again.
mysql> set global binlog_format=row;
Query OK, 0 rows affected (0.00 sec)
-- After logging in again, check the variable once more. Once it shows ROW, this preparation step is done.
mysql> show variables like '%binlog_format%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.05 sec)
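Note that `set global binlog_format` does not survive a server restart. To make ROW mode permanent, it should also be set in the MySQL configuration file. A minimal sketch (the file path, `log-bin` name, and `server-id` value are placeholders; on a server already acting as a replication source they will normally be set already):

```ini
# /etc/my.cnf — persist binlog settings across restarts
[mysqld]
# binary logging must be enabled for replication, and server-id is required when it is
log-bin       = mysql-bin
server-id     = 1
binlog_format = ROW
```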

2) Checking whether each table has a primary key or unique index; if a table has neither, discuss adding a business-agnostic auto-increment primary key.

-- List tables in the given databases that have no primary key or unique key
-- ${db_list}="'db_name'"
-- Remember to substitute db_list in the SQL below
select b.TABLE_SCHEMA, b.TABLE_NAME from (
  select TABLE_SCHEMA, TABLE_NAME from information_schema.columns
  where table_schema in (${db_list})
    and (COLUMN_KEY='PRI' or COLUMN_KEY='UNI')
  group by TABLE_SCHEMA, TABLE_NAME
) a
right join information_schema.tables b
  on a.TABLE_SCHEMA=b.TABLE_SCHEMA and a.TABLE_NAME=b.TABLE_NAME
where b.table_schema in (${db_list}) and a.TABLE_SCHEMA is null;

3) Confirming the target TiDB has no conflicting database names

-- ${db_list}="'db_name1','db_name2'"
-- Remember to substitute db_list in the SQL below
-- If the following SQL returns no rows, there are no conflicting database names
mysql> select * from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME in (${db_list});
Empty set (0.00 sec)

Migrating data from MySQL to TiDB

Data size   Tools                      Migration method
< 1 TB      DM                         DM task in full mode
> 1 TB      Dumpling, Lightning, DM    three-step process below

The > 1 TB process:

1. Dumpling exports the full historical data from MySQL.

2. Lightning imports the historical data into TiDB.

3. DM runs in incremental mode.

Required privileges for each tool:

  • Dumpling: PROCESS, SELECT, RELOAD, LOCK TABLES, REPLICATION CLIENT
  • Lightning (logical import mode): CREATE, SELECT, INSERT, UPDATE, DELETE, DROP, ALTER

Create a replication account on the MySQL side:

mysql> create user '${username}'@'10.%' identified by '${passwd}'; 
Query OK, 0 rows affected (0.53 sec)  
mysql> grant PROCESS,SELECT,RELOAD,LOCK TABLES,REPLICATION CLIENT on *.* to '${username}'@'10.%';
Query OK, 0 rows affected (0.00 sec)
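To confirm the grants took effect, the standard SHOW GRANTS statement can be used (the exact output format depends on the MySQL version):

```sql
-- Verify the replication account's privileges
mysql> show grants for '${username}'@'10.%';
```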

(I) Migrating less than 1 TB of data

Use DM for the full data migration. Only the task-mode setting differs; everything else is the same as the incremental setup.

1) Configure the data source

tidb@testdb-pd:~/dm/source$ cat stg03_test.yaml
source-id: "stg03-test"      # pick a name

from:
  host: "${mysql_ip}"
  user: "${mysql_user}"
  password: "${passwd}"      # MySQL password; encrypt it with: tiup dmctl encrypt 'abc!@#123'
  port: ${mysql_port}

2) Create the data source

tidb@testdb-pd:~/dm/source$ tiup dmctl --master-addr=127.0.0.1:8261 operate-source create stg03_test.yaml
# --master-addr=${dm-master_ip}:8261 points at a DM master's ip:port; a dm-master node runs on this host, hence 127.0.0.1.

# List DM cluster names
tiup dm list
# Show DM cluster details; the ID of any node whose Role is dm-master can be used as master-addr. With multiple masters, any one works.
tiup dm display ${dm_cluster_name}
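After creating the source, it can be listed to confirm that it was registered with the DM master. `operate-source show` is the dmctl subcommand for this; the master address is the same one used above:

```shell
# Confirm the data source was registered
tiup dmctl --master-addr=127.0.0.1:8261 operate-source show
```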

3) Configure the DM task

tidb@testdb-pd:~/dm/task$ cat stg03_test-task.yaml
name: stg03_test              # pick a name
task-mode: all                # "all" = full migration followed by incremental replication
collation_compatible: "strict"  # strictly replicate the source's collations and character sets; TiDB's default collation is case-sensitive while MySQL's default is case-insensitive

# Target TiDB connection info
target-database:
  host: "127.0.0.1"           # target TiDB IP
  port: 4000                  # target TiDB port
  user: "root"                # target TiDB user
  password: "ddddddddddddd+EpWJaIlw=="  # password; if non-empty, a dmctl-encrypted ciphertext is recommended

# One or more upstream data sources to replicate
mysql-instances:
  - source-id: "stg03-test"          # the source-id configured above
    block-allow-list: "ba-rule1"     # filter rule for the replicated data
#   route-rules: ["route-rule-1"]

block-allow-list:
  ba-rule1:
    do-dbs: ['db_name1','db_name2']  # databases to migrate

# Migrate into a differently named database
#routes:
#  route-rule-1:
#    schema-pattern: "db_name"
#    target-schema: "db_name_bak"

4) Start the DM task

tiup dmctl --master-addr=127.0.0.1:8261 start-task stg03_test-task.yaml
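Once the task is started, its progress can be checked with dmctl's `query-status`; the task name is the `name` field from the task file:

```shell
# Check the replication status of the task
tiup dmctl --master-addr=127.0.0.1:8261 query-status stg03_test
```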

(II) Migrating more than 1 TB of data

Dumpling export, syntax:

tiup dumpling -u${username} -h${mysql_ip} -P${mysql_port} -p${passwd} -o ${store_dir} --filetype csv -B ${db_name1} -B ${db_name2}

  • -u specifies the MySQL user
  • -h specifies the MySQL IP
  • -P specifies the MySQL port
  • -o specifies the output directory
  • -B specifies a database to export
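For large exports it is usually worth tuning the concurrency and output file size as well. The flags below exist in Dumpling, but defaults vary by version, so verify against `tiup dumpling --help` first:

```shell
# Export with 8 threads and split output files at about 256 MiB
tiup dumpling -u${username} -h${mysql_ip} -P${mysql_port} -p${passwd} \
  -o ${store_dir} --filetype csv -B ${db_name1} -B ${db_name2} \
  -t 8 -F 256MiB
```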

Lightning import, syntax:

tiup tidb-lightning -config tidb-lightning-test.toml

The tidb-lightning-test.toml file is configured as follows. Usually you copy an existing config file for the environment and only change data-source-dir to the Dumpling ${store_dir} above.

[lightning]
# log settings
level = "info"
file = "tidb-lightning.log"

[tikv-importer]
# import mode: logical ("tidb") import is used here; physical import is faster but performs no validation
backend = "tidb"
# temporary directory for sorted key-value pairs; must be an empty directory
sorted-kv-dir = "/tidb-data/lightning"

[mydumper]
# source data directory
data-source-dir = "${store_dir}"

# wildcard filter rules; the default rule skips every table under the mysql, sys, INFORMATION_SCHEMA, PERFORMANCE_SCHEMA, METRICS_SCHEMA and INSPECTION_SCHEMA system databases
# without this setting, importing system tables raises a "schema not found" error
filter = ['*.*', '!mysql.*', '!sys.*', '!INFORMATION_SCHEMA.*', '!PERFORMANCE_SCHEMA.*', '!METRICS_SCHEMA.*', '!INSPECTION_SCHEMA.*']

#no-schema=false
[tidb]
# target cluster info
host = "${tidb_ip}"
port = 4000
user = "${tidb_user}"
password = "${tidb_passwd}"
# table schema info is fetched from TiDB's status port
status-port = 10080
# PD address of the cluster
pd-addr = "${pd_ip}:2379"
sql-mode = '${sql_mode},NO_AUTO_VALUE_ON_ZERO'

Incremental replication with DM:

DM web UI: http://${dm-master_ip}:8261/dashboard/migration/task

1) Configure the data source:

tidb@testdb-pd:~/dm/source$ cat stg03_test.yaml
source-id: "stg03-test"      # pick a name

from:
  host: "${mysql_ip}"        # MySQL IP
  user: "${mysql_user}"      # MySQL user
  password: "${passwd}"      # MySQL password; encrypt it with: tiup dmctl encrypt 'abc!@#123'
  port: ${mysql_port}

2) Create the DM data source:

tiup dmctl --master-addr=${dm-master_ip}:8261 operate-source create stg03_test.yaml

3)配置DM任务:

#cat stg03_test-task.yaml  
name: stg03_test   #起个名字
task-mode: incremental   #“incremental”指定是增量传输
collation_compatible: "strict" #严格复制mysql数据源的排序规则及字符集,tidb默认字符串大小写敏感,而mysql默认字符串大小写不敏感
 
#目标tidb的相关信息
target-database:
  host: "${tidb_ip}"      #目标tidb的IP
  port: 4000             #目标tidb的port
  user: "${tidb_user}" #目标tidb的用户
  password: "${passwd}" # 密码,如果密码不为空,则推荐使用经过 dmctl 加密的密文
 
# 填写一个或多个所需同步的数据源信息,多的数据源配置可参考官方文档
mysql-instances:
    source-id: "stg03-test"   #上文的数据源的source-id
    block-allow-list: "ba-rule1"  #同步数据的额过滤规则
#   route-rules: ["route-rule-1"]
 
block-allow-list:
  ba-rule1:
          do-dbs: ['db_name1','db_name2']   #要迁移哪些库
 
#迁移至非同名库
#routes:                        
#  route-rule-1:                
#    schema-pattern: "db_name"    
#    target-schema: "db_name_bak"

4) Start the incremental task:

tiup dmctl --master-addr=${dm-master_ip}:8261 start-task stg03_test-task.yaml

5) Pause the incremental task (pause-task pauses without removing the task):

tiup dmctl --master-addr=${dm-master_ip}:8261 pause-task -s stg03-test stg03_test

6) Delete the DM task (stop-task stops and removes it):

tiup dmctl --master-addr=${dm-master_ip}:8261 stop-task -s stg03-test stg03_test

7) Delete the DM data source:

tiup dmctl --master-addr=${dm-master_ip}:8261 operate-source stop stg03-test

Example runs of Dumpling and Lightning:

-- Dumpling: full data export
tidb@testdb-pd:~$ tiup dumpling -u${mysql_user} -h${mysql_ip} -P${mysql_port} -p${passwd} -o /home/tidb/${store_dir} --filetype csv -B db_name1 -B db_name2 
tiup is checking updates for component dumpling ...
A new version of dumpling is available:
   The latest version:         v8.5.1
   Local installed version:    v7.4.0
   Update current component:   tiup update dumpling
   Update all components:      tiup update --all
 
Starting component `dumpling`: /home/tidb/.tiup/components/dumpling/v7.4.0/dumpling -u** -h** -P** -p** -o /home/tidb/** --filetype csv -B **
Release version: v7.4.0
Git commit hash: 38cb4f3312be9199a983c0ef282d2ea2e28a7824
Git branch:      heads/refs/tags/v7.4.0
Build timestamp: 2023-10-10 02:10:37Z
Go version:      go version go1.21.1 linux/amd64
 
[2025/03/26 10:57:09.087 +08:00] [INFO] [versions.go:54] ["Welcome to dumpling"] ["Release Version"=v7.4.0] ["Git Commit Hash"=38cb4f3312be9199a983c0ef282d2ea2e28a7824] ["Git Branch"=heads/refs/tags/v7.4.0] ["Build timestamp"="2023-10-10 02:10:37"] ["Go Version"="go version go1.21.1 linux/amd64"]
[2025/03/26 10:57:09.092 +08:00] [WARN] [version.go:327] ["select tidb_version() failed, will fallback to 'select version();'"] [error="Error 1305 (42000): FUNCTION tidb_version does not exist"]
[2025/03/26 10:57:09.093 +08:00] [INFO] [version.go:434] ["detect server version"] [type=MySQL] [version=5.6.28-log]
[2025/03/26 10:57:09.112 +08:00] [INFO] [dump.go:151] ["begin to run Dump"] [conf="{\"s3\":{\"endpoint\":\"\",\"region\":\"\",\"storage-class\":\"\",\"sse\":\"\",\"sse-kms-key-id\":\"\",\"acl\":\"\",\"access-key\":\"\",\"secret-access-key\":\"\",\"session-token\":\"\",\"provider\":\"\",\"force-path-style\":true,\"use-accelerate-endpoint\":false,\"role-arn\":\"\",\"external-id\":\"\",\"object-lock-enabled\":false},\"gcs\":{\"endpoint\":\"\",\"storage-class\":\"\",\"predefined-acl\":\"\",\"credentials-file\":\"\"},\"azblob\":{\"endpoint\":\"\",\"account-name\":\"\",\"account-key\":\"\",\"access-tier\":\"\",\"sas-token\":\"\",\"encryption-scope\":\"\",\"encryption-key\":\"\"},\"SpecifiedTables\":false,\"AllowCleartextPasswords\":false,\"SortByPk\":true,\"NoViews\":true,\"NoSequences\":true,\"NoHeader\":false,\"NoSchemas\":false,\"NoData\":false,\"CompleteInsert\":false,\"TransactionalConsistency\":true,\"EscapeBackslash\":true,\"DumpEmptyDatabase\":true,\"PosAfterConnect\":false,\"CompressType\":0,\"Host\":\"10.22.160.152\",\"Port\":3311,\"Threads\":4,\"User\":\"ruwei.hu\",\"Security\":{\"CAPath\":\"\",\"CertPath\":\"\",\"KeyPath\":\"\"},\"LogLevel\":\"info\",\"LogFile\":\"\",\"LogFormat\":\"text\",\"OutputDirPath\":\"/home/tidb/gw\",\"StatusAddr\":\":8281\",\"Snapshot\":\"\",\"Consistency\":\"flush\",\"CsvNullValue\":\"\\\\N\",\"SQL\":\"\",\"CsvSeparator\":\",\",\"CsvDelimiter\":\"\\\"\",\"CsvLineTerminator\":\"\\r\\n\",\"Databases\":[**],\"Where\":\"\",\"FileType\":\"csv\",\"ServerInfo\":{\"ServerType\":1,\"ServerVersion\":\"5.6.28-log\",\"HasTiKV\":false},\"Rows\":0,\"ReadTimeout\":900000000000,\"TiDBMemQuotaQuery\":0,\"FileSize\":0,\"StatementSize\":1000000,\"SessionParams\":{},\"Tables\":{},\"CollationCompatible\":\"loose\",\"IOTotalBytes\":null,\"Net\":\"\"}"]
[2025/03/26 10:57:12.681 +08:00] [INFO] [dump.go:275] ["All the dumping transactions have started. Start to unlock tables"]
[2025/03/26 10:57:12.906 +08:00] [INFO] [writer.go:272] ["no data written in table chunk"] [database=**] [table=**] [chunkIdx=0]
......
[2025/03/26 10:59:12.682 +08:00] [INFO] [status.go:37] [progress] [tables="321/323 (99.4%)"] ["finished rows"=21882111] ["estimate total rows"=24415076] ["finished size"=10.03GB] ["average speed(MiB/s)"=79.68125748901743] ["recent speed bps"=81121498.59944943] ["chunks progress"="99.38 %"]
[2025/03/26 10:59:34.803 +08:00] [INFO] [collector.go:264] ["backup success summary"] [total-ranges=652] [ranges-succeed=652] [ranges-failed=0] [total-take=2m22.121161546s] [total-kv-size=12.01GB] [average-speed=84.51MB/s] [total-rows=22966831]
[2025/03/26 10:59:34.846 +08:00] [INFO] [main.go:82] ["dump data successfully, dumpling will exit now"]
-- Lightning: full data import
tidb@testdb-pd:~/backup$ cat tidb-lightning-test.toml
[lightning]
# log settings
level = "info"
file = "tidb-lightning.log"

[tikv-importer]
# import mode
backend = "tidb"
# temporary directory for sorted key-value pairs; must be an empty directory
sorted-kv-dir = "/tidb-data/lightning"

[mydumper]
# source data directory
data-source-dir = "/home/tidb/${store_dir}"

# wildcard filter rules; the default rule skips every table under the mysql, sys, INFORMATION_SCHEMA, PERFORMANCE_SCHEMA, METRICS_SCHEMA and INSPECTION_SCHEMA system databases
# without this setting, importing system tables raises a "schema not found" error
filter = ['*.*', '!mysql.*', '!sys.*', '!INFORMATION_SCHEMA.*', '!PERFORMANCE_SCHEMA.*', '!METRICS_SCHEMA.*', '!INSPECTION_SCHEMA.*']

#no-schema=false
[tidb]
# target cluster info
host = "${tidb_ip}"
port = ${tidb_port}
user = "${tidb_user}"
password = "${passwd}"
# table schema info is fetched from TiDB's status port
status-port = 10080
# PD address of the cluster
pd-addr = "${pd_ip}:2379"
tidb@testdb-pd:~/backup$ time tiup tidb-lightning -config tidb-lightning-test.toml
tiup is checking updates for component tidb-lightning ...timeout(2s)!
Starting component `tidb-lightning`: /home/tidb/.tiup/components/tidb-lightning/v7.4.0/tidb-lightning -config tidb-lightning-test.toml
Verbose debug logs will be written to tidb-lightning.log
 
+---+----------------------------------------------+-------------+--------+
| # | CHECK ITEM                                   | TYPE        | PASSED |
+---+----------------------------------------------+-------------+--------+
| 1 | Source csv files size is proper              | performance | true   |
+---+----------------------------------------------+-------------+--------+
| 2 | the checkpoints are valid                    | critical    | true   |
+---+----------------------------------------------+-------------+--------+
| 3 | Cluster version check passed                 | critical    | true   |
+---+----------------------------------------------+-------------+--------+
| 4 | Lightning has the correct storage permission | critical    | true   |
+---+----------------------------------------------+-------------+--------+
 
tidb lightning exit successfully
 
real    11m14.471s
user    1m32.519s
sys     0m52.599s

For details on each migration tool, see:

Dumpling:https://docs.pingcap.com/zh/tidb/stable/dumpling-overview/

Lightning:https://docs.pingcap.com/zh/tidb/stable/tidb-lightning-overview/

DM:https://docs-archive.pingcap.com/zh/tidb-data-migration/v2.0/


Copyright notice: This is an original article by a TiDB community user, released under the CC BY-NC-SA 4.0 license. When reposting, please include a link to the original and this notice.
