问题背景:
通过granfan-->pd-->region health 图表监控到empty region 较多,且此集群目前业务并不多,数据量也很小,初始化时只创建了较表。
问题排查:
查看region merge相关参数:
l max-merge-region-size:
##控制region merge的大小上限,当region大于指定值时pd不会将相邻的region合并(默认20)
l max-merge-region-keys:
##控制region merge的key上限,当region大于指定值时pd不会将相邻的region合并(默认2000)
l merge-schedule-limit:
##同时进行region merge调度的任务数,设置为则关闭region merge(默认值8)
如下,这些参数都是默认值,
$ pd-ctl config show max-merge-region-size |grep max-merge-region-size "max-merge-region-size": 20, $ pd-ctl config show max-merge-region-keys |grep max-merge-region-keys "max-merge-region-keys": 200000, $ pd-ctl config show merge-schedule-limit |grep merge-schedule-limit "merge-schedule-limit": 8, |
查看region的split和merge频率相关参数,
l patrol-region-interval:
##控制replicaChecker检查region监控状态的运行频率,越短则运行越快(10ms)
l split-merge-interval:
##控制对同一个region做split和merge的操作间隔,对应新split的region一段时间不会merge(默认1h)
$ pd-ctl config show patrol-region-interval |grep patrol "patrol-region-interval": "100ms", $ pd-ctl config show split-merge-interval |grep split "split-merge-interval": "1h0m0s", |
测试ddl时与分配的region是否会merge:
突然想到,空regions多的集群在ddl时使用了region预分配参数,查看相关表时只有1个region,在社区咨询老师说是被merge了。
创建测试表并预分配region,查看region数是指定的:
mysql> create table guo004(id int, name varchar(10)) SHARD_ROW_ID_BITS=4 PRE_SPLIT_REGIONS=4; Query OK, 0 rows affected (0.51 sec) mysql> show table guo004 regions; +-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+ | REGION_ID | START_KEY | END_KEY | LEADER_ID | LEADER_STORE_ID | PEERS | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS | +-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+ | 25373 | t_437_ | t_437_r_576460752303423488 | 25375 | 5 | 25374, 25375, 25376 | 0 | 341 | 0 | 1 | 0 | | 25377 | t_437_r_576460752303423488 | t_437_r_1152921504606846976 | 25379 | 5 | 25378, 25379, 25380 | 0 | 29 | 0 | 1 | 0 | | 25381 | t_437_r_1152921504606846976 | t_437_r_1729382256910270464 | 25383 | 5 | 25382, 25383, 25384 | 0 | 0 | 0 | 1 | 0 | | 25385 | t_437_r_1729382256910270464 | t_437_r_2305843009213693952 | 25387 | 5 | 25386, 25387, 25388 | 0 | 0 | 0 | 1 | 0 | | 25389 | t_437_r_2305843009213693952 | t_437_r_2882303761517117440 | 25391 | 5 | 25390, 25391, 25392 | 0 | 29 | 0 | 1 | 0 | | 25393 | t_437_r_2882303761517117440 | t_437_r_3458764513820540928 | 25395 | 5 | 25394, 25395, 25396 | 0 | 0 | 0 | 1 | 0 | | 25397 | t_437_r_3458764513820540928 | t_437_r_4035225266123964416 | 25399 | 5 | 25398, 25399, 25400 | 0 | 29 | 0 | 1 | 0 | | 25401 | t_437_r_4035225266123964416 | t_437_r_4611686018427387904 | 25403 | 5 | 25402, 25403, 25404 | 0 | 0 | 0 | 1 | 0 | | 25405 | t_437_r_4611686018427387904 | t_437_r_5188146770730811392 | 25407 | 5 | 25406, 25407, 25408 | 0 | 29 | 0 | 1 | 0 | | 25409 | t_437_r_5188146770730811392 | t_437_r_5764607523034234880 | 25411 | 5 | 25410, 25411, 25412 | 0 | 0 | 0 | 1 | 0 | | 25413 | t_437_r_5764607523034234880 | t_437_r_6341068275337658368 | 25415 | 5 | 25414, 25415, 25416 | 0 | 29 | 0 | 1 | 0 | | 25417 | t_437_r_6341068275337658368 | t_437_r_6917529027641081856 | 25419 | 5 | 25418, 25419, 25420 | 0 | 29 | 0 | 1 | 0 | | 25421 | t_437_r_6917529027641081856 | t_437_r_7493989779944505344 | 25423 | 5 | 25422, 25423, 25424 | 0 | 0 | 0 | 1 | 0 | | 25425 | t_437_r_7493989779944505344 | t_437_r_8070450532247928832 | 25427 | 5 | 25426, 25427, 25428 | 0 | 29 | 0 | 1 | 0 | | 25429 | t_437_r_8070450532247928832 | t_437_r_8646911284551352320 | 25431 | 5 | 25430, 25431, 25432 | 0 | 0 | 0 | 1 | 0 | | 25169 | t_437_r_8646911284551352320 | | 25171 | 5 | 25170, 25171, 25172 | 0 | 0 | 0 | 1 | 0 | +-----------+-----------------------------+-----------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+ 16 rows in set (0.01 sec) |
检查预分配region是否merge
根据以上split-merge-interval参数,1小时后测试表的guo004空regions将被回收:
mysql> mysql> show table guo004 regions; +-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+ | REGION_ID | START_KEY | END_KEY | LEADER_ID | LEADER_STORE_ID | PEERS | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS | +-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+ | 25425 | t_437_ | | 25427 | 5 | 25426, 25427, 25428 | 0 | 1285 | 0 | 1 | 0 | +-----------+-----------+---------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+ 1 row in set (0.01 sec) |
即使是ddl预分配的region,如果没有regions为空的话,都会自动merge.
查看空regions 信息:
参数说明:
approximate_size:估算的region的数据量大小,单位为MB
approximate_keys:估算的region内的key的个数
注意关注approximate_size、approximate_keys 的值。
查看所有空region信息:
$ ./pd-ctl region check empty-region |more
$ ./pd-ctl region check empty-region |more { "count": 272, "regions": [ { "id": 202, "start_key": "7480000000000000FF5C00000000000000F8", "end_key": "7480000000000000FF5E00000000000000F8", "epoch": { "conf_ver": 71639, "version": 43 }, "peers": [ { "id": 7796112, "store_id": 21 }, { "id": 7952908, "store_id": 2 }, { "id": 7955094, "store_id": 19 } ], "leader": { "id": 7796112, "store_id": 21 }, "written_bytes": 0, "read_bytes": 0, "written_keys": 0, "read_keys": 0, "approximate_size": 1, "approximate_keys": 0 }, { "id": 217800, "start_key": "7480000000000002FF8500000000000000F8", "end_key": "7480000000000002FF8700000000000000F8", "epoch": { "conf_ver": 71369, "version": 290 }, "peers": [ { "id": 7932728, "store_id": 3 }, { "id": 7954710, "store_id": 1 }, { "id": 7955013, "store_id": 5 } ], "leader": { "id": 7932728, "store_id": 3 }, "written_bytes": 0, "read_bytes": 0, "written_keys": 0, "read_keys": 0, "approximate_size": 1, "approximate_keys": 0 }, --More-- |
查看单个region信息:
$ ./pd-ctl region 217800
$ ./pd-ctl region 217800 { "id": 217800, "start_key": "7480000000000002FF8500000000000000F8", "end_key": "7480000000000002FF8700000000000000F8", "epoch": { "conf_ver": 71369, "version": 290 }, "peers": [ { "id": 7932728, "store_id": 3 }, { "id": 7954710, "store_id": 1 }, { "id": 7955013, "store_id": 5 } ], "leader": { "id": 7932728, "store_id": 3 }, "written_bytes": 0, "read_bytes": 0, "written_keys": 0, "read_keys": 0, "approximate_size": 1, "approximate_keys": 0 } |
或者通过系统表TIKV_REGION_STATUS查看:
(我个人更倾向于查系统表,比对的可视度更强些):
mysql> select REGION_ID,START_KEY,END_KEY,TABLE_ID,DB_NAME,TABLE_NAME,APPROXIMATE_SIZE,APPROXIMATE_KEYS from TIKV_REGION_STATUS where DB_NAME='dzdz' order by START_KEY limit 30; +-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+ | REGION_ID | START_KEY | END_KEY | TABLE_ID | DB_NAME | TABLE_NAME | APPROXIMATE_SIZE | APPROXIMATE_KEYS | +-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+ | 217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 | 429 | dzdz | qrtz_blob_triggers | 1 | 0 | | 217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 | 429 | dzdz | qrtz_blob_triggers | 1 | 0 | | 217152 | 7480000000000001FFAD00000000000000F8 | 7480000000000001FFAF00000000000000F8 | 429 | dzdz | qrtz_blob_triggers | 1 | 0 | | 217156 | 7480000000000001FFAF00000000000000F8 | 7480000000000001FFB100000000000000F8 | 431 | dzdz | qrtz_calendars | 1 | 0 | | 217156 | 7480000000000001FFAF00000000000000F8 | 7480000000000001FFB100000000000000F8 | 431 | dzdz | qrtz_calendars | 1 | 0 | | 217160 | 7480000000000001FFB100000000000000F8 | 7480000000000001FFB300000000000000F8 | 433 | dzdz | qrtz_cron_triggers | 1 | 0 | | 217160 | 7480000000000001FFB100000000000000F8 | 7480000000000001FFB300000000000000F8 | 433 | dzdz | qrtz_cron_triggers | 1 | 0 | | 217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 | 435 | dzdz | qrtz_fired_triggers | 1 | 0 | | 217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 | 435 | dzdz | qrtz_fired_triggers | 1 | 0 | | 217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 | 435 | dzdz | qrtz_fired_triggers | 1 | 0 | | 217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 | 435 | dzdz | qrtz_fired_triggers | 1 | 0 | | 217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 | 435 | dzdz | qrtz_fired_triggers | 1 | 0 | | 217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 | 435 | dzdz | qrtz_fired_triggers | 1 | 0 | | 217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 | 435 | dzdz | qrtz_fired_triggers | 1 | 0 | | 217164 | 7480000000000001FFB300000000000000F8 | 7480000000000001FFB500000000000000F8 | 435 | dzdz | qrtz_fired_triggers | 1 | 0 | | 217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 | 437 | dzdz | qrtz_job_details | 1 | 0 | | 217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 | 437 | dzdz | qrtz_job_details | 1 | 0 | | 217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 | 437 | dzdz | qrtz_job_details | 1 | 0 | | 217168 | 7480000000000001FFB500000000000000F8 | 7480000000000001FFB700000000000000F8 | 437 | dzdz | qrtz_job_details | 1 | 0 | | 217172 | 7480000000000001FFB700000000000000F8 | 7480000000000001FFB900000000000000F8 | 439 | dzdz | qrtz_locks | 1 | 0 | | 217172 | 7480000000000001FFB700000000000000F8 | 7480000000000001FFB900000000000000F8 | 439 | dzdz | qrtz_locks | 1 | 0 | | 217176 | 7480000000000001FFB900000000000000F8 | 7480000000000001FFBB00000000000000F8 | 441 | dzdz | qrtz_paused_trigger_grps | 1 | 0 | | 217176 | 7480000000000001FFB900000000000000F8 | 7480000000000001FFBB00000000000000F8 | 441 | dzdz | qrtz_paused_trigger_grps | 1 | 0 | | 217180 | 7480000000000001FFBB00000000000000F8 | 7480000000000001FFBD00000000000000F8 | 443 | dzdz | qrtz_scheduler_state | 10 | 20343 | | 217180 | 7480000000000001FFBB00000000000000F8 | 7480000000000001FFBD00000000000000F8 | 443 | dzdz | qrtz_scheduler_state | 10 | 20343 | | 217184 | 7480000000000001FFBD00000000000000F8 | 7480000000000001FFBF00000000000000F8 | 445 | dzdz | qrtz_simple_triggers | 1 | 0 | | 217184 | 7480000000000001FFBD00000000000000F8 | 7480000000000001FFBF00000000000000F8 | 445 | dzdz | qrtz_simple_triggers | 1 | 0 | | 217188 | 7480000000000001FFBF00000000000000F8 | 7480000000000001FFC100000000000000F8 | 447 | dzdz | qrtz_simprop_triggers | 1 | 0 | | 217188 | 7480000000000001FFBF00000000000000F8 | 7480000000000001FFC100000000000000F8 | 447 | dzdz | qrtz_simprop_triggers | 1 | 0 | | 217192 | 7480000000000001FFC100000000000000F8 | 7480000000000001FFC300000000000000F8 | 449 | dzdz | qrtz_triggers | 1 | 0 | +-----------+--------------------------------------+--------------------------------------+----------+---------+--------------------------+------------------+------------------+ 30 rows in set (0.01 sec) mysql> |
通过以上信息显示一个regions的END_KEY 是另一个region的START_KEY, 结合APPROXIMATE_SIZE和APPROXIMATE_KEYS字段的值,可知部分空region是连续的。
检查跨表合并是否启用
l enable-cross-table-merge:
##设置是否开启跨表merge(默认v4.0:false,v6.0:true)
$ ./pd-ctl config show enable-cross-table-merge |grep cross "enable-cross-table-merge": "false", $ 如上跨表合并未启用 |
测试启用跨表合并是否能降低空region数:
检查当前空region数:
测试环境的empty region 也比较多,如下图:
启用跨表合并:
确认夸表合并未启用: » config show { "replication": { "enable-placement-rules": "true", "location-labels": "dc,host", "max-replicas": 3, "strictly-match-label": "false" }, "schedule": { "enable-cross-table-merge": "false", ...... } } 启用跨表合并: » config set enable-cross-table-merge true Success!
确认跨表合并启用: » config show { "replication": { "enable-placement-rules": "true", "location-labels": "dc,host", "max-replicas": 3, "strictly-match-label": "false" }, "schedule": { "enable-cross-table-merge": "true", ...... } } » |
启用跨表何必后空region数:
在grafana上看region health状态,empty region 数量已经为0:
问题处理:
启用跨表合并,可以减少或至0空region数据量
操作命令:$ ./pd-ctl config set enable-cross-table-merge true 即可。
综上所述:
通过granfan-->pd-->region health 图表监控到empty region 较多:
查看region merge大小相关参数:
l max-merge-region-size:
##控制region merge的大小上限,当region大于指定值时pd不会将相邻的region合并(默认20)
l max-merge-region-keys:
##控制region merge的key上限,当region大于指定值时pd不会将相邻的region合并(默认2000)
l merge-schedule-limit:
##同时进行region merge调度的任务数,设置为则关闭region merge(默认值8)
如果以上参数设置太小或不合理可以适当调整让region加速merge
如果以上参数设置合理(一般默认值就可以),请检查region merge频率参数
查看region的split和merge频率相关参数,
l patrol-region-interval:
##控制replicaChecker检查region监控状态的运行频率,越短则运行越快(10ms)
l split-merge-interval:
##控制对同一个region做split和merge的操作间隔,对应新split的region一段时间不会merge(默认1h)
如果以上参数设置过大或不合理可适当调小,但也不建议过小。
如果设置合理(一般默认值就可以),请检查跨表合并参数。
检查跨表合并是否启用
l enable-cross-table-merge:
##设置是否开启跨表merge(默认v4.0:false,v5.0及其后版本:true)
跨表合并参数v5.0及其后版本默认已经启用,使用v5.0及其后版本的应该不太会出现这个问题。
使用v4.0、小于v5.0的版本或enable-cross-table-merge 未启用情况下可能会出现empty region较多。
一般多DDL on table情况下, enable-cross-table-merge结果较为明显。