【Original】Yes
【First published on】TiDB Community
Problem Background
We ran into the slow SQL below. The query plan uses an efficient unique index and only 10 rows satisfy the condition, yet the query takes 10 seconds.
mysql> select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name like 'dir_abc4999/%' order by name asc limit 501;
+----------------------------------+--------+-------+----------------------------------+
| name | uid | size | etag |
+----------------------------------+--------+-------+----------------------------------+
| dir_abc4999/obj_20211027_1_0.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_1.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_2.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_3.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_4.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_5.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_6.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_7.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_8.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_9.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
+----------------------------------+--------+-------+----------------------------------+
10 rows in set (10.08 sec)
The table has a unique composite index that fully covers the filter columns bucket_id and name. Judging from the table structure and the SQL statement, the first impression is that this query should be very fast, yet we got the result above.
mysql> show create table objects\G
*************************** 1. row ***************************
Table: objects
Create Table: CREATE TABLE `objects` (
`id` bigint(20) NOT NULL /*T![auto_rand] AUTO_RANDOM(5) */,
`bucket_id` varchar(192) NOT NULL,
`name` varchar(512) NOT NULL,
`version_id` varchar(64) NOT NULL,
...
...
PRIMARY KEY (`id`),
UNIQUE KEY `idx` (`bucket_id`,`name`,`version_id`),
KEY `cluster_id` (`cluster_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T![auto_rand_base] AUTO_RANDOM_BASE=2313570001 */
1 row in set (0.00 sec)
Investigation
The query plan looks fine at first glance: a range scan on the composite index, followed by a table lookup to return the result rows. In theory this is a very good plan, so where is the problem?
mysql> explain select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name like 'dir_abc4999/%' order by name asc limit 501\G
*************************** 1. row ***************************
id: Projection_7
estRows: 501.00
task: root
access object:
operator info: ogw.objects.name, ogw.objects.uid, ogw.objects.size, ogw.objects.etag
*************************** 2. row ***************************
id: └─Projection_35
estRows: 501.00
task: root
access object:
operator info: ogw.objects.bucket_id, ogw.objects.name, ogw.objects.uid, ogw.objects.size, ogw.objects.etag
*************************** 3. row ***************************
id: └─IndexLookUp_34
estRows: 501.00
task: root
access object:
operator info: limit embedded(offset:0, count:501)
*************************** 4. row ***************************
id: ├─Limit_33(Build)
estRows: 501.00
task: cop[tikv]
access object:
operator info: offset:0, count:501
*************************** 5. row ***************************
id: │ └─Selection_32
estRows: 501.00
task: cop[tikv]
access object:
operator info: like(ogw.objects.name, "dir_abc4999/%", 92)
*************************** 6. row ***************************
id: │ └─IndexRangeScan_30
estRows: 501.00
task: cop[tikv]
access object: table:objects, index:idx(bucket_id, name, version_id)
operator info: range:(".bucket.meta.cmu-bucket01" "dir",".bucket.meta.cmu-bucket01" "dis"), keep order:true
*************************** 7. row ***************************
id: └─TableRowIDScan_31(Probe)
estRows: 501.00
task: cop[tikv]
access object: table:objects
operator info: keep order:false, stats:pseudo
7 rows in set (0.00 sec)
The only odd thing is row 6 of the plan above, IndexRangeScan_30: the scan range derived from the like predicate is ("dir", "dis"), whereas it should be ["dir_abc4999/", "dir_abc49990"] (since '0' is the character right after '/'; see the quick check below). The number of rows falling into the ("dir", "dis") range turns out to be over 200 million, no wonder the query is so slow.
mysql> select count(1) from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name >= 'dir' and name < 'dis';
+-----------+
| count(1) |
+-----------+
| 233596134 |
+-----------+
1 row in set (7.21 sec)
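As an aside, the expected upper bound 'dir_abc49990' comes from incrementing the last character of the literal prefix, because '0' (ASCII 48) is the character immediately after '/' (ASCII 47). A trivial check, not tied to this cluster, confirms the ordering:
mysql> select ascii('/'), ascii('0');
+------------+------------+
| ascii('/') | ascii('0') |
+------------+------------+
|         47 |         48 |
+------------+------------+
1 row in set (0.00 sec)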
At this point it is hard not to suspect TiDB's planner: it seems to have chosen the wrong index range, scanning far more keys than necessary. Could this be a bug in TiDB's plan generation?
Rewrite the SQL: replacing the problematic statement with what we took to be an equivalent one, the query now takes 0.00 seconds!!
mysql> select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name >= 'dir_abc4999/' and name <=
'dir_abc49990' order by name asc limit 501;
+----------------------------------+--------+-------+----------------------------------+
| name | uid | size | etag |
+----------------------------------+--------+-------+----------------------------------+
| dir_abc4999/obj_20211027_1_0.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_1.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_2.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_3.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_4.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_5.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_6.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_7.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_8.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_9.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
+----------------------------------+--------+-------+----------------------------------+
10 rows in set (0.00 sec)
Looking at the rewritten SQL's execution plan, the plan shape is the same, but the range on the composite index's name column is now ["dir_abc4999/", "dir_abc49990"], and only 10 keys are scanned.
mysql> explain select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name >= 'dir_abc4999/' and
name <= 'dir_abc49990' order by name asc limit 501\G
*************************** 1. row ***************************
id: Projection_7
estRows: 1.00
task: root
access object:
operator info: ogw.objects.name, ogw.objects.uid, ogw.objects.size, ogw.objects.etag
*************************** 2. row ***************************
id: └─Projection_32
estRows: 1.00
task: root
access object:
operator info: ogw.objects.bucket_id, ogw.objects.name, ogw.objects.uid, ogw.objects.size, ogw.objects.etag
*************************** 3. row ***************************
id: └─IndexLookUp_31
estRows: 1.00
task: root
access object:
operator info: limit embedded(offset:0, count:501)
*************************** 4. row ***************************
id: ├─Limit_30(Build)
estRows: 1.00
task: cop[tikv]
access object:
operator info: offset:0, count:501
*************************** 5. row ***************************
id: │ └─IndexRangeScan_28
estRows: 1.00
task: cop[tikv]
access object: table:objects, index:idx(bucket_id, name, version_id)
operator info: range:[".bucket.meta.cmu-bucket01" "dir_abc4999/",".bucket.meta.cmu-bucket01" "dir_abc49990"], keep order:true
*************************** 6. row ***************************
id: └─TableRowIDScan_29(Probe)
estRows: 1.00
task: cop[tikv]
access object: table:objects
operator info: keep order:false, stats:pseudo
6 rows in set (0.00 sec)
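If you want the actual rather than estimated number of scanned keys, TiDB's explain analyze executes the statement and prints runtime statistics (such as actRows per operator) alongside the plan. A sketch of the check, output omitted here since it varies per run:
mysql> explain analyze select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name >= 'dir_abc4999/' and name <= 'dir_abc49990' order by name asc limit 501;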
Now it looks like we have found the cause: TiDB derives an imprecise range for the like operator, making the plan scan far too much data. TiDB must have a bug here, so let's just rewrite this like SQL into the second form to work around it!!
The Real Cause
In SQL's LIKE syntax, the '_' character is, surprisingly, a wildcard!!
With LIKE you can use the following two wildcard characters in the pattern:
% matches any number of characters, even zero characters.
_ matches exactly one character.
This is the MySQL official documentation's description of the LIKE syntax: link
In other words, in the condition name like 'dir_abc4999/%', the '_' can match any single character, so the literal prefix is only "dir", and the range TiDB chose is exactly correct!
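A quick check (a minimal illustration with a made-up string) confirms that '_' in the pattern matches any single character, not just a literal underscore:
mysql> select 'dirXabc4999/obj.txt' like 'dir_abc4999/%' as wildcard_match;
+----------------+
| wildcard_match |
+----------------+
|              1 |
+----------------+
1 row in set (0.00 sec)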
The application mistakenly used '_' for exact matching, a misuse of the special characters in LIKE syntax.
Escaping the '_' also solves the problem:
mysql> select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name like 'dir\_abc4999/%' order by name asc limit 501;
+----------------------------------+--------+-------+----------------------------------+
| name | uid | size | etag |
+----------------------------------+--------+-------+----------------------------------+
| dir_abc4999/obj_20211027_1_0.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_1.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_2.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_3.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_4.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_5.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_6.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_7.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_8.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
| dir_abc4999/obj_20211027_1_9.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |
+----------------------------------+--------+-------+----------------------------------+
10 rows in set (0.00 sec)
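To double-check the escape (again a minimal illustration with made-up strings), the escaped \_ now matches only a literal underscore, while an unescaped '_' would also have accepted any other character in that position:
mysql> select 'dir_abc4999/obj.txt' like 'dir\_abc4999/%' as literal_underscore,
    ->        'dirXabc4999/obj.txt' like 'dir\_abc4999/%' as other_char;
+--------------------+------------+
| literal_underscore | other_char |
+--------------------+------------+
|                  1 |          0 |
+--------------------+------------+
1 row in set (0.00 sec)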