TiDB 与MySQL优化器在特定语句下执行效果对比(二)
一、引言
Index join 是表关联的常用连接方式,笔者在近期学习TiDB 的index join 过程中,发现TiDB的优化器在选择表连接策略的过程中,在某些场景下并不能选择index join 作为表的连接方式,这一点与MySQL的优化器的选择策略是不同的,下面笔者就以几个例子来展现下这种差异:
备注:笔者测试语句的表数据来源均来自tpch,可以由tiup bench tpch ...导入生成这些数据
测试的TiDB版本为6.5.3,mysql 版本为8.0.30
二、测试案例1关联条件有函数
该测试案例的信息来源于 https://asktug.com/t/topic/1013277?replies_to_post_number=9 该帖子
对应的表结构
mysql> show create table customer \G
*************************** 1. row ***************************
Table: customer
Create Table: CREATE TABLE `customer` (
`C_CUSTKEY` bigint(20) NOT NULL,
`C_NAME` varchar(25) NOT NULL,
`C_ADDRESS` varchar(40) NOT NULL,
`C_NATIONKEY` bigint(20) NOT NULL,
`C_PHONE` char(15) NOT NULL,
`C_ACCTBAL` decimal(15,2) NOT NULL,
`C_MKTSEGMENT` char(10) NOT NULL,
`C_COMMENT` varchar(117) NOT NULL,
PRIMARY KEY (`C_CUSTKEY`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)
mysql> show create table orders \G
*************************** 1. row ***************************
Table: orders
Create Table: CREATE TABLE `orders` (
`O_ORDERKEY` bigint(20) NOT NULL,
`O_CUSTKEY` bigint(20) NOT NULL,
`O_ORDERSTATUS` char(1) NOT NULL,
`O_TOTALPRICE` decimal(15,2) NOT NULL,
`O_ORDERDATE` date NOT NULL,
`O_ORDERPRIORITY` char(15) NOT NULL,
`O_CLERK` char(15) NOT NULL,
`O_SHIPPRIORITY` bigint(20) NOT NULL,
`O_COMMENT` varchar(79) NOT NULL,
PRIMARY KEY (`O_ORDERKEY`) /*T![clustered_index] CLUSTERED */,
KEY `index_O_TOTALPRICE` (`O_TOTALPRICE`),
KEY `index_a` (`O_ORDERSTATUS`,`O_TOTALPRICE`),
KEY `indexO_CUSTKEY` (`O_CUSTKEY`),
KEY `index_date` (`O_ORDERDATE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)
原始语句及其执行计划
mysql> explain analyze select /*+ INL_JOIN(a,b) */ count(*) from customer a,orders b where a.c_phone='25-989-741-2988' and a.c_custkey=b.o_custkey and a.C_COMMENT=b.O_COMMENT;
+------------------------------------+-----------+---------+-----------+------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------------+-----------+---------+-----------+------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
| StreamAgg_9 | 1.00 | 1 | root | | time:901.5ms, loops:2 | funcs:count(1)->Column#18 | 8 Bytes | N/A |
| └─IndexJoin_39 | 1.05 | 0 | root | | time:901.5ms, loops:1, inner:{total:183.6ms, concurrency:5, task:1, construct:247.6µs, fetch:183.2ms, build:61µs}, probe:41.2µs | inner join, inner:IndexLookUp_38, outer key:tpch2.customer.c_custkey, inner key:tpch2.orders.o_custkey, equal cond:eq(tpch2.customer.c_comment, tpch2.orders.o_comment), eq(tpch2.customer.c_custkey, tpch2.orders.o_custkey) | 65.6 KB | N/A |
| ├─TableReader_33(Build) | 1.04 | 1 | root | | time:717.2ms, loops:3, cop_task: {num: 1, max: 717.2ms, proc_keys: 150000, tot_proc: 714ms, rpc_num: 1, rpc_time: 717.1ms, copr_cache: disabled, distsql_concurrency: 15} | data:Selection_32 | 423 Bytes | N/A |
| │ └─Selection_32 | 1.04 | 1 | cop[tikv] | | tikv_task:{time:714ms, loops:151}, scan_detail: {total_process_keys: 150000, total_process_keys_size: 30533765, total_keys: 150001, get_snapshot_time: 647.8µs, rocksdb: {key_skipped_count: 150000, block: {cache_hit_count: 5, read_count: 495, read_byte: 11.1 MB, read_time: 345.8ms}}} | eq(tpch2.customer.c_phone, "25-989-741-2988") | N/A | N/A |
| │ └─TableFullScan_31 | 150000.00 | 150000 | cop[tikv] | table:a | tikv_task:{time:694ms, loops:151} | keep order:false | N/A | N/A |
| └─IndexLookUp_38(Probe) | 1.05 | 6 | root | | time:183ms, loops:2, index_task: {total_time: 1.84ms, fetch_handle: 1.83ms, build: 670ns, wait: 1.09µs}, table_task: {total_time: 181.1ms, num: 1, concurrency: 5}, next: {wait_index: 2.01ms, wait_table_lookup_build: 0s, wait_table_lookup_resp: 181ms} | | 9.34 KB | N/A |
| ├─IndexRangeScan_36(Build) | 1.05 | 6 | cop[tikv] | table:b, index:indexO_CUSTKEY(O_CUSTKEY) | time:1.83ms, loops:3, cop_task: {num: 1, max: 1.79ms, proc_keys: 6, rpc_num: 1, rpc_time: 1.77ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{time:1ms, loops:1}, scan_detail: {total_process_keys: 6, total_process_keys_size: 276, total_keys: 7, get_snapshot_time: 888.5µs, rocksdb: {key_skipped_count: 6, block: {cache_hit_count: 2}}} | range: decided by [eq(tpch2.orders.o_custkey, tpch2.customer.c_custkey)], keep order:false | N/A | N/A |
| └─TableRowIDScan_37(Probe) | 1.05 | 6 | cop[tikv] | table:b | time:181ms, loops:2, cop_task: {num: 3, max: 85.9ms, min: 35.3ms, avg: 60.3ms, p95: 85.9ms, max_proc_keys: 2, p95_proc_keys: 2, tot_proc: 174ms, rpc_num: 3, rpc_time: 180.8ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{proc max:83ms, min:34ms, avg: 58.7ms, p80:83ms, p95:83ms, iters:3, tasks:3}, scan_detail: {total_process_keys: 6, total_process_keys_size: 929, total_keys: 6, get_snapshot_time: 2.35ms, rocksdb: {block: {cache_hit_count: 9, read_count: 18, read_byte: 3.76 MB, read_time: 170.7ms}}} | keep order:false | N/A | N/A |
+------------------------------------+-----------+---------+-----------+------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
8 rows in set (0.94 sec)
从该执行计划来看,还是可以走正常的index join执行计划,但是如果把关联条件加个函数呢?语句如下
select count(*) from customer a,orders b where a.c_phone='25-989-741-2988' and a.c_custkey=b.o_custkey and substr(a.C_COMMENT,1,10)=substr(b.O_COMMENT,1,10);
再跑一遍执行计划
mysql> explain analyze select count(*) from customer a,orders b where a.c_phone='25-989-741-2988' and a.c_custkey=b.o_custkey and substr(a.C_COMMENT,1,10)=substr(b.O_COMMENT,1,10);
+----------------------------------+------------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+-----------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+----------------------------------+------------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+-----------+---------+
| StreamAgg_11 | 1.00 | 1 | root | | time:1.51s, loops:2 | funcs:count(1)->Column#18 | 8 Bytes | N/A |
| └─HashJoin_22 | 1.05 | 0 | root | | time:1.51s, loops:1, build_hash_table:{total:154.5ms, fetch:154.5ms, build:5.21µs}, probe:{concurrency:5, total:7.54s, max:1.51s, probe:190.6ms, fetch:7.35s} | inner join, equal:[eq(tpch2.customer.c_custkey, tpch2.orders.o_custkey) eq(Column#19, Column#20)] | 25.7 KB | 0 Bytes |
| ├─Projection_14(Build) | 1.04 | 1 | root | | time:154.4ms, loops:2, Concurrency:OFF | tpch2.customer.c_custkey, substr(tpch2.customer.c_comment, 1, 10)->Column#19 | 40.7 KB | N/A |
| │ └─TableReader_17 | 1.04 | 1 | root | | time:154.4ms, loops:2, cop_task: {num: 1, max: 169.2ms, proc_keys: 150000, tot_proc: 167ms, rpc_num: 1, rpc_time: 169.2ms, copr_cache: disabled, distsql_concurrency: 15} | data:Selection_16 | 407 Bytes | N/A |
| │ └─Selection_16 | 1.04 | 1 | cop[tikv] | | tikv_task:{time:167ms, loops:151}, scan_detail: {total_process_keys: 150000, total_process_keys_size: 30533765, total_keys: 150001, get_snapshot_time: 914µs, rocksdb: {key_skipped_count: 150000, block: {cache_hit_count: 500}}} | eq(tpch2.customer.c_phone, "25-989-741-2988") | N/A | N/A |
| │ └─TableFullScan_15 | 150000.00 | 150000 | cop[tikv] | table:a | tikv_task:{time:139ms, loops:151} | keep order:false | N/A | N/A |
| └─Projection_18(Probe) | 1498900.00 | 1498900 | root | | time:1.36s, loops:1469, Concurrency:5 | tpch2.orders.o_custkey, substr(tpch2.orders.o_comment, 1, 10)->Column#20 | 486.3 KB | N/A |
| └─TableReader_20 | 1498900.00 | 1498900 | root | | time:855.3ms, loops:1469, cop_task: {num: 55, max: 184.1ms, min: 1.64ms, avg: 61.8ms, p95: 171.1ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 1.38s, tot_wait: 16ms, rpc_num: 55, rpc_time: 3.4s, copr_cache: disabled, distsql_concurrency: 15} | data:TableFullScan_19 | 6.18 MB | N/A |
| └─TableFullScan_19 | 1498900.00 | 1498900 | cop[tikv] | table:b | tikv_task:{proc max:62ms, min:0s, avg: 20.4ms, p80:38ms, p95:48ms, iters:1682, tasks:55}, scan_detail: {total_process_keys: 1498900, total_process_keys_size: 227047584, total_keys: 1498955, get_snapshot_time: 6.04ms, rocksdb: {key_skipped_count: 1498900, block: {cache_hit_count: 3882}}} | keep order:false | N/A | N/A |
+----------------------------------+------------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+-----------+---------+
9 rows in set (1.52 sec)
从上面的执行计划可以看到,执行计划已经发生了改变,变成了hash join,那么如果利用sql hint 来强行指定index join的使用是否可行呢?
mysql> explain analyze select /*+ INL_JOIN(a,b) */ count(*) from customer a,orders b where a.c_phone='25-989-741-2988' and a.c_custkey=b.o_custkey and substr(a.C_COMMENT,1,10)=substr(b.O_COMMENT,1,10);
+----------------------------------+------------+---------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+-----------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+----------------------------------+------------+---------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+-----------+---------+
| StreamAgg_11 | 1.00 | 1 | root | | time:1.41s, loops:2 | funcs:count(1)->Column#18 | 8 Bytes | N/A |
| └─HashJoin_22 | 1.05 | 0 | root | | time:1.41s, loops:1, build_hash_table:{total:84.2ms, fetch:84.2ms, build:6.54µs}, probe:{concurrency:5, total:7.03s, max:1.41s, probe:176.7ms, fetch:6.85s} | inner join, equal:[eq(tpch2.customer.c_custkey, tpch2.orders.o_custkey) eq(Column#19, Column#20)] | 25.7 KB | 0 Bytes |
| ├─Projection_14(Build) | 1.04 | 1 | root | | time:84ms, loops:2, Concurrency:OFF | tpch2.customer.c_custkey, substr(tpch2.customer.c_comment, 1, 10)->Column#19 | 40.7 KB | N/A |
| │ └─TableReader_17 | 1.04 | 1 | root | | time:84ms, loops:2, cop_task: {num: 1, max: 84.6ms, proc_keys: 150000, tot_proc: 81ms, rpc_num: 1, rpc_time: 84.5ms, copr_cache: disabled, distsql_concurrency: 15} | data:Selection_16 | 403 Bytes | N/A |
| │ └─Selection_16 | 1.04 | 1 | cop[tikv] | | tikv_task:{time:81ms, loops:151}, scan_detail: {total_process_keys: 150000, total_process_keys_size: 30533765, total_keys: 150001, get_snapshot_time: 755.9µs, rocksdb: {key_skipped_count: 150000, block: {cache_hit_count: 503}}} | eq(tpch2.customer.c_phone, "25-989-741-2988") | N/A | N/A |
| │ └─TableFullScan_15 | 150000.00 | 150000 | cop[tikv] | table:a | tikv_task:{time:71ms, loops:151} | keep order:false | N/A | N/A |
| └─Projection_18(Probe) | 1498900.00 | 1498900 | root | | time:1.32s, loops:1468, Concurrency:5 | tpch2.orders.o_custkey, substr(tpch2.orders.o_comment, 1, 10)->Column#20 | 481.3 KB | N/A |
| └─TableReader_20 | 1498900.00 | 1498900 | root | | time:807.3ms, loops:1468, cop_task: {num: 55, max: 209.3ms, min: 1.55ms, avg: 64.1ms, p95: 171ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 1.41s, tot_wait: 7ms, rpc_num: 55, rpc_time: 3.52s, copr_cache: disabled, distsql_concurrency: 15} | data:TableFullScan_19 | 9.26 MB | N/A |
| └─TableFullScan_19 | 1498900.00 | 1498900 | cop[tikv] | table:b | tikv_task:{proc max:59ms, min:0s, avg: 20ms, p80:44ms, p95:54ms, iters:1682, tasks:55}, scan_detail: {total_process_keys: 1498900, total_process_keys_size: 227047584, total_keys: 1498955, get_snapshot_time: 3.97ms, rocksdb: {key_skipped_count: 1498900, block: {cache_hit_count: 4047}}} | keep order:false | N/A | N/A |
+----------------------------------+------------+---------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+-----------+---------+
9 rows in set, 2 warnings (1.41 sec)
从执行计划来看,还是没有走index join,还是选择了hash join,所以根据执行结果来看,tidb在对这种关联条件使用了函数的情况下是无法走index join的,这一点需要优化,如果遇到1个小表和1个大表做关联,且关联字段上有索引,那么这种情况下走index join 比hash join 的效率是要高的
那么MySQL的执行计划是怎么样的?
mysql> explain analyze select count(*) from customer a,orders b where a.c_phone='25-989-741-2988' and a.c_custkey=b.o_custkey and substr(a.C_COMMENT,1,10)=substr(b.O_COMMENT,1,10); +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Aggregate: count(0) (cost=116149.61 rows=1) (actual time=48.579..48.580 rows=1 loops=1)
-> Nested loop inner join (cost=93735.93 rows=224137) (actual time=48.575..48.575 rows=0 loops=1)
-> Filter: (a.C_PHONE = '25-989-741-2988') (cost=15275.30 rows=14802) (actual time=0.046..48.508 rows=1 loops=1)
-> Table scan on a (cost=15275.30 rows=148023) (actual time=0.044..39.239 rows=150000 loops=1)
-> Filter: (substr(a.C_COMMENT,1,10) = substr(b.O_COMMENT,1,10)) (cost=3.79 rows=15) (actual time=0.064..0.064 rows=0 loops=1)
-> Index lookup on b using indexO_CUSTKEY (O_CUSTKEY=a.C_CUSTKEY) (cost=3.79 rows=15) (actual time=0.060..0.062 rows=6 loops=1)
|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.05 sec)
由此可以看到 MySQL在遇到关联条件上有用到函数的情况下,照样是可以走Index join的
三、测试案例2关联条件是非等值关联条件
对应的表结构
*************************** 1. row ***************************
Table: orders
Create Table: CREATE TABLE `orders` (
`O_ORDERKEY` bigint(20) NOT NULL,
`O_CUSTKEY` bigint(20) NOT NULL,
`O_ORDERSTATUS` char(1) NOT NULL,
`O_TOTALPRICE` decimal(15,2) NOT NULL,
`O_ORDERDATE` date NOT NULL,
`O_ORDERPRIORITY` char(15) NOT NULL,
`O_CLERK` char(15) NOT NULL,
`O_SHIPPRIORITY` bigint(20) NOT NULL,
`O_COMMENT` varchar(79) NOT NULL,
PRIMARY KEY (`O_ORDERKEY`) /*T![clustered_index] CLUSTERED */,
KEY `index_O_TOTALPRICE` (`O_TOTALPRICE`),
KEY `index_a` (`O_ORDERSTATUS`,`O_TOTALPRICE`),
KEY `indexO_CUSTKEY` (`O_CUSTKEY`),
KEY `index_date` (`O_ORDERDATE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)
mysql> show create table lineitem\G
*************************** 1. row ***************************
Table: lineitem
Create Table: CREATE TABLE `lineitem` (
`L_ORDERKEY` bigint(20) NOT NULL,
`L_PARTKEY` bigint(20) NOT NULL,
`L_SUPPKEY` bigint(20) NOT NULL,
`L_LINENUMBER` bigint(20) NOT NULL,
`L_QUANTITY` decimal(15,2) NOT NULL,
`L_EXTENDEDPRICE` decimal(15,2) NOT NULL,
`L_DISCOUNT` decimal(15,2) NOT NULL,
`L_TAX` decimal(15,2) NOT NULL,
`L_RETURNFLAG` char(1) NOT NULL,
`L_LINESTATUS` char(1) NOT NULL,
`L_SHIPDATE` date NOT NULL,
`L_COMMITDATE` date NOT NULL,
`L_RECEIPTDATE` date NOT NULL,
`L_SHIPINSTRUCT` char(25) NOT NULL,
`L_SHIPMODE` char(10) NOT NULL,
`L_COMMENT` varchar(44) NOT NULL,
PRIMARY KEY (`L_ORDERKEY`,`L_LINENUMBER`) /*T![clustered_index] CLUSTERED */,
KEY `indexL_SHIPDATE` (`L_SHIPDATE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)
语句
select count(*) from orders a join lineitem b on a.O_ORDERDATE>b.L_SHIPDATE where a.O_TOTALPRICE > 500000;
从语句可以看出,该语句的关联条件是非等值的关联条件
使用sql hint 强制让tidb使用index join,看是否可行
mysql> explain analyze select /*+ INL_JOIN(a,b) */ count(*) from orders a join lineitem b on a.O_ORDERDATE>b.L_SHIPDATE where a.O_TOTALPRICE > 500000;
+--------------------------------------+---------------+----------+-----------+-------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------+---------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+--------------------------------------+---------------+----------+-----------+-------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------+---------+---------+
| StreamAgg_10 | 1.00 | 1 | root | | time:19s, loops:2 | funcs:count(1)->Column#26 | 8 Bytes | N/A |
| └─HashJoin_24 | 1110028510.72 | 40096990 | root | | time:18.2s, loops:39161, build_hash_table:{total:9.85ms, fetch:9.84ms, build:9.64µs}, probe:{concurrency:5, total:1m35.2s, max:19s, probe:1m28.4s, fetch:6.81s} | CARTESIAN inner join, other cond:gt(tpch2.orders.o_orderdate, tpch2.lineitem.l_shipdate) | 12.1 KB | 0 Bytes |
| ├─IndexLookUp_18(Build) | 184.97 | 16 | root | | time:9.82ms, loops:2, index_task: {total_time: 3.09ms, fetch_handle: 3.09ms, build: 596ns, wait: 1.4µs}, table_task: {total_time: 6.26ms, num: 1, concurrency: 5}, next: {wait_index: 3.6ms, wait_table_lookup_build: 0s, wait_table_lookup_resp: 6.21ms} | | 10.5 KB | N/A |
| │ ├─IndexRangeScan_16(Build) | 184.97 | 16 | cop[tikv] | table:a, index:index_O_TOTALPRICE(O_TOTALPRICE) | time:3.09ms, loops:3, cop_task: {num: 2, max: 2.98ms, min: 1.96ms, avg: 2.47ms, p95: 2.98ms, max_proc_keys: 16, p95_proc_keys: 16, rpc_num: 2, rpc_time: 4.92ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{proc max:0s, min:0s, avg: 0s, p80:0s, p95:0s, iters:2, tasks:2}, scan_detail: {total_process_keys: 16, total_process_keys_size: 736, total_keys: 18, get_snapshot_time: 11.8µs, rocksdb: {key_skipped_count: 16, block: {cache_hit_count: 4}}} | range:(500000.00,+inf], keep order:false | N/A | N/A |
| │ └─TableRowIDScan_17(Probe) | 184.97 | 16 | cop[tikv] | table:a | time:6.09ms, loops:2, cop_task: {num: 3, max: 4.08ms, min: 645.5µs, avg: 1.96ms, p95: 4.08ms, max_proc_keys: 8, p95_proc_keys: 8, rpc_num: 3, rpc_time: 5.85ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{proc max:2ms, min:0s, avg: 666.7µs, p80:2ms, p95:2ms, iters:3, tasks:3}, scan_detail: {total_process_keys: 16, total_process_keys_size: 2446, total_keys: 16, get_snapshot_time: 55.2µs, rocksdb: {block: {cache_hit_count: 35}}} | keep order:false | N/A | N/A |
| └─IndexReader_22(Probe) | 6001215.00 | 6001215 | root | | time:266.1ms, loops:5872, cop_task: {num: 193, max: 952ms, min: 1.71ms, avg: 309ms, p95: 766.9ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 5.04s, tot_wait: 72ms, rpc_num: 194, rpc_time: 59.6s, copr_cache: disabled, distsql_concurrency: 15}, backoff{regionMiss: 2ms} | index:IndexFullScan_21 | 2.30 MB | N/A |
| └─IndexFullScan_21 | 6001215.00 | 6001215 | cop[tikv] | table:b, index:indexL_SHIPDATE(L_SHIPDATE) | tikv_task:{proc max:102ms, min:0s, avg: 25.1ms, p80:45ms, p95:69ms, iters:6622, tasks:193}, scan_detail: {total_process_keys: 6001215, total_process_keys_size: 342069255, total_keys: 6001408, get_snapshot_time: 17.1ms, rocksdb: {key_skipped_count: 6001215, block: {cache_hit_count: 428, read_count: 4684, read_byte: 27.1 MB, read_time: 60.7ms}}} | keep order:false | N/A | N/A |
+--------------------------------------+---------------+----------+-----------+-------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------+---------+---------+
7 rows in set, 2 warnings (19.05 sec)
从上面的执行计划可以看出,即使强制让优化器走index join 也是不行的
但是如果我们把关联条件换成等值关联条件呢?
mysql> explain analyze select /*+ INL_JOIN(a,b) */ count(*) from orders a join lineitem b on a.O_ORDERDATE=b.L_SHIPDATE where a.O_TOTALPRICE > 500000;
+--------------------------------------+-----------+---------+-----------+-------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+--------------------------------------+-----------+---------+-----------+-------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
| StreamAgg_10 | 1.00 | 1 | root | | time:58.9ms, loops:2 | funcs:count(1)->Column#26 | 8 Bytes | N/A |
| └─IndexJoin_48 | 439441.22 | 39751 | root | | time:58.2ms, loops:40, inner:{total:50.1ms, concurrency:5, task:1, construct:19.4µs, fetch:44.6ms, build:5.47ms}, probe:1.17ms | inner join, inner:IndexReader_47, outer key:tpch2.orders.o_orderdate, inner key:tpch2.lineitem.l_shipdate, equal cond:eq(tpch2.orders.o_orderdate, tpch2.lineitem.l_shipdate) | 398.1 KB | N/A |
| ├─IndexLookUp_40(Build) | 184.97 | 16 | root | | time:6.77ms, loops:3, index_task: {total_time: 2.72ms, fetch_handle: 2.72ms, build: 578ns, wait: 1.24µs}, table_task: {total_time: 3.98ms, num: 1, concurrency: 5}, next: {wait_index: 2.85ms, wait_table_lookup_build: 0s, wait_table_lookup_resp: 3.91ms} | | 2.52 KB | N/A |
| │ ├─IndexRangeScan_38(Build) | 184.97 | 16 | cop[tikv] | table:a, index:index_O_TOTALPRICE(O_TOTALPRICE) | time:2.72ms, loops:3, cop_task: {num: 2, max: 2.64ms, min: 2.63ms, avg: 2.64ms, p95: 2.64ms, max_proc_keys: 16, p95_proc_keys: 16, rpc_num: 2, rpc_time: 5.24ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{proc max:1ms, min:0s, avg: 500µs, p80:1ms, p95:1ms, iters:2, tasks:2}, scan_detail: {total_process_keys: 16, total_process_keys_size: 736, total_keys: 18, get_snapshot_time: 2.11ms, rocksdb: {key_skipped_count: 16, block: {cache_hit_count: 4}}} | range:(500000.00,+inf], keep order:false | N/A | N/A |
| │ └─TableRowIDScan_39(Probe) | 184.97 | 16 | cop[tikv] | table:a | time:3.9ms, loops:2, cop_task: {num: 3, max: 1.47ms, min: 1.16ms, avg: 1.28ms, p95: 1.47ms, max_proc_keys: 8, p95_proc_keys: 8, rpc_num: 3, rpc_time: 3.8ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{proc max:0s, min:0s, avg: 0s, p80:0s, p95:0s, iters:3, tasks:3}, scan_detail: {total_process_keys: 16, total_process_keys_size: 2446, total_keys: 16, get_snapshot_time: 1.79ms, rocksdb: {block: {cache_hit_count: 35}}} | keep order:false | N/A | N/A |
| └─IndexReader_47(Probe) | 439441.22 | 39751 | root | | time:43.8ms, loops:43, cop_task: {num: 27, max: 13.6ms, min: 1.05ms, avg: 5.65ms, p95: 13.1ms, max_proc_keys: 5470, p95_proc_keys: 5088, tot_proc: 23ms, tot_wait: 14ms, rpc_num: 28, rpc_time: 153.1ms, copr_cache: disabled, distsql_concurrency: 15}, backoff{regionMiss: 2ms} | index:IndexRangeScan_46 | 83.0 KB | N/A |
| └─IndexRangeScan_46 | 439441.22 | 39751 | cop[tikv] | table:b, index:indexL_SHIPDATE(L_SHIPDATE) | tikv_task:{proc max:6ms, min:0s, avg: 1.44ms, p80:3ms, p95:4ms, iters:142, tasks:27}, scan_detail: {total_process_keys: 39751, total_process_keys_size: 2265807, total_keys: 39789, get_snapshot_time: 8.71ms, rocksdb: {key_skipped_count: 39751, block: {cache_hit_count: 60, read_count: 47, read_byte: 278.1 KB, read_time: 744.3µs}}} | range: decided by [eq(tpch2.lineitem.l_shipdate, tpch2.orders.o_orderdate)], keep order:false | N/A | N/A |
+--------------------------------------+-----------+---------+-----------+-------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------+
7 rows in set (0.06 sec)
从上述执行计划可知,把关联条件换成等值的就可以走Index join了
那么MySQL的执行计划是怎么样的?
mysql> explain analyze select count(*) from orders a join lineitem b on a.O_ORDERDATE>b.L_SHIPDATE where a.O_TOTALPRICE > 500000;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Aggregate: count(0) (cost=12395385.30 rows=1) (actual time=16682.222..16682.222 rows=1 loops=1)
-> Nested loop inner join (cost=9310532.80 rows=30848525) (actual time=0.549..15124.670 rows=40096992 loops=1)
-> Index range scan on a using O_TOTALPRICE over (500000.00 < O_TOTALPRICE), with index condition: (a.O_TOTALPRICE > 500000.00) (cost=7.64 rows=16) (actual time=0.093..2.346 rows=16 loops=1)
-> Filter: (a.O_ORDERDATE > b.L_SHIPDATE) (cost=15490.33 rows=1928033) (actual time=0.120..840.187 rows=2506062 loops=16)
-> Covering index range scan on b (re-planned for each iteration) (cost=15490.33 rows=5784677) (actual time=0.113..664.007 rows=2506062 loops=16)
|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (16.69 sec)
发现MySQL在遇到这种情况下走index join 依然是支持的
四、结论
根据一些测试案例我们可知,TiDB 在某些语句下,优化器目前的选择策略还不是很完善,这会导致SQL语句的运行效率稍许不足,希望产品在后续优化器方面能够继续加强。