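Test Notes
I previously ran tests comparing the effect of NUMA on TiDB performance. After seeing Gin's post 《单机 8 个 NUMA node 如何玩转 TiDB - AMD EPYC 服务器上的 TiDB 集群最优部署拓扑探索》 (How to Make the Most of TiDB on a Single Machine with 8 NUMA Nodes: Exploring the Optimal TiDB Cluster Deployment Topology on AMD EPYC Servers), I decided to share my own test data as well.
Actual configuration of the TiDB machines:
Actual configuration of the TiKV machines:
There are three TiKV machines. Each has two NVMe disks with one TiKV instance deployed per disk, for 6 TiKV nodes in total.
The number of TiDB machines varies by test; see the individual test items.
Both the TiDB and the TiKV machines have 2 NUMA nodes.
Other components, such as PD, are deployed on separate machines.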
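As a quick way to confirm how many NUMA nodes a Linux machine exposes (two per machine in this setup), the sketch below counts the `nodeN` directories under `/sys/devices/system/node`. The function name `list_numa_nodes` and its `sysfs_root` parameter are illustrative, not part of any TiDB tooling, and the post does not say how the author checked the node count.

```python
import re
from pathlib import Path


def list_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node ids that Linux exposes under sysfs_root."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # not Linux, or sysfs is not available
    ids = []
    for entry in root.iterdir():
        # NUMA nodes show up as directories named node0, node1, ...
        match = re.fullmatch(r"node(\d+)", entry.name)
        if match and entry.is_dir():
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    nodes = list_numa_nodes()
    print(f"{len(nodes)} NUMA node(s): {nodes}")
```

On a machine like the ones described here, this should report two nodes. Inside a container or on a non-Linux host the directory may be missing, in which case the function returns an empty list.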
Version 4.0.11 + 1 TiDB (no NUMA binding) + TiKV (no NUMA binding)
oltp_point_select:
thread tps qps latency
150 220972.20 220972.20 1.70ms
300 192520.54 192520.54 7.30ms
600 193806.74 193806.74 17.01ms
900 194468.83 194468.83 26.68ms
oltp_update_non_index:
thread tps qps latency
150 45906.75 45906.75 5.77ms
300 64940.86 64940.86 8.58ms
600 77195.00 77195.00 14.46ms
900 81585.65 81585.65 20.37ms
oltp_update_index:
thread tps qps latency
150 31361.24 31361.24 8.90ms
300 44313.18 44313.18 12.52ms
600 51690.41 51690.41 21.11ms
900 55043.92 55043.92 29.72ms
oltp_read_write:
thread tps qps latency
150 2717.77 54355.35 82.96ms
300 2709.89 54197.82 173.58ms
600 2709.29 54185.81 350.33ms
900 2671.25 53425.00 539.71ms
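Every result block in this post uses the same layout: a script name ending in a colon, a `thread tps qps latency` header, then one row per thread count. A minimal sketch of a parser for that layout follows; `Row` and `parse_results` are hypothetical helper names introduced here, not part of sysbench.

```python
from dataclasses import dataclass


@dataclass
class Row:
    threads: int
    tps: float
    qps: float
    latency_ms: float


def parse_results(text):
    """Parse sysbench-style result blocks into {script_name: [Row, ...]}."""
    results = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # A single token ending in ":" starts a new script block.
        if line.endswith(":") and len(line.split()) == 1:
            current = results.setdefault(line[:-1], [])
            continue
        fields = line.split()
        if current is None or len(fields) != 4:
            continue  # heading or prose, not a data row
        try:
            row = Row(int(fields[0]), float(fields[1]), float(fields[2]),
                      float(fields[3].replace("ms", "")))
        except ValueError:
            continue  # e.g. the "thread tps qps latency" header line
        current.append(row)
    return results


# The 4.0.11 oltp_read_write block from this post.
SAMPLE = """\
oltp_read_write:
thread tps qps latency
150 2717.77 54355.35 82.96ms
300 2709.89 54197.82 173.58ms
600 2709.29 54185.81 350.33ms
900 2671.25 53425.00 539.71ms
"""

results = parse_results(SAMPLE)
best = max(results["oltp_read_write"], key=lambda r: r.tps)
print(best.threads, best.tps)
```

With the rows in this form it is easy to pick the peak tps per script, which is what the later comparisons between configurations rely on.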
Version 5.3.0 + 1 TiDB (no NUMA binding) + TiKV (no NUMA binding)
oltp_point_select:
thread tps qps latency
150 201794.63 201794.63 1.67ms
300 249986.45 249986.45 2.86ms
600 286983.16 286983.16 5.18ms
900 298627.92 298627.92 7.70ms
1200 287636.99 287636.99 10.27ms
oltp_update_non_index:
thread tps qps latency
150 45258.93 45258.93 6.09ms
300 62606.41 62606.41 9.22ms
600 74409.73 74409.73 15.27ms
900 79299.29 79299.29 20.74ms
1200 79375.18 79375.18 25.74ms
oltp_update_index:
thread tps qps latency
150 34082.88 34082.88 7.98ms
300 45721.24 45721.24 12.75ms
600 54530.76 54530.76 20.74ms
900 58167.30 58167.30 28.67ms
1200 58458.73 58458.73 35.59ms
oltp_read_write:
thread tps qps latency
150 3830.01 76600.21 51.02ms
300 3983.72 79674.37 99.33ms
600 3975.55 79510.95 204.11ms
900 3981.51 79630.26 314.45ms
1200 3881.14 77622.70 411.96ms
Version 5.3.0 + 1 TiDB (no NUMA binding) + TiKV (no NUMA binding) + haproxy
oltp_point_select:
thread tps qps latency
150 104443.54 104443.54 1.93ms
300 103219.71 103219.71 3.43ms
600 101954.57 101954.57 6.67ms
900 103732.70 103732.70 9.73ms
1200 102202.44 102202.44 12.98ms
oltp_update_non_index:
thread tps qps latency
150 44810.25 44810.25 6.67ms
300 61916.12 61916.12 10.27ms
600 72541.68 72541.68 17.63ms
900 75804.77 75804.77 23.95ms
1200 78351.98 78351.98 27.17ms
oltp_update_index:
thread tps qps latency
150 34619.41 34619.41 8.13ms
300 45416.08 45416.08 13.46ms
600 54170.32 54170.32 21.11ms
900 58048.60 58048.60 29.72ms
1200 58197.27 58197.27 36.24ms
oltp_read_write:
thread tps qps latency
150 3673.13 73462.65 51.94ms
300 3726.16 74523.21 101.13ms
600 3663.13 73262.64 207.82ms
900 3632.19 72643.83 314.45ms
1200 3594.83 71896.66 419.45ms
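Version 5.3.0 + 2 TiDB on the same server (no NUMA binding) + TiKV (no NUMA binding)
Two sysbench instances load the two TiDB servers, so the overall result is threads × 2 and tps × 2 relative to the figures below.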
oltp_point_select:
thread tps qps latency
150 129097.99 129097.99 3.82ms
300 144803.92 144803.92 7.56ms
600 159357.54 159357.54 13.22ms
900 166837.99 166837.99 17.63ms
1200 162728.81 162728.81 21.50ms
oltp_update_non_index:
thread tps qps latency
150 30795.56 30795.56 12.08ms
300 40467.50 40467.50 18.95ms
600 46298.63 46298.63 31.37ms
900 49060.75 49060.75 42.61ms
1200 49017.78 49017.78 52.89ms
oltp_update_index:
thread tps qps latency
150 22605.55 22605.55 15.00ms
300 27828.23 27828.23 23.95ms
600 31263.76 31263.76 40.37ms
900 33023.44 33023.44 57.87ms
1200 32931.74 32931.74 73.13ms
oltp_read_write:
thread tps qps latency
150 2281.06 45621.12 102.97ms
300 2346.95 46939.09 193.38ms
600 2351.09 47021.81 383.33ms
900 2341.98 46839.64 580.02ms
1200 2278.33 45566.56 773.68ms
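Version 5.3.0 + 2 TiDB on 2 servers (no NUMA binding) + TiKV (no NUMA binding)
Two sysbench instances load the two TiDB servers, so the overall result is threads × 2 and tps × 2 relative to the figures below.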
oltp_point_select:
thread tps qps latency
150 205860.30 205860.30 1.44ms
300 248448.90 248448.90 2.57ms
600 282732.66 282732.66 4.91ms
900 295359.30 295359.30 7.30ms
1200 288584.38 288584.38 9.73ms
oltp_update_non_index:
thread tps qps latency
150 43027.55 43027.55 5.67ms
300 62083.56 62083.56 8.58ms
600 75400.60 75400.60 15.55ms
900 79980.77 79980.77 22.69ms
1200 80440.81 80440.81 28.16ms
oltp_update_index:
thread tps qps latency
150 26672.14 26672.14 9.56ms
300 32308.88 32308.88 16.12ms
600 36021.56 36021.56 28.67ms
900 39111.05 39111.05 41.85ms
1200 39442.23 39442.23 51.94ms
oltp_read_write:
thread tps qps latency
150 3910.85 78217.08 49.21ms
300 4021.25 80425.04 97.55ms
600 4008.98 80179.64 200.47ms
900 4008.64 80172.76 314.45ms
1200 3908.37 78167.48 419.45ms
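Compared with the single-server, single-TiDB run, these results show only oltp_update_index coming out somewhat lower (per TiDB instance); for the other scripts the bottleneck is most likely still inside TiDB.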
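The comparison above can be sketched numerically. The snippet below takes the peak tps per script from the single-TiDB 5.3.0 run and from the 2 servers / 2 TiDB run, doubles the latter (one sysbench per TiDB server), and prints the resulting scaling factor. Using the peak row of each table is an assumption made here; the post does not say which rows were compared.

```python
# Peak tps per script, copied from the tables in this post.
SINGLE_TIDB = {
    "oltp_point_select": 298627.92,
    "oltp_update_non_index": 79375.18,
    "oltp_update_index": 58458.73,
    "oltp_read_write": 3983.72,
}
# Per-sysbench-instance peaks for the 2 servers / 2 TiDB run.
PER_INSTANCE_2X2 = {
    "oltp_point_select": 295359.30,
    "oltp_update_non_index": 80440.81,
    "oltp_update_index": 39442.23,
    "oltp_read_write": 4021.25,
}


def cluster_total(per_instance_tps, instances):
    """Total tps when each sysbench instance drives one TiDB server."""
    return per_instance_tps * instances


def scaling_factor(script, instances=2):
    """Cluster total of the multi-TiDB run divided by the single-TiDB peak."""
    return cluster_total(PER_INSTANCE_2X2[script], instances) / SINGLE_TIDB[script]


for script in SINGLE_TIDB:
    print(f"{script:24s} {scaling_factor(script):.2f}x")
```

A factor near 2 means throughput scaled with the number of TiDB servers, which points at TiDB as the limiting component. With these numbers, three scripts should land close to 2x, while oltp_update_index falls well short of that.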
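Version 5.3.0 + 3 TiDB on 3 servers (no NUMA binding) + TiKV (no NUMA binding)
Three sysbench instances load the three TiDB servers, so the overall result is threads × 3 and tps × 3 relative to the figures below.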
oltp_point_select:
thread tps qps latency
150 203556.48 203556.48 1.47ms
300 243697.05 243697.05 2.61ms
600 271592.87 271592.87 4.91ms
900 289141.86 289141.86 7.43ms
1200 275135.88 275135.88 9.91ms
oltp_update_non_index:
thread tps qps latency
150 36042.49 36042.49 6.55ms
300 48446.07 48446.07 11.04ms
600 67812.67 67812.67 16.41ms
900 59744.62 59744.62 25.74ms
1200 59889.53 59889.53 31.37ms
oltp_update_index:
thread tps qps latency
150 20492.52 20492.52 12.30ms
300 23783.25 23783.25 20.37ms
600 26927.57 26927.57 38.94ms
900 29216.96 29216.96 54.83ms
1200 28867.09 28867.09 68.05ms
oltp_read_write:
thread tps qps latency
150 3832.19 76643.70 51.02ms
300 3992.98 79859.69 97.55ms
600 4020.22 80404.41 204.11ms
900 4017.62 80352.36 308.84ms
1200 3907.52 78150.36 419.45ms
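Of these benchmark scripts, only oltp_update_index has hit a bottleneck. The others scale with the number of machines, and the per-server numbers still hold up well as TiDB servers are added. Under oltp_update_index, disk writes reach 400 MB/s.
Version 5.3.0 + 3 TiDB on 3 servers (no NUMA binding) + TiKV (NUMA-bound)
Three sysbench instances load the three TiDB servers, so the overall result is threads × 3 and tps × 3 relative to the figures below.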
oltp_point_select:
thread tps qps latency
150 211067.21 211067.21 1.44ms
300 247168.48 247168.48 2.52ms
600 277333.81 277333.81 4.91ms
900 290400.36 290400.36 7.43ms
1200 275743.13 275743.13 10.09ms
oltp_update_non_index:
thread tps qps latency
150 40187.92 40187.92 5.99ms
300 55830.78 55830.78 8.74ms
600 62626.64 62626.64 16.12ms
900 83426.79 83426.79 20.37ms
1200 82545.60 82545.60 25.74ms
oltp_update_index:
thread tps qps latency
150 23585.74 23585.74 10.09ms
300 28199.69 28199.69 16.41ms
600 33693.71 33693.71 29.72ms
900 35326.38 35326.38 44.17ms
1200 36183.60 36183.60 53.85ms
oltp_read_write:
thread tps qps latency
150 3792.95 75859.05 50.11ms
300 3924.42 78488.50 99.33ms
600 3922.51 78450.28 207.82ms
900 3900.83 78016.62 320.17ms
1200 3838.47 76769.41 427.07ms
Comparing the results before and after binding TiKV to NUMA nodes, oltp_update_non_index improves by 23% and oltp_update_index by 21%. Disk writes under oltp_update_index can reach 450 to 500 MB/s.
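As a worked check of these percentages, the sketch below computes the relative tps gain between the 3-TiDB runs without and with TiKV NUMA binding. The post does not state which rows were compared, so two readings are shown as assumptions: peak against peak, and the 900-thread rows. The names `improvement_pct`, `peak_gain` and `gain_at` are illustrative.

```python
THREADS = [150, 300, 600, 900, 1200]

# tps per thread count, copied from the two 3-server / 3-TiDB sections.
TIKV_UNBOUND = {
    "oltp_update_non_index": [36042.49, 48446.07, 67812.67, 59744.62, 59889.53],
    "oltp_update_index": [20492.52, 23783.25, 26927.57, 29216.96, 28867.09],
}
TIKV_BOUND = {
    "oltp_update_non_index": [40187.92, 55830.78, 62626.64, 83426.79, 82545.60],
    "oltp_update_index": [23585.74, 28199.69, 33693.71, 35326.38, 36183.60],
}


def improvement_pct(before, after):
    """Relative gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


def peak_gain(script):
    """Assumption 1: compare the best row of each run."""
    return improvement_pct(max(TIKV_UNBOUND[script]), max(TIKV_BOUND[script]))


def gain_at(script, threads):
    """Assumption 2: compare the rows with the same thread count."""
    i = THREADS.index(threads)
    return improvement_pct(TIKV_UNBOUND[script][i], TIKV_BOUND[script][i])


for script in TIKV_UNBOUND:
    print(f"{script}: peak {peak_gain(script):.1f}%, "
          f"900 threads {gain_at(script, 900):.1f}%")
```

Under the peak-against-peak reading, oltp_update_non_index comes out at roughly 23%, matching the quoted figure. For oltp_update_index the 900-thread rows give roughly 21%, while peak against peak gives a slightly higher value, so the two quoted numbers may not have been derived the same way.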
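Version 5.3.0 + 2 TiDB on the same server (no NUMA binding) + TiKV (NUMA-bound)
Two sysbench instances load the two TiDB servers, so the overall result is threads × 2 and tps × 2 relative to the figures below.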
oltp_point_select:
thread tps qps latency
150 127537.53 127537.53 3.96ms
300 144810.09 144810.09 7.56ms
600 159019.90 159019.90 13.46ms
900 165215.00 165215.00 17.95ms
1200 161593.24 161593.24 21.89ms
oltp_update_non_index:
thread tps qps latency
150 34052.91 34052.91 10.65ms
300 43819.14 43819.14 17.63ms
600 46888.51 46888.51 30.81ms
900 51243.30 51243.30 43.39ms
1200 50359.85 50359.85 55.82ms
oltp_update_index:
thread tps qps latency
150 23838.31 23838.31 15.00ms
300 29755.96 29755.96 23.10ms
600 32775.32 32775.32 38.94ms
900 35657.68 35657.68 54.83ms
1200 36145.33 36145.33 69.29ms
oltp_read_write:
thread tps qps latency
150 2253.56 45071.11 102.97ms
300 2361.93 47238.52 186.54ms
600 2389.84 47796.78 363.18ms
900 2360.46 47209.23 539.71ms
1200 2349.10 46982.08 733.00ms
Version 5.3.0 + 1 TiDB (no NUMA binding) + TiKV (NUMA-bound)
oltp_point_select:
thread tps qps latency
150 213835.36 213835.36 1.50ms
300 256472.60 256472.60 2.66ms
600 288847.76 288847.76 5.09ms
900 293929.98 293929.98 7.98ms
1200 279403.60 279403.60 11.45ms
oltp_update_non_index:
thread tps qps latency
150 46360.10 46360.10 6.55ms
300 64537.38 64537.38 10.09ms
600 76067.22 76067.22 17.63ms
900 80170.12 80170.12 23.10ms
1200 79644.44 79644.44 27.17ms
oltp_update_index:
thread tps qps latency
150 37550.66 37550.66 7.04ms
300 49270.84 49270.84 12.75ms
600 58444.76 58444.76 21.50ms
900 62093.82 62093.82 28.16ms
1200 62674.65 62674.65 35.59ms
oltp_read_write:
thread tps qps latency
150 3810.59 76211.77 50.11ms
300 3877.39 77547.83 101.13ms
600 3855.89 77117.70 211.60ms
900 3823.63 76472.69 325.98ms
1200 3714.32 74286.39 434.83ms
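Two TiDB instances deliver 15% to 22% higher performance than a single TiDB instance.
Version 5.3.0 + 2 TiDB on the same server (NUMA-bound) + TiKV (NUMA-bound)
Two sysbench instances load the two TiDB servers, so the overall result is threads × 2 and tps × 2 relative to the figures below.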
oltp_point_select:
thread tps qps latency
150 215389.87 215389.87 1.47ms
300 230722.04 230722.04 3.02ms
600 236667.82 236667.82 6.09ms
900 236761.23 236761.23 9.39ms
1200 225418.93 225418.93 13.70ms
oltp_update_non_index:
thread tps qps latency
150 48009.90 48009.90 4.82ms
300 68367.73 68367.73 7.43ms
600 73982.72 73982.72 15.55ms
900 74205.04 74205.04 23.52ms
1200 71690.86 71690.86 32.53ms
oltp_update_index:
thread tps qps latency
150 32225.67 32225.67 7.43ms
300 37781.03 37781.03 13.22ms
600 42573.74 42573.74 23.52ms
900 44946.05 44946.05 34.33ms
1200 45486.86 45486.86 44.98ms
oltp_read_write:
thread tps qps latency
150 3281.87 65637.41 66.84ms
300 3278.66 65573.10 137.35ms
600 3249.92 64998.44 277.21ms
900 3227.92 64558.38 419.45ms
1200 3218.62 64372.48 549.52ms
After binding TiDB to NUMA nodes, the scripts that were previously bottlenecked on TiDB show a large improvement:
Script | Improvement |
---|---|
oltp_point_select | 43% |
oltp_update_non_index | 44% |
oltp_update_index | 26% |
oltp_read_write | 36% |
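A table of this kind can be regenerated from the raw numbers. The sketch below uses the peak tps of the two same-server 2-TiDB runs with NUMA-bound TiKV (TiDB unbound vs. TiDB bound) and renders the gains as a pipe table. Comparing peak rows is an assumption, since the post does not say how the percentages were derived; `PEAK_TPS` and `render_table` are names introduced for illustration.

```python
# (TiDB not bound, TiDB NUMA-bound) peak tps per script, from this post.
PEAK_TPS = {
    "oltp_point_select": (165215.00, 236761.23),
    "oltp_update_non_index": (51243.30, 74205.04),
    "oltp_update_index": (36145.33, 45486.86),
    "oltp_read_write": (2389.84, 3281.87),
}


def render_table(peaks):
    """Render the relative tps gains as a markdown pipe table."""
    lines = ["Script | Improvement |", "---|---|"]
    for script, (unbound, bound) in peaks.items():
        pct = (bound / unbound - 1.0) * 100.0
        lines.append(f"{script} | {pct:.0f}% |")
    return "\n".join(lines)


print(render_table(PEAK_TPS))
```

Under this peak-against-peak assumption the results land within about one and a half percentage points of the figures in the table above, so the small differences probably come from rounding or from comparing different thread counts.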
Testing oltp_insert
One TiDB server:
oltp_insert:
thread tps qps latency
150 45689.78 45689.78 7.17ms
300 53208.06 53208.06 13.70ms
600 58809.44 58809.44 21.50ms
900 61785.02 61785.02 28.16ms
1200 61982.29 61982.29 33.72ms
Two TiDB servers on one machine, through haproxy:
oltp_insert:
thread tps qps latency
150 54654.82 54654.82 4.33ms
300 75176.21 75176.21 7.98ms
600 87935.19 87935.19 15.27ms
900 91368.11 91368.11 20.00ms
1200 93228.34 93228.34 24.38ms
Three TiDB servers on two machines, through haproxy:
oltp_insert:
thread tps qps latency
150 55096.47 55096.47 4.25ms
300 77900.68 77900.68 6.55ms
600 94743.69 94743.69 12.52ms
900 98213.51 98213.51 17.32ms
1200 100102.81 100102.81 21.11ms
1500 99329.58 99329.58 24.38ms