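Notes on backing up to local storage with the br tool

 hellogitee  posted on  2023-09-12

Background

Our business recently asked for a standby TiDB cluster in a remote data center, so that we can fail over at the data-center level at any time if an entire data center goes down. TiCDC is the natural choice for this kind of replication. The first step is to take a backup with the br tool, and only then start replication.

Official documentation & FAQ

Choosing the backup storage

The official documentation recommends S3 or NFS. With local storage, br saves each TiKV node's data to a local directory on that node, so before restoring, the backup data from all TiKV nodes has to be merged together. That is tedious, and the documentation does not recommend it.

But we don't have that kind of setup. Merging is a hassle, but it is still a workable path.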

https://docs.pingcap.com/zh/tidb/dev/br-use-overview#如何管理备份数据
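
To make the merge step concrete, here is a minimal sketch. It assumes each TiKV node's backup directory has already been copied onto one machine (for example with scp or rsync) and that file names do not collide across nodes. The function name and the example paths are hypothetical, not something the br tool provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: merge several per-node backup directories into one.
# Usage: merge_backup_dirs DEST SRC...
merge_backup_dirs() {
    local dest="$1"
    shift
    mkdir -p "$dest" || return 1
    local src
    for src in "$@"; do
        # Refuse to continue if one of the per-node copies is missing.
        [ -d "$src" ] || { echo "missing source directory: $src" >&2; return 1; }
        # Copy the directory contents (including dotfiles) into DEST,
        # preserving file attributes. A later source overwrites an earlier
        # file of the same name.
        cp -a "$src"/. "$dest"/ || return 1
    done
}

# Example with hypothetical paths:
#   merge_backup_dirs /data1/backup_merged /tmp/from_129 /tmp/from_130 /tmp/from_131
```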

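Backup user permissions and caveats

According to the FAQ, the backup directory must be readable and writable, and if the br tool and TiKV run on different machines, the users must have the same UID.

The permission requirement makes sense, but why must the UIDs be identical as well?

https://docs.pingcap.com/zh/tidb/stable/backup-and-restore-faq#遇到-permission-denied-或者-no-such-file-or-directory-错误即使用-root-运行-br-命令行工具也无法解决该如何处理

The test steps follow.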

Test procedure

Environment setup

Three test machines are used:

dbpnew129v    10.10.10.1
dbpnew130v    10.10.10.2
dbpnew131v    10.10.10.3

Check the UID of the backup user on all three machines (why the kibana user? Because I am also testing ES on these machines...)

[kibana@dbpnew129v backup]$ id
uid=49480(kibana) gid=49479(kibana) groups=49479(kibana)

[kibana@dbpnew130v ~]$ id 
uid=49479(kibana) gid=49479(kibana) groups=49479(kibana)

[kibana@dbpnew131v ~]$ id
uid=49478(kibana) gid=49479(kibana) groups=49479(kibana)
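
The comparison above can be automated. This is only a sketch: `same_uid` is a hypothetical helper, and the UIDs passed to it would in practice be collected on each host with `id -u`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if every UID argument is identical.
same_uid() {
    local first="$1" uid
    for uid in "$@"; do
        [ "$uid" = "$first" ] || return 1
    done
    return 0
}

# With the three UIDs shown above:
if same_uid 49480 49479 49478; then
    echo "UIDs match"
else
    echo "UIDs differ"
fi
# prints "UIDs differ"
```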

TiDB version under test


[kibana@dbpnew129v backup]$ tiup cluster display test2
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/kibana/.tiup/components/cluster/v1.13.0/tiup-cluster display test2
Cluster type:       tidb
Cluster name:       test2
Cluster version:    v6.5.2
Deploy user:        kibana
SSH type:           builtin
Dashboard URL:      http://10.10.10.1:2379/dashboard
Grafana URL:        http://10.10.10.1:3000
ID                   Role          Host           Ports        OS/Arch       Status   Data Dir                            Deploy Dir
--                   ----          ----           -----        -------       ------   --------                            ----------
10.10.10.1:9093    alertmanager  10.10.10.1   9093/9094    linux/x86_64  Up       /data1/tidb-data/alertmanager-9093  /data1/tidb-deploy/alertmanager-9093
10.10.10.1:3000    grafana       10.10.10.1   3000         linux/x86_64  Up       -                                   /data1/tidb-deploy/grafana-3000
10.10.10.2:2379   pd            10.10.10.2  2379/2380    linux/x86_64  Up       /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
10.10.10.1:2379    pd            10.10.10.1   2379/2380    linux/x86_64  Up|L|UI  /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
10.10.10.3:2379    pd            10.10.10.3   2379/2380    linux/x86_64  Up       /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
10.10.10.1:9090    prometheus    10.10.10.1   9090/12020   linux/x86_64  Up       /data1/tidb-data/prometheus-9090    /data1/tidb-deploy/prometheus-9090
10.10.10.2:4000   tidb          10.10.10.2  4000/10080   linux/x86_64  Up       -                                   /data1/tidb-deploy/tidb-4000
10.10.10.1:4000    tidb          10.10.10.1   4000/10080   linux/x86_64  Up       -                                   /data1/tidb-deploy/tidb-4000
10.10.10.3:4000    tidb          10.10.10.3   4000/10080   linux/x86_64  Up       -                                   /data1/tidb-deploy/tidb-4000
10.10.10.2:20160  tikv          10.10.10.2  20160/20180  linux/x86_64  Up       /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160
10.10.10.1:20160   tikv          10.10.10.1   20160/20180  linux/x86_64  Up       /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160
10.10.10.3:20160   tikv          10.10.10.3   20160/20180  linux/x86_64  Up       /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160

Starting the backup

[kibana@dbpnew129v data1]$ tiup br backup full --pd 10.10.10.2:2379 --storage "local:///data1/backup"

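Although /data1 has 777 permissions, the /data1/backup subdirectory named in the command had not been created in advance, and the backup spat out a wall of errors. A screenful of pain...

## Partial log excerpt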
[2023/09/11 10:55:24.686 +08:00] [INFO] [collector.go:77] ["Full Backup failed summary"] [total-ranges=80] [ranges-succeed=0] [ranges-failed=80] [backup-total-ranges=80] [backup-total-regions=82] [unit-name="range start:7480000000000000485f720000000000000000 end:7480000000000000485f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/loc
al/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000185f720000000000000000 end:7480000000000000185f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:748000fffffffffffd5f720000000000000000 end:748000fffffffffffd5f72ffffffffffffffff00"] [error="rpc 
error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000205f720000000000000000 end:7480000000000000205f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:74800000000000002e5f69800000000000000300 end:74800000000000002e5f698000000000000003fb"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000345f720000000000000000 end:7480000000000000345f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000365f720000000000000000 end:7480000000000000365f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000105f69800000000000000100 end:7480000000000000105f698000000000000001fb"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000165f720000000000000000 end:7480000000000000165f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1
Error: error happen in store 5 at 10.10.10.2:20160: File or directory not found on TiKV Node (store id: 5; Address: 10.10.10.2:20160). work around:please ensure br and tikv nodes share a same storage and the user of br and tikv has same uid.: [BR:KV:ErrKVStorage]tikv storage occur I/O error

The last line of output says that a file or directory does not exist on a TiKV node.

Next, look at the backup log that br writes under /tmp:

[2023/09/11 10:55:24.680 +08:00] [ERROR] [push.go:206] [range-sn=0] [error="[BR:KV:ErrKVStorage]tikv storage occur I/O error: File or directory not found on TiKV Node (store id: 5; Address: 10.10.10.2:20160). work around:please ensure br and tikv nodes share a same storage and the user of br and tikv has same uid."] [stack="github.com/pingcap/tidb/br/pkg/backup.(*pushDown).pushBackup\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/push.go:206\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:938\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75"]

The message says that br and the TiKV nodes must share the same storage, and that the user running br and the user running TiKV must have the same UID.
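
Because the console output is flooded with "context canceled" stack traces, it helps to filter for the lines that carry the root cause. A minimal sketch, assuming the log format shown above (a bracketed `[ERROR]` level and a final line starting with `Error:`); `br_errors` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the [ERROR]-level lines and the final
# "Error:" summary line from a BR log or from saved console output.
br_errors() {
    grep -E '^Error:|\[ERROR\]' "$1"
}

# Example (substitute the log file that br reported for the run):
#   br_errors /tmp/br.log.<timestamp>
```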

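Resolution

Taken at face value, this error means the only option is shared storage such as S3 or NFS. But since it complains about a missing file or directory, what if I create the directory in advance?

## On all three TiKV nodes, create /data1/backup in advance as the br backup user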
[kibana@dbpnew131v data1]$ mkdir /data1/backup

## Run the backup with br again
[kibana@dbpnew129v data1]$ tiup br backup full --pd 10.10.10.2:2379 --storage "local:///data1/backup" 
tiup is checking updates for component br ...
Starting component `br`: /home/kibana/.tiup/components/br/v7.3.0/br backup full --pd 10.10.10.2:2379 --storage local:///data1/backup
Detail BR log in /tmp/br.log.2023-09-11T11.40.17+0800 
Full Backup <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/09/11 11:40:24.602 +08:00] [INFO] [collector.go:77] ["Full Backup success summary"] [total-ranges=27] [ranges-succeed=27] [ranges-failed=0] [backup-checksum=569.677625ms] [backup-fast-checksum=9.318469ms] [backup-total-ranges=80] [backup-total-regions=82] [total-take=6.64338341s] [total-kv-size=86.32MB] [average-speed=12.99MB/s] [backup-data-size(after-compressed)=5.027MB] [Size=5026872] [BackupTS=444177742312505345] [total-kv=2098905]
[kibana@dbpnew129v data1]$ 

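It actually worked!!

Summary

  • When backing up with br to local storage, it is not enough for the backup directory to be readable and writable by the user that runs each TiKV node; the directory named in the command must also actually exist (on the br node, br creates the backup directory itself, with 777 permissions);
  • The error message in the backup log is misleading: "please ensure br and tikv nodes share a same storage and the user of br and tikv has same uid" does not match what actually happened. The real cause was simply that the directory named in the backup command had not been created;
  • The FAQ requirement for local-disk backups, "if the br tool and TiKV run on different machines, the users must have the same UID", is not actually mandatory: in my test the UIDs differed and the backup still succeeded;
  • Later tests showed that even when the user running TiKV differs from the user running br, the backup succeeds as long as the directory exists and is readable and writable;

In short: make sure the directory named in the backup command actually exists and is readable and writable by TiKV (checking only the parent directory is not enough, because the backup does not create it for us on the TiKV nodes). Whatever user you use, and whether or not the UIDs match, the backup will succeed!!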
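
The conclusion above can be turned into a small preflight check. This is a sketch, not part of br: `ensure_backup_dir` is a hypothetical helper meant to be run on every TiKV node, as the user that runs TiKV, before starting the backup.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: create the backup directory if it is missing,
# then confirm the current user can read, write, and enter it.
ensure_backup_dir() {
    local dir="$1"
    mkdir -p "$dir" || return 1
    [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]
}

# Example, using the directory from this article:
#   ensure_backup_dir /data1/backup && echo "backup directory is ready"
```

Running it over SSH on each TiKV host before `tiup br backup full` would have avoided the failed first attempt in this test.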

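Copyright notice: This is an original article by a TiDB community user, licensed under CC BY-NC-SA 4.0. Please include a link to the original source and this notice when republishing.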
