Upgrading TiSpark v2.4.x to TiSpark v2.5.x

边城元元  Published on 2022-05-30

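1. Background

When installing TiDB v6.0 and adding a TiSpark cluster through tiup scale-out, the highest version available is TiSpark v2.4.1; the latest release, TiSpark v2.5.1, is not offered. In addition, TiSpark v2.5.0 and later implement part of the authentication and authorization functionality.

This post covers two things:

  • Upgrading TiSpark v2.4.1 to TiSpark v2.5.1
  • Trying out the authentication and authorization features of TiSpark v2.5.1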

2. Environment setup

2.1 Install Cluster111 (v6.0.0)

2.1.1 Cluster111 topology
# cluster111.yml
server_configs:
  tidb:
    log.slow-threshold: 300
    binlog.enable: false
    binlog.ignore-error: false
  tikv:
    readpool.storage.use-unified-pool: false
    readpool.coprocessor.use-unified-pool: true
  pd:
    schedule.leader-schedule-limit: 4
    schedule.region-schedule-limit: 2048
    schedule.replica-schedule-limit: 64
    replication.location-labels:
      - host

pd_servers:
  - host: 10.0.2.15
    # ssh_port: 22
    # name: "pd-1"
    client_port: 2379
    # peer_port: 2380

tidb_servers:
  - host: 10.0.2.15

tikv_servers:
  - host: 10.0.2.15
    # ssh_port: 22
    port: 20160
    status_port: 20180
    config:
      server.grpc-concurrency: 4

monitoring_servers:
  - host: 10.0.2.15

grafana_servers:
  - host: 10.0.2.15

alertmanager_servers:
  - host: 10.0.2.15
2.1.2 Install Cluster111
# Install tiup
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
source /root/.bash_profile

tiup update cluster
tiup cluster list

# Check the environment configuration and try to fix detected problems
tiup cluster check ./cluster111.yml --user root -p --apply
# Deploy cluster111
tiup cluster deploy cluster111 v6.0.0 ./cluster111.yml --user root -p
# Start the cluster
tiup cluster start cluster111
tiup cluster display cluster111

2.2 TiSpark v2.4.1

2.2.1 Topology
# cluster111-v6.0.0-tispark.yml
tispark_masters:
  - host: 10.0.2.15
    ssh_port: 22
    port: 7077
# NOTE: multiple worker nodes on the same host is not supported by Spark
tispark_workers:
  - host: 10.0.2.15
2.2.2 Install TiSpark
  1. Install OpenJDK 8 (omitted)
  2. Install TiSpark through scale-out
tiup cluster scale-out cluster111 ./cluster111-v6.0.0-tispark.yml -uroot -p


2.3 Test Spark v2.4.3 Standalone

  • Add the following to spark-defaults.conf
# SQL extension class
spark.sql.extensions org.apache.spark.sql.TiExtensions
# Master node
spark.master spark://10.0.2.15:7077
# PD nodes; separate multiple PD addresses with commas, e.g. 10.16.20.1:2379,10.16.20.2:2379,10.16.20.3:2379
spark.tispark.pd.addresses 10.0.2.15:2379
  • Start the Spark cluster

/tidb-deploy/tispark-master-7077/sbin/start-all.sh

  • Start spark-shell
# Start spark-shell
/tidb-deploy/tispark-master-7077/bin/spark-shell

# Run: spark.sql("select ti_version()").collect

  • Start spark-sql
# Start spark-sql
/tidb-deploy/tispark-master-7077/bin/spark-sql
# Run: select ti_version();

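The comma-separated format of spark.tispark.pd.addresses used above is easy to mistype. As a minimal sketch, a small shell helper can check that every entry looks like host:port before Spark is started. The function name check_pd_addresses is hypothetical (it is not part of TiSpark or tiup), and the host:port pattern is only an assumption about what a well-formed entry looks like.

```shell
# Hypothetical helper, not part of TiSpark or tiup.
# Returns 0 when every comma-separated entry has the form host:port.
check_pd_addresses() {
  # An empty list is never valid.
  [ -n "$1" ] || return 1
  # Put each entry on its own line, then search for any line that does
  # NOT match host:port (grep -v selects non-matching lines).
  if printf '%s\n' "$1" | tr ',' '\n' | grep -Evq '^[^:[:space:]]+:[0-9]{1,5}$'; then
    return 1
  fi
  return 0
}

# Example: validate the value used in this post.
if check_pd_addresses "10.0.2.15:2379"; then
  echo "PD address list looks well-formed"
fi
```

The check is purely syntactic: it does not contact PD, so a well-formed but unreachable address still passes.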

3. Upgrade TiSpark

3.1 Download the upgrade packages

# Download Spark v3.1.3 (-O takes no argument; the file is saved under its remote name)
curl -L -O "https://dlcdn.apache.org/spark/spark-3.1.3/spark-3.1.3-bin-hadoop3.2.tgz"
# Download TiSpark v2.5.1
curl -L -O "https://github.com/pingcap/tispark/releases/download/v2.5.1/tispark-assembly-3.1-2.5.1.jar"
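
The TiSpark jar name encodes both the Spark version line (3.1) and the TiSpark version (2.5.1). The sketch below rebuilds the download URL from those two values so the pairing stays explicit. The function names are hypothetical, and the naming pattern is inferred only from the single URL used in this post; whether other version combinations follow the same pattern is an assumption that should be checked against the release page.

```shell
# Hypothetical helpers; the naming pattern is inferred from the one URL above.

# $1: Spark version line (e.g. 3.1), $2: TiSpark version (e.g. 2.5.1)
tispark_jar_name() {
  printf 'tispark-assembly-%s-%s.jar\n' "$1" "$2"
}

# Builds the GitHub release URL for the jar, using the same arguments.
tispark_jar_url() {
  printf 'https://github.com/pingcap/tispark/releases/download/v%s/%s\n' \
    "$2" "$(tispark_jar_name "$1" "$2")"
}

# Print the URL for the combination used in this post.
tispark_jar_url 3.1 2.5.1
```

The printed URL can then be passed to the download command, for example curl -L -O "$(tispark_jar_url 3.1 2.5.1)".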
​

3.2 Back up

\cp -rf /tidb-deploy/tispark-master-7077 /tidb-deploy/tispark-master-7077-bak2.4.1

3.3 Upgrade

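# Replace Spark
mkdir -p /usr/local0/webserver/tispark && tar -zxvf spark-3.1.3-bin-hadoop3.2.tgz -C /usr/local0/webserver/tispark/
# Remove the old directory first (it was backed up in 3.2); otherwise mv nests the new directory inside it
rm -rf /tidb-deploy/tispark-master-7077
mv /usr/local0/webserver/tispark/spark-3.1.3-bin-hadoop3.2 /tidb-deploy/tispark-master-7077
# Copy in the TiSpark jar
cp -rf tispark-assembly-3.1-2.5.1.jar /tidb-deploy/tispark-master-7077/jars/
# Restore the configuration files from the backup
cp -rf /tidb-deploy/tispark-master-7077-bak2.4.1/conf/* /tidb-deploy/tispark-master-7077/conf/
# Fix ownership last so that the copied jar and configuration files are included
chown -R tidb:tidb /tidb-deploy/tispark-master-7077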
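
Before starting the cluster, it can help to confirm that the replaced directory has the expected layout. The following is a minimal sketch, not an official tool: check_spark_home is a hypothetical function, and the rule that exactly one TiSpark assembly jar should sit in jars/ is an assumption made here to catch a leftover jar from the old version.

```shell
# Hypothetical helper, not part of Spark, TiSpark or tiup.
# $1: Spark home directory, e.g. /tidb-deploy/tispark-master-7077
check_spark_home() {
  spark_home=$1
  # Expand the glob into the positional parameters to count matching jars.
  set -- "$spark_home"/jars/tispark-assembly-*.jar
  if [ ! -e "$1" ] || [ "$#" -ne 1 ]; then
    echo "expected exactly one TiSpark assembly jar in $spark_home/jars" >&2
    return 1
  fi
  # The restored configuration and the start script must both be present.
  for f in conf/spark-defaults.conf sbin/start-all.sh; do
    if [ ! -f "$spark_home/$f" ]; then
      echo "missing $spark_home/$f" >&2
      return 1
    fi
  done
  echo "layout looks OK: $(basename "$1")"
}
```

For the directory used in this post the call would be check_spark_home /tidb-deploy/tispark-master-7077.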

3.4 Test

  • Start the Spark cluster

/tidb-deploy/tispark-master-7077/sbin/start-all.sh

  • Start spark-shell
# Start spark-shell
/tidb-deploy/tispark-master-7077/bin/spark-shell

# Run: spark.sql("select ti_version()").collect

  • Start spark-sql
# Start spark-sql
/tidb-deploy/tispark-master-7077/bin/spark-sql
# Run: select ti_version();


4. Test TiSpark v2.5.1 authentication

Reference: https://github.com/pingcap/tispark/blob/master/docs/authorization_userguide.md

Authorization and authentication through TiDB server

  • The database's user account must have the PROCESS privilege.
  • TiSpark version >= 2.5.0
  • Spark version = 3.0.x or 3.1.x

4.1 Add configuration to spark-defaults.conf

spark.sql.tidb.addr    10.0.2.15
spark.sql.tidb.port    4000
spark.sql.tidb.user    root
spark.sql.tidb.password abc
​
# Must config in conf file
spark.sql.auth.enable   true
# in seconds. Values range from 5 to 3600
spark.sql.tidb.auth.refreshInterval 30
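
Since the refresh interval is documented above as a value in seconds between 5 and 3600, a quick check of the configured value can catch an out-of-range setting early. The sketch below is an illustration under assumptions: conf_value and valid_refresh_interval are hypothetical helpers, and conf_value only understands the whitespace-separated "key value" form used in this post.

```shell
# Hypothetical helpers, not part of Spark or TiSpark.

# Prints the value of key $1 from conf file $2 ("key value" lines only).
conf_value() {
  awk -v k="$1" '$1 == k { print $2; exit }' "$2"
}

# Returns 0 when $1 is an integer in the documented range 5..3600.
valid_refresh_interval() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 5 ] && [ "$1" -le 3600 ]
}
```

A possible use is valid_refresh_interval "$(conf_value spark.sql.tidb.auth.refreshInterval /tidb-deploy/tispark-master-7077/conf/spark-defaults.conf)", which fails when the key is missing or the value is outside the range.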

4.2 Configure a wrong password

# This password is wrong
spark.sql.tidb.password abc

After starting spark-sql, executing an SQL statement fails with an error.

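4.3 Correct the password

# Empty password
spark.sql.tidb.password 

# With the setting below, a refresh happens every 30 seconds (a newly configured spark.sql.tidb.password only takes effect for newly started spark-sql sessions)
spark.sql.tidb.auth.refreshInterval 30

Start spark-sql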

/tidb-deploy/tispark-master-7077/bin/spark-sql
use tidb_catalog;
show databases;


select 'CUSTOMER' tablename , count(*) ct from tidb_catalog.TPCH_001.CUSTOMER union all
select 'NATION' tablename , count(*) ct from tidb_catalog.TPCH_001.NATION union all
select 'REGION' tablename , count(*) ct from tidb_catalog.TPCH_001.REGION union all
select 'PART' tablename , count(*) ct from tidb_catalog.TPCH_001.PART union all
select 'SUPPLIER' tablename , count(*) ct from tidb_catalog.TPCH_001.SUPPLIER union all
select 'PARTSUPP' tablename , count(*) ct from tidb_catalog.TPCH_001.PARTSUPP union all
select 'ORDERS' tablename , count(*) ct from tidb_catalog.TPCH_001.ORDERS union all
select 'LINEITEM' tablename , count(*) ct from tidb_catalog.TPCH_001.LINEITEM  order by ct desc;


4.4 Configure the password in SparkSession

spark.sqlContext.setConf("spark.sql.tidb.addr", your_tidb_server_address)
spark.sqlContext.setConf("spark.sql.tidb.port", your_tidb_server_port)
spark.sqlContext.setConf("spark.sql.tidb.user", your_tidb_server_user)
spark.sqlContext.setConf("spark.sql.tidb.password", your_tidb_server_password)

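4.5 Limitations

  • Does not work together with data sources other than TiDB
  • Role-based privileges are not supported
  • The TiDB Data Source API is not supported, for example TiBatchWrite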

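5. Summary

  1. This post walked through upgrading to TiSpark v2.5.1 when tiup list tispark --all does not offer TiSpark v2.5.x;
  2. It also tried out the authentication feature newly supported in TiSpark v2.5.x.

Thanks!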

References

https://tidb.net/blog/19eeb447#Spark Standalone集群升级步骤
https://tidb.net/blog/b8f902a9#TiSpark 2.4.1(Spark 2.4.5)到TiSpark 2.5.0(Spark 3.0.X/3.1.X)迁移实践
https://github.com/pingcap/tispark/blob/master/docs/authorization_userguide.md


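Copyright notice: This is an original article by a TiDB community user, licensed under CC BY-NC-SA 4.0. When reposting, please include a link to the original source and this notice.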
