Building and Installing TBase from Source
- Create the tbase user
Note: create this user on every machine that will host part of the TBase cluster.
mkdir /data
useradd -d /data/tbase tbase
- Get the source code
git clone https://github.com/Tencent/TBase
- Compile the source code
cd ${SOURCECODE_PATH}
rm -rf ${INSTALL_PATH}/tbase_bin_v2.0
chmod +x configure*
./configure --prefix=${INSTALL_PATH}/tbase_bin_v2.0 --enable-user-switch --with-openssl --with-ossp-uuid CFLAGS=-g
make clean
make -sj
make install
chmod +x contrib/pgxc_ctl/make_signature
cd contrib
make -sj
make install
In the environment used for this guide, the two variables above have the following values:
${SOURCECODE_PATH}=/data/tbase/TBase-master
${INSTALL_PATH}=/data/tbase/install
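Because the build steps run `rm -rf ${INSTALL_PATH}/tbase_bin_v2.0`, it can help to confirm both variables before starting. The following is a minimal sketch, not part of TBase: `check_build_paths` is a hypothetical helper, and the default values are simply the ones used in this guide.

```shell
#!/bin/sh
# Values used in this guide; adjust them to your own layout.
SOURCECODE_PATH=${SOURCECODE_PATH:-/data/tbase/TBase-master}
INSTALL_PATH=${INSTALL_PATH:-/data/tbase/install}

# check_build_paths SRC DEST (hypothetical helper, not part of TBase):
# succeeds only if SRC contains a configure script and DEST is a non-empty
# absolute path, so that the later "rm -rf" cannot hit an unintended location.
check_build_paths() (
    src=$1
    dest=$2
    if [ ! -f "$src/configure" ]; then
        echo "no configure script under '$src'" >&2
        exit 1
    fi
    case $dest in
        /?*) ;;
        *) echo "INSTALL_PATH must be an absolute path, got '$dest'" >&2; exit 1 ;;
    esac
)

# Example:
# check_build_paths "$SOURCECODE_PATH" "$INSTALL_PATH" && cd "$SOURCECODE_PATH"
```

The function body runs in a subshell, so its variables do not leak into the calling shell.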
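Cluster installation
- Cluster planning
The example below builds a cluster on two servers with 1 GTM master, 1 GTM slave, 2 CN masters (the CN masters are peers, so no standby CN is needed), 2 DN masters, and 2 DN slaves. This is the minimum configuration that provides disaster recovery.
Machine 1: 10.215.147.158
Machine 2: 10.240.138.159
The cluster plan is as follows: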
Node name  | IP             | Data directory
-----------+----------------+-----------------------
GTM master | 10.215.147.158 | /data/tbase/data/gtm
GTM slave  | 10.240.138.159 | /data/tbase/data/gtm
CN1        | 10.215.147.158 | /data/tbase/data/coord
CN2        | 10.240.138.159 | /data/tbase/data/coord
DN1 master | 10.215.147.158 | /data/tbase/data/dn001
DN1 slave  | 10.240.138.159 | /data/tbase/data/dn001
DN2 master | 10.240.138.159 | /data/tbase/data/dn002
DN2 slave  | 10.215.147.158 | /data/tbase/data/dn002
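A plan like this can be sanity-checked before anything is installed. The sketch below restates the plan as plain text and reports any two nodes that would share the same host and data directory. `find_dir_conflicts` is a hypothetical helper written for this illustration, and the underscore in the node names only replaces the space.

```shell
# The plan from this guide, one node per line: name, IP, data directory.
PLAN='GTM_master 10.215.147.158 /data/tbase/data/gtm
GTM_slave 10.240.138.159 /data/tbase/data/gtm
CN1 10.215.147.158 /data/tbase/data/coord
CN2 10.240.138.159 /data/tbase/data/coord
DN1_master 10.215.147.158 /data/tbase/data/dn001
DN1_slave 10.240.138.159 /data/tbase/data/dn001
DN2_master 10.240.138.159 /data/tbase/data/dn002
DN2_slave 10.215.147.158 /data/tbase/data/dn002'

# find_dir_conflicts (hypothetical helper): reads a plan on stdin and prints
# every node whose IP + data directory pair is already used by an earlier node.
find_dir_conflicts() {
    awk '{ key = $2 " " $3
           if (key in seen) print $1 " conflicts with " seen[key] " on " key
           else seen[key] = $1 }'
}

conflicts=$(printf '%s\n' "$PLAN" | find_dir_conflicts)
if [ -z "$conflicts" ]; then
    echo "plan OK: no two nodes share a host and data directory"
else
    echo "$conflicts"
fi
```

In this plan each master and its slave use the same directory name but sit on different machines, so no conflict is reported.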
[Diagram of the cluster layout]
For SSH mutual trust between the machines, see "Linux ssh mutual trust configuration".
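Once mutual trust is configured, it may be worth confirming that the tbase user can log in to every machine without a password prompt. The following is a minimal sketch: `check_ssh_trust` and the `SSH_CMD` override are hypothetical names introduced here, while `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```shell
# check_ssh_trust HOST... (hypothetical helper): tries a non-interactive login
# to each host and reports the ones that would still ask for a password.
# SSH_CMD can be overridden for testing; it defaults to the real ssh client.
check_ssh_trust() (
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "ok: $host"
        else
            echo "no passwordless login: $host" >&2
            failed=1
        fi
    done
    exit "$failed"
)

# Example, run as the tbase user on each machine:
# check_ssh_trust 10.215.147.158 10.240.138.159
```

The check should pass from both machines, in both directions.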
Environment variable configuration
This must be done on every machine in the cluster.
- [tbase@TENCENT64 ~]$ vim ~/.bashrc
- export TBASE_HOME=/data/tbase/install/tbase_bin_v2.0
- export PATH=$TBASE_HOME/bin:$PATH
- export LD_LIBRARY_PATH=$TBASE_HOME/lib:${LD_LIBRARY_PATH}
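After editing `~/.bashrc`, a quick check in a new shell can confirm that the three settings took effect. This is only a sketch; `check_tbase_env` is a hypothetical helper, not something shipped with TBase.

```shell
# check_tbase_env (hypothetical helper): verifies that TBASE_HOME is set and
# that its bin and lib directories appear in PATH and LD_LIBRARY_PATH.
check_tbase_env() (
    [ -n "${TBASE_HOME:-}" ] || { echo "TBASE_HOME is not set" >&2; exit 1; }
    case ":$PATH:" in
        *":$TBASE_HOME/bin:"*) ;;
        *) echo "$TBASE_HOME/bin is not in PATH" >&2; exit 1 ;;
    esac
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$TBASE_HOME/lib:"*) ;;
        *) echo "$TBASE_HOME/lib is not in LD_LIBRARY_PATH" >&2; exit 1 ;;
    esac
)

# Example:
# . ~/.bashrc && check_tbase_env && echo "environment looks good"
```

Wrapping each variable in colons before matching avoids false positives on directories that merely share a prefix.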
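With the steps above, the basic environment is ready and cluster initialization can begin. For convenience, TBase provides a dedicated configuration and operations tool, pgxc_ctl, which helps users build and manage a cluster quickly. First, write the IPs, ports, and directories of the nodes described above into the configuration file pgxc_ctl.conf.
- Initialize the pgxc_ctl.conf file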
- [tbase@TENCENT64 ~]$ mkdir /data/tbase/pgxc_ctl
- [tbase@TENCENT64 ~]$ cd /data/tbase/pgxc_ctl
- [tbase@TENCENT64 ~/pgxc_ctl]$ vim pgxc_ctl.conf
The pgxc_ctl.conf file below is written from the IPs, ports, data directories, and binary directory planned above. In practice, adjust it to match your own environment.
- #!/bin/bash
- pgxcInstallDir=/data/tbase/install/tbase_bin_v2.0
- pgxcOwner=tbase
- defaultDatabase=postgres
- pgxcUser=$pgxcOwner
- tmpDir=/tmp
- localTmpDir=$tmpDir
- configBackup=n
- configBackupHost=pgxc-linker
- configBackupDir=$HOME/pgxc
- configBackupFile=pgxc_ctl.bak
- #---- GTM ----------
- gtmName=gtm
- gtmMasterServer=10.215.147.158
- gtmMasterPort=50001
- gtmMasterDir=/data/tbase/data/gtm
- gtmExtraConfig=none
- gtmMasterSpecificExtraConfig=none
- gtmSlave=y
- gtmSlaveServer=10.240.138.159
- gtmSlavePort=50001
- gtmSlaveDir=/data/tbase/data/gtm
- gtmSlaveSpecificExtraConfig=none
- #---- Coordinators -------
- coordMasterDir=/data/tbase/data/coord
- coordArchLogDir=/data/tbase/data/coord_archlog
- coordNames=(cn001 cn002 )
- coordPorts=(30004 30004 )
- poolerPorts=(31110 31110 )
- coordPgHbaEntries=(0.0.0.0/0)
- coordMasterServers=(10.215.147.158 10.240.138.159)
- coordMasterDirs=($coordMasterDir $coordMasterDir)
- coordMaxWALsernder=2
- coordMaxWALSenders=($coordMaxWALsernder $coordMaxWALsernder )
- coordSlave=n
- coordSlaveSync=n
- coordArchLogDirs=($coordArchLogDir $coordArchLogDir)
- coordExtraConfig=coordExtraConfig
- cat > $coordExtraConfig <<EOF
- #================================================
- # Added to all the coordinator postgresql.conf
- # Original: $coordExtraConfig
- include_if_exists = '/data/tbase/global/global_tbase.conf'
- wal_level = replica
- wal_keep_segments = 256
- max_wal_senders = 4
- archive_mode = on
- archive_timeout = 1800
- archive_command = 'echo 0'
- log_truncate_on_rotation = on
- log_filename = 'postgresql-%M.log'
- log_rotation_age = 4h
- log_rotation_size = 100MB
- hot_standby = on
- wal_sender_timeout = 30min
- wal_receiver_timeout = 30min
- shared_buffers = 1024MB
- max_pool_size = 2000
- log_statement = 'ddl'
- log_destination = 'csvlog'
- logging_collector = on
- log_directory = 'pg_log'
- listen_addresses = '*'
- max_connections = 2000
- EOF
- coordSpecificExtraConfig=(none none)
- coordExtraPgHba=coordExtraPgHba
- cat > $coordExtraPgHba <<EOF
- local all all trust
- host all all 0.0.0.0/0 trust
- host replication all 0.0.0.0/0 trust
- host all all ::1/128 trust
- host replication all ::1/128 trust
- EOF
- coordSpecificExtraPgHba=(none none)
- coordAdditionalSlaves=n
- cad1_Sync=n
- #---- Datanodes ---------------------
- dn1MstrDir=/data/tbase/data/dn001
- dn2MstrDir=/data/tbase/data/dn002
- dn1SlvDir=/data/tbase/data/dn001
- dn2SlvDir=/data/tbase/data/dn002
- dn1ALDir=/data/tbase/data/datanode_archlog
- dn2ALDir=/data/tbase/data/datanode_archlog
- primaryDatanode=dn001
- datanodeNames=(dn001 dn002)
- datanodePorts=(40004 40004)
- datanodePoolerPorts=(41110 41110)
- datanodePgHbaEntries=(0.0.0.0/0)
- datanodeMasterServers=(10.215.147.158 10.240.138.159)
- datanodeMasterDirs=($dn1MstrDir $dn2MstrDir)
- dnWALSndr=4
- datanodeMaxWALSenders=($dnWALSndr $dnWALSndr)
- datanodeSlave=y
- datanodeSlaveServers=(10.240.138.159 10.215.147.158)
- datanodeSlavePorts=(50004 54004)
- datanodeSlavePoolerPorts=(51110 51110)
- datanodeSlaveSync=n
- datanodeSlaveDirs=($dn1SlvDir $dn2SlvDir)
- datanodeArchLogDirs=($dn1ALDir/dn001 $dn2ALDir/dn002)
- datanodeExtraConfig=datanodeExtraConfig
- cat > $datanodeExtraConfig <<EOF
- #================================================
- # Added to all the datanode postgresql.conf
- # Original: $datanodeExtraConfig
- include_if_exists = '/data/tbase/global/global_tbase.conf'
- listen_addresses = '*'
- wal_level = replica
- wal_keep_segments = 256
- max_wal_senders = 4
- archive_mode = on
- archive_timeout = 1800
- archive_command = 'echo 0'
- log_directory = 'pg_log'
- logging_collector = on
- log_truncate_on_rotation = on
- log_filename = 'postgresql-%M.log'
- log_rotation_age = 4h
- log_rotation_size = 100MB
- hot_standby = on
- wal_sender_timeout = 30min
- wal_receiver_timeout = 30min
- shared_buffers = 1024MB
- max_connections = 4000
- max_pool_size = 4000
- log_statement = 'ddl'
- log_destination = 'csvlog'
- wal_buffers = 1GB
- EOF
- datanodeSpecificExtraConfig=(none none)
- datanodeExtraPgHba=datanodeExtraPgHba
- cat > $datanodeExtraPgHba <<EOF
- local all all trust
- host all all 0.0.0.0/0 trust
- host replication all 0.0.0.0/0 trust
- host all all ::1/128 trust
- host replication all ::1/128 trust
- EOF
- datanodeSpecificExtraPgHba=(none none)
- datanodeAdditionalSlaves=n
- walArchive=n
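The file groups per-node settings into parallel arrays (names, ports, servers, and so on), and entries appear to be matched by position, so a missing element in one array would shift everything after it. The sketch below checks that a set of arrays has the same number of entries. `same_length` is a hypothetical helper, and the arrays are passed as space-separated strings rather than read from the file, because sourcing pgxc_ctl.conf would also execute its `cat > ...` lines.

```shell
# same_length LIST... (hypothetical helper): each argument is the body of one
# array from pgxc_ctl.conf written as a space-separated string; succeeds only
# if all of them contain the same number of entries.
same_length() (
    set -f   # disable globbing so entries are split on whitespace only
    expected=
    for list in "$@"; do
        n=$(set -- $list; echo "$#")
        if [ -z "$expected" ]; then
            expected=$n
        elif [ "$n" -ne "$expected" ]; then
            echo "expected $expected entries, got $n in: $list" >&2
            exit 1
        fi
    done
)

# Coordinator arrays from this guide: names, ports, pooler ports, servers.
same_length "cn001 cn002" "30004 30004" "31110 31110" \
            "10.215.147.158 10.240.138.159" \
    && echo "coordinator arrays are consistent"
```

The same call can be repeated for the datanode master and slave arrays.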
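- Distribute the binary package
After the configuration file is ready on one node, the binaries must be deployed in advance to every machine that hosts a node. This can be done with the pgxc_ctl tool by running the deploy all command.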
- [tbase@TENCENT64 ~/pgxc_ctl]$ pgxc_ctl
- /usr/bin/bash
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Reading configuration using /data/tbase/pgxc_ctl/pgxc_ctl_bash --home /data/tbase/pgxc_ctl --configuration /data/tbase/pgxc_ctl/pgxc_ctl.conf
- Finished reading configuration.
- ******** PGXC_CTL START ***************
- Current directory: /data/tbase/pgxc_ctl
- PGXC deploy all
- Deploying Postgres-XL components to all the target servers.
- Prepare tarball to deploy ...
- Deploying to the server 10.215.147.158.
- Deploying to the server 10.240.138.159.
- Deployment done.
- Log in to every node and check that the binary package was distributed correctly
- [tbase@TENCENT64 ~/install]$ ls /data/tbase/install/tbase_bin_v2.0
- bin include lib share
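The same check can be scripted so it does not depend on reading the listing by eye. The following is a minimal sketch: `check_install_dir` is a hypothetical helper, and it assumes a complete install contains exactly the four sub-directories shown in the listing above.

```shell
# check_install_dir DIR (hypothetical helper): succeeds if DIR contains the
# four sub-directories shown in the listing (bin, include, lib, share).
check_install_dir() (
    dir=$1
    for sub in bin include lib share; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing: $dir/$sub" >&2
            exit 1
        fi
    done
)

# Example, run on each node:
# check_install_dir /data/tbase/install/tbase_bin_v2.0 && echo "binaries present"
```

With passwordless SSH in place, the function can also be run remotely, for example by copying the script to each host and invoking it over ssh.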
- Run the init all command to initialize the cluster
- [tbase@TENCENT64 ~]$ pgxc_ctl
- /usr/bin/bash
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Reading configuration using /data/tbase/pgxc_ctl/pgxc_ctl_bash --home /data/tbase/pgxc_ctl --configuration /data/tbase/pgxc_ctl/pgxc_ctl.conf
- Finished reading configuration.
- ******** PGXC_CTL START ***************
- Current directory: /data/tbase/pgxc_ctl
- PGXC init all
- Initialize GTM master
- ....
- ....
- Initialize datanode slave dn001
- Initialize datanode slave dn002
- mkdir: cannot create directory '/data1/tbase': Permission denied
- chmod: cannot access '/data1/tbase/data/dn001': No such file or directory
- pg_ctl: directory "/data1/tbase/data/dn001" does not exist
- pg_basebackup: could not create directory "/data1/tbase": Permission denied
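- Handling installation errors
When init fails, the terminal usually prints the error log. Read the cause and correct the configuration accordingly. You can also inspect the error logs under /data/tbase/pgxc_ctl/pgxc_log to track down mistakes in the configuration file.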
- [tbase@TENCENT64 ~]$ ll ~/pgxc_ctl/pgxc_log/
- total 184
- -rw-rw-r-- 1 tbase tbase 81123 Nov 13 17:22 14105_pgxc_ctl.log
- -rw-rw-r-- 1 tbase tbase 2861 Nov 13 17:58 15762_pgxc_ctl.log
- -rw-rw-r-- 1 tbase tbase 14823 Nov 14 07:59 16671_pgxc_ctl.log
- -rw-rw-r-- 1 tbase tbase 2721 Nov 13 16:52 18891_pgxc_ctl.log
- -rw-rw-r-- 1 tbase tbase 1409 Nov 13 16:20 22603_pgxc_ctl.log
- -rw-rw-r-- 1 tbase tbase 60043 Nov 13 16:33 28932_pgxc_ctl.log
- -rw-rw-r-- 1 tbase tbase 15671 Nov 14 07:57 6849_pgxc_ctl.log
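Since each pgxc_ctl run writes its own log file, the most recently modified one is usually the one to read. The sketch below picks it out; `latest_pgxc_log` is a hypothetical helper, and the grep pattern in the example is only a guess at useful keywords, taken from the error messages shown earlier.

```shell
# latest_pgxc_log DIR (hypothetical helper): prints the most recently
# modified *_pgxc_ctl.log file in DIR, or fails if there is none.
latest_pgxc_log() (
    dir=$1
    newest=$(ls -t "$dir"/*_pgxc_ctl.log 2>/dev/null | head -n 1)
    [ -n "$newest" ] || { echo "no pgxc_ctl logs in $dir" >&2; exit 1; }
    echo "$newest"
)

# Example: show error-like lines from the latest run.
# grep -inE 'denied|cannot|could not|does not exist' \
#     "$(latest_pgxc_log /data/tbase/pgxc_ctl/pgxc_log)"
```

Sorting by modification time avoids relying on the numeric prefix of the file names, which is not ordered by time in the listing above.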
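Run the pgxc_ctl tool and execute the clean all command to delete the files that were already initialized, correct the pgxc_ctl.conf file, and then run init all again to restart initialization.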
- [tbase@TENCENT64 ~]$ pgxc_ctl
- /usr/bin/bash
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Reading configuration using /data/tbase/pgxc_ctl/pgxc_ctl_bash --home /data/tbase/pgxc_ctl --configuration /data/tbase/pgxc_ctl/pgxc_ctl.conf
- Finished reading configuration.
- ******** PGXC_CTL START ***************
- Current directory: /data/tbase/pgxc_ctl
- PGXC clean all
- [tbase@TENCENT64 ~]$ pgxc_ctl
- /usr/bin/bash
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Reading configuration using /data/tbase/pgxc_ctl/pgxc_ctl_bash --home /data/tbase/pgxc_ctl --configuration /data/tbase/pgxc_ctl/pgxc_ctl.conf
- Finished reading configuration.
- ******** PGXC_CTL START ***************
- Current directory: /data/tbase/pgxc_ctl
- PGXC init all
- Initialize GTM master
- EXECUTE DIRECT ON (dn002) 'ALTER NODE dn002 WITH (TYPE=''datanode'', HOST=''10.240.138.159'', PORT=40004, PREFERRED)';
- EXECUTE DIRECT
- EXECUTE DIRECT ON (dn002) 'SELECT pgxc_pool_reload()';
- pgxc_pool_reload
- ------------------
- t
- (1 row)
- Done.
- Check cluster status
When you see the output above, the cluster is up. You can also check cluster status with the monitor all command of the pgxc_ctl tool.
- [tbase@TENCENT64 ~/pgxc_ctl]$ pgxc_ctl
- /usr/bin/bash
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Reading configuration using /data/tbase/pgxc_ctl/pgxc_ctl_bash --home /data/tbase/pgxc_ctl --configuration /data/tbase/pgxc_ctl/pgxc_ctl.conf
- Finished reading configuration.
- ******** PGXC_CTL START ***************
- Current directory: /data/tbase/pgxc_ctl
- PGXC monitor all
- Running: gtm master
- Not running: gtm slave
- Running: coordinator master cn001
- Running: coordinator master cn002
- Running: datanode master dn001
- Running: datanode slave dn001
- Running: datanode master dn002
- Not running: datanode slave dn002
In general, if strong synchronous mode is not configured, a failure of the GTM slave or a DN slave does not affect access.
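For scripted health checks, the monitor all report can be reduced to just the nodes that are down. The following is a minimal sketch that works on saved or piped report text. `not_running` is a hypothetical helper, and the sample is the report shown above.

```shell
# not_running (hypothetical helper): reads a "monitor all" report on stdin and
# prints the nodes reported as "Not running", one per line.
not_running() {
    sed -n 's/^[- ]*Not running: *//p'
}

# The report from this guide, used here as sample input.
SAMPLE='Running: gtm master
Not running: gtm slave
Running: coordinator master cn001
Running: coordinator master cn002
Running: datanode master dn001
Running: datanode slave dn001
Running: datanode master dn002
Not running: datanode slave dn002'

printf '%s\n' "$SAMPLE" | not_running
# prints:
# gtm slave
# datanode slave dn002
```

If your pgxc_ctl build accepts a command on its command line and writes the report to standard output (an assumption, not verified here), the report could be piped directly: `pgxc_ctl monitor all | not_running`.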
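- Accessing the cluster
Accessing a TBase cluster is essentially the same as accessing a standalone PostgreSQL server. The cluster can be reached through any CN. For example, connect to a CN node and select from the pgxc_node table to view the cluster topology (with the current configuration, standby nodes are not shown in pgxc_node). A concrete example using psql from the Linux command line follows.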
- [tbase@TENCENT64 ~/pgxc_ctl]$ psql -h 10.215.147.158 -p 30004 -d postgres -U tbase
- psql (PostgreSQL 10.0 TBase V2)
- Type "help" for help.
- postgres=# \d
- Did not find any relations.
- postgres=# select * from pgxc_node;
- node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred | node_id | node_cluster_name
- -----------+-----------+-----------+----------------+----------------+------------------+------------+-------------------
- gtm | G | 50001 | 10.215.147.158 | t | f | 428125959 | tbase_cluster
- cn001 | C | 30004 | 10.215.147.158 | f | f | -264077367 | tbase_cluster
- cn002 | C | 30004 | 10.240.138.159 | f | f | -674870440 | tbase_cluster
- dn001 | D | 40004 | 10.215.147.158 | t | t | 2142761564 | tbase_cluster
- dn002 | D | 40004 | 10.240.138.159 | f | f | -17499968 | tbase_cluster
- (5 rows)
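- Create the default group and the sharding table before using the database
TBase uses datanode groups to make node management more flexible, and it requires a default group before the database can be used, so one must be created in advance. Normally, all datanodes in the cluster are added to the default group. In addition, to make data distribution more flexible, TBase adds an intermediate logical layer, called sharding, that maintains the mapping from data records to physical nodes. The sharding must therefore also be created in advance. The commands are as follows: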
- postgres=# create default node group default_group with (dn001,dn002);
- CREATE NODE GROUP
- postgres=# create sharding group to group default_group;
- CREATE SHARDING GROUP
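When a cluster has more datanodes than this example, the two statements can be generated from the node names instead of typed by hand. The sketch below only builds the SQL text shown above. `node_group_sql` is a hypothetical helper, and no statement syntax beyond what appears in this guide is assumed.

```shell
# node_group_sql GROUP NODE... (hypothetical helper): prints the two statements
# from this guide for the given group name and datanode names.
node_group_sql() (
    group=$1
    shift
    # "$*" joins the remaining arguments with the first character of IFS.
    nodes=$(IFS=,; echo "$*")
    echo "create default node group $group with ($nodes);"
    echo "create sharding group to group $group;"
)

node_group_sql default_group dn001 dn002
# prints:
# create default node group default_group with (dn001,dn002);
# create sharding group to group default_group;
```

The output can be piped into psql connected to a CN, for example `node_group_sql default_group dn001 dn002 | psql -h 10.215.147.158 -p 30004 -d postgres -U tbase`.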
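- Create databases, users, and tables; insert, delete, query, and update data
From this point on, the database cluster can be used just like a standalone database.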
- postgres=# create database test;
- CREATE DATABASE
- postgres=# create user test with password 'test';
- CREATE ROLE
- postgres=# alter database test owner to test;
- ALTER DATABASE
- postgres=# \c test test
- You are now connected to database "test" as user "test".
- test=> create table foo(id bigint, str text) distribute by shard(id);
- CREATE TABLE
- test=> insert into foo values(1, 'tencent'), (2, 'shenzhen');
- COPY 2
- test=> select * from foo;
- id | str
- ----+----------
- 1 | tencent
- 2 | shenzhen
- (2 rows)
- Stop the cluster
Use the stop all command of the pgxc_ctl tool to stop the cluster. stop all can be followed by -m fast or -m immediate to control how each node is stopped.
- PGXC stop all -m fast
- Stopping all the coordinator masters.
- Stopping coordinator master cn001.
- Stopping coordinator master cn002.
- Done.
- Stopping all the datanode slaves.
- Stopping datanode slave dn001.
- Stopping datanode slave dn002.
- pg_ctl: PID file "/data/tbase/data/dn002/postmaster.pid" does not exist
- Is server running?
- Stopping all the datanode masters.
- Stopping datanode master dn001.
- Stopping datanode master dn002.
- Done.
- Stop GTM slave
- waiting for server to shut down..... done
- server stopped
- Stop GTM master
- waiting for server to shut down.... done
- server stopped
- PGXC monitor all
- Not running: gtm master
- Not running: gtm slave
- Not running: coordinator master cn001
- Not running: coordinator master cn002
- Not running: datanode master dn001
- Not running: datanode slave dn001
- Not running: datanode master dn002
- Not running: datanode slave dn002
- Start the cluster
Use the start all command of the pgxc_ctl tool to start the cluster.
- [tbase@TENCENT64 ~]$ pgxc_ctl
- /usr/bin/bash
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Installing pgxc_ctl_bash script as /data/tbase/pgxc_ctl/pgxc_ctl_bash.
- Reading configuration using /data/tbase/pgxc_ctl/pgxc_ctl_bash --home /data/tbase/pgxc_ctl --configuration /data/tbase/pgxc_ctl/pgxc_ctl.conf
- Finished reading configuration.
- ******** PGXC_CTL START ***************
- Current directory: /data/tbase/pgxc_ctl
- PGXC start all
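- Conclusion
This document is only a simple guide that shows how to build a complete TBase cluster step by step, starting from source. Later articles will cover using TBase features, tuning, troubleshooting, and more.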