Background

PostgreSQL 12 was officially released on 2019-10-03. This post benchmarks its TPC-C performance.

Environment

An Alibaba Cloud ECS virtual machine:

```
[root@PostgreSQL12 ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
Stepping: 4
CPU MHz: 2500.014
BogoMIPS: 5000.02
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 33792K
NUMA node0 CPU(s): 0-15
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 spec_ctrl intel_stibp
[root@PostgreSQL12 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 200G 0 disk
└─vda1 253:1 0 200G 0 part /
vdb 253:16 0 1.8T 0 disk
└─vdb1 253:17 0 1.8T 0 part /data01
vdc 253:32 0 1.8T 0 disk
└─vdc1 253:33 0 1.8T 0 part /data02
[root@PostgreSQL12 ~]# uname -a
Linux PostgreSQL12 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@PostgreSQL12 ~]# free -g
total used free shared buff/cache available
Mem: 125 5 79 19 39 91
Swap: 0 0 0
```

Test Method

1. Block device setup

```
parted -s /dev/vdb mklabel gpt
parted -s /dev/vdc mklabel gpt
parted -s /dev/vdb mkpart primary 1MiB 100%
parted -s /dev/vdc mkpart primary 1MiB 100%
```
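Before laying down file systems, it may be worth confirming that both partitions came out as expected; a quick check (output will vary by environment):

```
lsblk /dev/vdb /dev/vdc
parted -s /dev/vdb print
parted -s /dev/vdc print
```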

2. File system setup

```
mkfs.ext4 /dev/vdb1 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -T largefile -L data01
mkfs.ext4 /dev/vdc1 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -T largefile -L data02

vi /etc/fstab
LABEL=data01 /data01 ext4 defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback 0 0
LABEL=data02 /data02 ext4 defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback 0 0

mkdir /data01
mkdir /data02
mount -a
```
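A quick sanity check that the fstab options (noatime, nodelalloc, data=writeback, etc.) actually took effect after `mount -a`; a minimal sketch:

```
findmnt /data01
findmnt /data02
# or:
mount | grep -E '/data0[12]'
```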

3. Kernel parameter settings

```
vi /etc/sysctl.conf

# add by digoal.zhou
fs.aio-max-nr = 1048576
fs.file-max = 76724600
# optional: kernel.core_pattern = /data01/corefiles/core_%e_%u_%t_%s.%p
# create /data01/corefiles beforehand with mode 777; if it is a symlink, set the target directory to 777 as well
kernel.sem = 4096 2147483647 2147483646 512000
# semaphores; inspect with ipcs -l or -u. PostgreSQL uses one semaphore set per 16 processes, and each set needs 17 semaphores.
kernel.shmall = 107374182
# limit on the combined size of all shared memory segments (80% of RAM is recommended), in pages
kernel.shmmax = 274877906944
# maximum size of a single shared memory segment (half of RAM is recommended), in bytes; versions above 9.2 use far less shared memory
kernel.shmmni = 819200
# how many shared memory segments may exist in total; each PG cluster needs at least 2
net.core.netdev_max_backlog = 10000
net.core.rmem_default = 262144
# The default setting of the socket receive buffer in bytes.
net.core.rmem_max = 4194304
# The maximum receive socket buffer size in bytes.
net.core.wmem_default = 262144
# The default setting (in bytes) of the socket send buffer.
net.core.wmem_max = 4194304
# The maximum send socket buffer size in bytes.
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_keepalive_intvl = 20
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_mem = 8388608 12582912 16777216
net.ipv4.tcp_fin_timeout = 5
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syncookies = 1
# enable SYN cookies: when the SYN backlog overflows, fall back to cookies, which guards against small-scale SYN flood attacks
net.ipv4.tcp_timestamps = 1
# reduces time_wait sockets
net.ipv4.tcp_tw_recycle = 0
# 1 enables fast recycling of TIME-WAIT sockets, but this can break connections behind NAT; keeping it off on servers is recommended
net.ipv4.tcp_tw_reuse = 1
# allow TIME-WAIT sockets to be reused for new TCP connections
net.ipv4.tcp_max_tw_buckets = 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.tcp_wmem = 8192 65536 16777216
net.nf_conntrack_max = 1200000
net.netfilter.nf_conntrack_max = 1200000
vm.dirty_background_bytes = 409600000
# once this many bytes are dirty, the kernel background flusher (pdflush or its successors) starts writing out pages dirtied more than (dirty_expire_centisecs/100) seconds ago
# the default is 10% of RAM; on large-memory machines an absolute byte value is recommended
vm.dirty_expire_centisecs = 3000
# dirty pages older than this are flushed to disk; 3000 means 30 seconds
vm.dirty_ratio = 95
# if background flushing falls behind and dirty pages exceed 95% of RAM, user processes that write to disk (fsync, fdatasync, etc.) must flush dirty pages themselves
# this effectively stops user processes from flushing dirty pages; very useful when running several instances per host with per-instance IOPS capped via cgroups
vm.dirty_writeback_centisecs = 100
# wakeup interval of the background flusher; 100 means 1 second
vm.swappiness = 0
# do not use swap
vm.mmap_min_addr = 65536
vm.overcommit_memory = 0
# allow modest overcommit when allocating memory; if set to 1, the kernel assumes memory is always sufficient, which can be used in low-memory test environments
vm.overcommit_ratio = 90
# used to compute the amount of memory that may be committed when overcommit_memory = 2
vm.zone_reclaim_mode = 0
# disable NUMA zone reclaim (or disable NUMA in vmlinux)
net.ipv4.ip_local_port_range = 40000 65535
# range of TCP/UDP ports assigned automatically for local connections
fs.nr_open = 20480000
# per-process limit on open file handles

# note the following parameters
vm.extra_free_kbytes = 4096000
vm.min_free_kbytes = 2097152  # roughly 1 GB of vm.min_free_kbytes per 32 GB of RAM is recommended
# on small-memory machines, do not set the two values above
# vm.nr_hugepages = 66536
# huge pages are recommended when shared_buffers exceeds 64 GB; see Hugepagesize in /proc/meminfo for the page size
vm.lowmem_reserve_ratio = 1 1 1
# recommended when RAM exceeds 64 GB; otherwise keep the default 256 256 32

sysctl -p
```
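After `sysctl -p`, it may be worth spot-checking the values most relevant to PostgreSQL; a minimal sketch:

```
sysctl -n kernel.shmmax kernel.shmall kernel.sem
sysctl -n vm.dirty_background_bytes vm.dirty_ratio vm.swappiness
ipcs -l    # semaphore and shared memory limits as seen through the IPC interface
```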

4. Resource limit settings

```
vi /etc/security/limits.conf

* soft nofile 1024000
* hard nofile 1024000
* soft nproc unlimited
* hard nproc unlimited
* soft core unlimited
* hard core unlimited
* soft memlock unlimited
* hard memlock unlimited
```
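limits.conf only applies to sessions opened after the change; verify from a fresh login, for example:

```
su - postgres -c 'ulimit -n; ulimit -u; ulimit -c; ulimit -l'
# expect: 1024000 / unlimited / unlimited / unlimited
```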

5. Start on boot

```
vi /etc/rc.local

if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
su - postgres -c "pg_ctl start"

chmod +x /etc/rc.d/rc.local
```
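rc.local only runs at boot; to disable transparent huge pages immediately without rebooting, and confirm it:

```
echo never > /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/enabled   # expect: always madvise [never]
```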

6. EPEL packages

```
rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum -y install coreutils glib2 lrzsz dstat sysstat e4fsprogs xfsprogs ntp readline-devel zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc gcc-c++ make smartmontools flex bison perl-devel perl-ExtUtils* openldap-devel jadetex openjade bzip2 git iotop
```

7. PostgreSQL 12 packages

```
yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum install -y postgresql12*
```
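A quick check that the PostgreSQL 12 packages and binaries landed where the environment variables below assume:

```
rpm -qa | grep postgresql12
/usr/pgsql-12/bin/postgres --version
```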

8. PostgreSQL 12 environment variables

```
su - postgres
vi .bash_profile

export PS1="$USER@`/bin/hostname -s`-> "
export PGPORT=1921
export PGDATA=/data01/pg12/pg_root$PGPORT
export LANG=en_US.utf8
export PGHOME=/usr/pgsql-12
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH
export DATE=`date +"%Y%m%d%H%M"`
export PATH=$PGHOME/bin:$PATH:.
export MANPATH=$PGHOME/share/man:$MANPATH
export PGHOST=$PGDATA
export PGUSER=postgres
export PGDATABASE=postgres
alias rm='rm -i'
alias ll='ls -lh'
unalias vi
```

9. Create the PostgreSQL 12 directories

```
mkdir /data01/pg12
mkdir /data02/pg12
chown postgres:postgres /data01/pg12
chown postgres:postgres /data02/pg12
```

10. Initialize the PostgreSQL 12 instance

```
su - postgres
initdb -D $PGDATA -X /data02/pg12/pg_wal1921 -U postgres -E SQL_ASCII --locale=C
```
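initdb's -X option places the WAL directory on the second disk so WAL and data I/O do not compete; $PGDATA/pg_wal should end up as a symlink. A quick check:

```
ls -ld $PGDATA/pg_wal
# expect something like: pg_wal -> /data02/pg12/pg_wal1921
```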

11. Database parameter settings

```
vi $PGDATA/postgresql.auto.conf

listen_addresses = '0.0.0.0'
port = 1921
max_connections = 1000
superuser_reserved_connections = 13
unix_socket_directories = '., /var/run/postgresql, /tmp'
unix_socket_permissions = 0700
tcp_keepalives_idle = 60
tcp_keepalives_interval = 10
tcp_keepalives_count = 10
tcp_user_timeout = 60
shared_buffers = 32GB
maintenance_work_mem = 2GB
dynamic_shared_memory_type = posix
max_files_per_process = 2000
vacuum_cost_delay = 0
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 10.0
effective_io_concurrency = 0
max_worker_processes = 8
max_parallel_maintenance_workers = 4
max_parallel_workers_per_gather = 0
max_parallel_workers = 8
wal_level = minimal
synchronous_commit = off
full_page_writes = off
wal_buffers = 16MB
wal_writer_delay = 10ms
checkpoint_timeout = 15min
max_wal_size = 128GB
min_wal_size = 16GB
checkpoint_completion_target = 0.1
max_wal_senders = 0
random_page_cost = 1.2
effective_cache_size = 128GB
jit = off
log_destination = 'csvlog'
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_checkpoints = on
log_error_verbosity = verbose
log_line_prefix = '%m [%p] '
log_statement = 'ddl'
log_timezone = 'Asia/Shanghai'
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_max_workers = 3
autovacuum_vacuum_scale_factor = 0.02
autovacuum_analyze_scale_factor = 0.01
autovacuum_freeze_max_age = 800000000
autovacuum_multixact_freeze_max_age = 900000000
autovacuum_vacuum_cost_delay = 0ms
vacuum_freeze_min_age = 500000000
vacuum_freeze_table_age = 750000000
vacuum_multixact_freeze_min_age = 5000000
vacuum_multixact_freeze_table_age = 750000000
datestyle = 'iso, mdy'
timezone = 'Asia/Shanghai'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
default_text_search_config = 'pg_catalog.english'
```

12. Database access control (pg_hba.conf)

```
vi $PGDATA/pg_hba.conf

host all all 192.168.0.0/24 trust
```

13. Start the database

```
pg_ctl start
```
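Once started, confirm the server is up and that the key settings from step 11 were picked up (PGHOST and PGPORT come from the environment variables set earlier); a minimal sketch:

```
pg_ctl status
psql -c "select version();"
psql -c "show shared_buffers;"
psql -c "show wal_level;"
```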

14. Tablespace setup

```
mkdir /data01/pg12/tbs1
mkdir /data02/pg12/tbs2

psql
create tablespace tbs1 location '/data01/pg12/tbs1';
create tablespace tbs2 location '/data02/pg12/tbs2';
```
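To confirm both tablespaces exist and point at the intended directories:

```
psql -c "select spcname, pg_tablespace_location(oid) from pg_tablespace;"
```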

15. sysbench deployment

```
curl -s https://packagecloud.io/install/repositories/akopytov/sysbench/script.rpm.sh | sudo bash
sudo yum -y install sysbench

su - postgres
git clone https://github.com/digoal/sysbench-tpcc
cd sysbench-tpcc
chmod 700 *.lua
```

16. Cleaning up data

```
drop schema public cascade;
create schema public;
grant all on schema public to public;
```

17. Load the initial data (load speed roughly 37 MB/s)

```
export pgsql_table_options="tablespace tbs1"
export pgsql_index_options="tablespace tbs2"
```

Load 1,000 warehouses (1 set of tables, 112 GB):

```
nohup time ./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql prepare >./out.log 2>&1 &
```

Load 10,000 warehouses (10 sets of tables, 1,000 warehouses each, 1,120 GB):

```
nohup time ./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=10 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql prepare >./out.log 2>&1 &
```

18. Run the benchmark

For a remote client, a concurrency of about 3x the CPU core count is recommended; for a local client, about 2x.

Run without purge:

```
# 1000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=3600 --report-interval=5 run

# 10000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=10 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=3600 --report-interval=5 run
```

Run with purge:

```
# 1000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=3600 --report-interval=5 --enable_purge=yes run

# 10000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=10 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=3600 --report-interval=5 --enable_purge=yes run
```

cleanup

```
# 1000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=1 --scale=1000 --trx_level=RC --db-driver=pgsql cleanup

# 10000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=10 --scale=1000 --trx_level=RC --db-driver=pgsql cleanup
```

Interpreting the Results

Run without purge:

```
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=32 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=60 --report-interval=5 run
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:
Number of threads: 32
Report intermediate results every 5 second(s)
Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 5s ] thds: 32 tps: 3248.44 qps: 93258.54 (r/w/o: 42390.64/44038.35/6829.54) lat (ms,95%): 27.66 err/s 10.00 reconn/s: 0.00
[ 10s ] thds: 32 tps: 3626.37 qps: 102832.62 (r/w/o: 46883.60/48696.28/7252.74) lat (ms,95%): 23.52 err/s 14.00 reconn/s: 0.00
[ 15s ] thds: 32 tps: 3838.38 qps: 109478.46 (r/w/o: 49903.95/51897.94/7676.56) lat (ms,95%): 21.50 err/s 18.60 reconn/s: 0.00
[ 20s ] thds: 32 tps: 4006.41 qps: 114816.04 (r/w/o: 52365.11/54437.71/8013.22) lat (ms,95%): 20.00 err/s 19.20 reconn/s: 0.00
[ 25s ] thds: 32 tps: 4103.01 qps: 116394.38 (r/w/o: 53051.28/55137.28/8205.81) lat (ms,95%): 20.00 err/s 17.20 reconn/s: 0.00
[ 30s ] thds: 32 tps: 4115.59 qps: 116128.74 (r/w/o: 52981.68/54915.87/8231.18) lat (ms,95%): 20.00 err/s 15.20 reconn/s: 0.00
[ 35s ] thds: 32 tps: 4109.69 qps: 117433.18 (r/w/o: 53571.93/55641.86/8219.39) lat (ms,95%): 19.65 err/s 19.19 reconn/s: 0.00
[ 40s ] thds: 32 tps: 4169.11 qps: 118802.26 (r/w/o: 54157.77/56306.27/8338.22) lat (ms,95%): 19.65 err/s 15.81 reconn/s: 0.00
[ 45s ] thds: 32 tps: 4170.78 qps: 118412.12 (r/w/o: 53997.63/56072.92/8341.57) lat (ms,95%): 19.65 err/s 18.80 reconn/s: 0.00
[ 50s ] thds: 32 tps: 4225.57 qps: 120878.63 (r/w/o: 55162.50/57264.98/8451.15) lat (ms,95%): 19.65 err/s 22.20 reconn/s: 0.00
[ 55s ] thds: 32 tps: 4128.25 qps: 116929.64 (r/w/o: 53310.25/55362.88/8256.51) lat (ms,95%): 20.00 err/s 19.40 reconn/s: 0.00
[ 60s ] thds: 32 tps: 4096.19 qps: 116335.90 (r/w/o: 53103.86/55039.66/8192.38) lat (ms,95%): 20.37 err/s 18.00 reconn/s: 0.00
```

How the statistics are computed:

```
SQL statistics:
    queries performed:
        read:  3104738
        write: 3224417
        other: 480086    -- begin; commit; rollback; statements
        total: 6809241   -- all requests, the sum of the three lines above
    transactions: 239227 (3973.25 per sec.)  -- committed transactions per second (rollbacks excluded); use this for total tpmC = 3973.25 * 60 = 238395
    queries: 6809241 (113092.77 per sec.)    -- all requests
    ignored errors: 1038 (17.24 per sec.)
    reconnects: 0 (0.00 per sec.)

General statistics:
    total time: 60.2077s
    total number of events: 239227

Latency (ms):
    min: 0.42
    avg: 8.02               -- average transaction latency
    max: 329.15
    95th percentile: 20.37  -- 95% of transactions finished within 20.37 ms
    sum: 1919757.02         -- total latency ~= --threads=32 * --time=60 (about 1920 s)

Threads fairness:
    events (avg/stddev): 7475.8438/78.44
    execution time (avg/stddev): 59.9924/0.01
```

Summary of the statistics:

total tpmC = 3973.25 * 60 = 238395

new-order tpmC = (total tpmC) * (10/23) = 103650 when run without purge; with ```--enable_purge=yes``` it is (total tpmC) * (10/24).

The 10/23 and 10/24 ratios come from the transaction mix in tpcc.lua's event() function:
```
function event()
  -- print( NURand (1023,1,3000))
  local max_trx = sysbench.opt.enable_purge == "yes" and 24 or 23
  local trx_type = sysbench.rand.uniform(1, max_trx)
  if trx_type <= 10 then
    trx = "new_order"
  elseif trx_type <= 20 then
    trx = "payment"
  elseif trx_type <= 21 then
    trx = "orderstatus"
  elseif trx_type <= 22 then
    trx = "delivery"
  elseif trx_type <= 23 then
    trx = "stocklevel"
  elseif trx_type <= 24 then
    trx = "purge"
  end
```
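Given this mix (10 out of 23 transaction types are new_order, or 10 out of 24 with purge enabled), the tpmC figures above can be reproduced from sysbench's reported tps with simple arithmetic; a sketch using the numbers from this run:

```
tps=3973.25
echo "$tps * 60" | bc             # total tpmC = 238395.00
echo "$tps * 60 * 10 / 23" | bc   # new-order tpmC without purge = 103650
echo "$tps * 60 * 10 / 24" | bc   # new-order tpmC with --enable_purge=yes ~= 99331
```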

Test results on a 32-core/64-hyperthread, 512 GB machine, 1,000 warehouses

```
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
Stepping: 4
CPU MHz: 2500.008
BogoMIPS: 5000.01
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 33792K
NUMA node0 CPU(s): 0-63
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 spec_ctrl intel_stibp
[root@pg11-test ~]# free -g
total used free shared buff/cache available
Mem: 503 313 3 17 186 170
Swap: 0 0 0
[root@pg11-test ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 200G 0 disk
└─vda1 253:1 0 200G 0 part /
vdb 253:16 0 1.8T 0 disk
├─vgdata01-lv01 252:0 0 4T 0 lvm /data01
├─vgdata01-lv02 252:1 0 4T 0 lvm /data02
├─vgdata01-lv03 252:2 0 4T 0 lvm /data03
└─vgdata01-lv04 252:3 0 2T 0 lvm /data04
vdc 253:32 0 1.8T 0 disk
├─vgdata01-lv01 252:0 0 4T 0 lvm /data01
├─vgdata01-lv02 252:1 0 4T 0 lvm /data02
├─vgdata01-lv03 252:2 0 4T 0 lvm /data03
└─vgdata01-lv04 252:3 0 2T 0 lvm /data04
vdd 253:48 0 1.8T 0 disk
├─vgdata01-lv01 252:0 0 4T 0 lvm /data01
├─vgdata01-lv02 252:1 0 4T 0 lvm /data02
├─vgdata01-lv03 252:2 0 4T 0 lvm /data03
└─vgdata01-lv04 252:3 0 2T 0 lvm /data04
vde 253:64 0 1.8T 0 disk
├─vgdata01-lv01 252:0 0 4T 0 lvm /data01
├─vgdata01-lv02 252:1 0 4T 0 lvm /data02
├─vgdata01-lv03 252:2 0 4T 0 lvm /data03
└─vgdata01-lv04 252:3 0 2T 0 lvm /data04
vdf 253:80 0 1.8T 0 disk
├─vgdata01-lv01 252:0 0 4T 0 lvm /data01
├─vgdata01-lv02 252:1 0 4T 0 lvm /data02
├─vgdata01-lv03 252:2 0 4T 0 lvm /data03
└─vgdata01-lv04 252:3 0 2T 0 lvm /data04
vdg 253:96 0 1.8T 0 disk
├─vgdata01-lv01 252:0 0 4T 0 lvm /data01
├─vgdata01-lv02 252:1 0 4T 0 lvm /data02
├─vgdata01-lv03 252:2 0 4T 0 lvm /data03
└─vgdata01-lv04 252:3 0 2T 0 lvm /data04
vdh 253:112 0 1.8T 0 disk
├─vgdata01-lv01 252:0 0 4T 0 lvm /data01
├─vgdata01-lv02 252:1 0 4T 0 lvm /data02
├─vgdata01-lv03 252:2 0 4T 0 lvm /data03
└─vgdata01-lv04 252:3 0 2T 0 lvm /data04
vdi 253:128 0 1.8T 0 disk
├─vgdata01-lv01 252:0 0 4T 0 lvm /data01
├─vgdata01-lv02 252:1 0 4T 0 lvm /data02
├─vgdata01-lv03 252:2 0 4T 0 lvm /data03
└─vgdata01-lv04 252:3 0 2T 0 lvm /data04
[root@pg11-test ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/vdb vgdata01 lvm2 a-- <1.75t 0
/dev/vdc vgdata01 lvm2 a-- <1.75t 0
/dev/vdd vgdata01 lvm2 a-- <1.75t 0
/dev/vde vgdata01 lvm2 a-- <1.75t 0
/dev/vdf vgdata01 lvm2 a-- <1.75t 0
/dev/vdg vgdata01 lvm2 a-- <1.75t 0
/dev/vdh vgdata01 lvm2 a-- <1.75t 0
/dev/vdi vgdata01 lvm2 a-- <1.75t 0
[root@pg11-test ~]# vgs
VG #PV #LV #SN Attr VSize VFree
vgdata01 8 4 0 wz--n- <13.97t 0
[root@pg11-test ~]# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lv01 vgdata01 -wi-ao---- 4.00t
lv02 vgdata01 -wi-ao---- 4.00t
lv03 vgdata01 -wi-ao---- 4.00t
lv04 vgdata01 -wi-ao---- <1.97t
[root@pg11-test ~]# lvdisplay -vv
devices/global_filter not found in config: defaulting to global_filter = [ "a|.*/|" ]
Setting global/locking_type to 1
Setting global/use_lvmetad to 1
global/lvmetad_update_wait_time not found in config: defaulting to 10
Setting response to OK
Setting protocol to lvmetad
Setting version to 1
Setting global/use_lvmpolld to 1
Setting devices/sysfs_scan to 1
Setting devices/multipath_component_detection to 1
Setting devices/md_component_detection to 1
Setting devices/fw_raid_component_detection to 0
Setting devices/ignore_suspended_devices to 0
Setting devices/ignore_lvm_mirrors to 1
devices/filter not found in config: defaulting to filter = [ "a|.*/|" ]
Setting devices/cache_dir to /etc/lvm/cache
Setting devices/cache_file_prefix to
devices/cache not found in config: defaulting to /etc/lvm/cache/.cache
Setting devices/write_cache_state to 1
Setting global/use_lvmetad to 1
Setting activation/activation_mode to degraded
metadata/record_lvs_history not found in config: defaulting to 0
Setting activation/monitoring to 1
Setting global/locking_type to 1
Setting global/wait_for_locks to 1
File-based locking selected.
Setting global/prioritise_write_locks to 1
Setting global/locking_dir to /run/lock/lvm
Setting global/use_lvmlockd to 0
Setting response to OK
Setting token to filter:3239235440
Setting daemon_pid to 11015
Setting response to OK
Setting global_disable to 0
report/output_format not found in config: defaulting to basic
log/report_command_log not found in config: defaulting to 0
Obtaining the complete list of VGs before processing their LVs
Setting response to OK
Setting response to OK
Setting name to vgdata01
Processing VG vgdata01 jwrfAR-tEXe-qf6u-rd95-yhPW-O7Xw-JPUjyr
Locking /run/lock/lvm/V_vgdata01 RB
Reading VG vgdata01 jwrfAR-tEXe-qf6u-rd95-yhPW-O7Xw-JPUjyr
Setting response to OK
Setting response to OK
Setting name to vgdata01
Setting metadata/format to lvm2
Setting id to 8Wny3c-lLb1-27xY-9rFC-HCOc-XsaD-HmvN5l
Setting format to lvm2
Setting device to 64784
Setting dev_size to 3749707776
Setting label_sector to 1
Setting ext_flags to 1
Setting ext_version to 2
Setting size to 1044480
Setting start to 4096
Setting ignore to 0
Setting id to ClcfJi-9Omy-hZdN-ll46-B6J2-fAAL-MLrleE
Setting format to lvm2
Setting device to 64800
Setting dev_size to 3749707776
Setting label_sector to 1
Setting ext_flags to 1
Setting ext_version to 2
Setting size to 1044480
Setting start to 4096
Setting ignore to 0
Setting id to uFhANC-PCAV-JwJL-zSNn-O8np-I2Wi-ue8Vv1
Setting format to lvm2
Setting device to 64816
Setting dev_size to 3749707776
Setting label_sector to 1
Setting ext_flags to 1
Setting ext_version to 2
Setting size to 1044480
Setting start to 4096
Setting ignore to 0
Setting id to hKBbU0-a3gm-sHq1-eU7Q-ZJ3m-Iwoo-MuKzzj
Setting format to lvm2
Setting device to 64832
Setting dev_size to 3749707776
Setting label_sector to 1
Setting ext_flags to 1
Setting ext_version to 2
Setting size to 1044480
Setting start to 4096
Setting ignore to 0
Setting id to cOZaeJ-Drns-9BcP-5Aoq-oZ88-0hVs-M7K8SU
Setting format to lvm2
Setting device to 64848
Setting dev_size to 3749707776
Setting label_sector to 1
Setting ext_flags to 1
Setting ext_version to 2
Setting size to 1044480
Setting start to 4096
Setting ignore to 0
Setting id to EgaC5R-Q0An-X79Q-xGRL-5zDI-MN16-lclIBO
Setting format to lvm2
Setting device to 64864
Setting dev_size to 3749707776
Setting label_sector to 1
Setting ext_flags to 1
Setting ext_version to 2
Setting size to 1044480
Setting start to 4096
Setting ignore to 0
Setting id to NnvDT4-eUM4-V2dP-Fqv1-O28z-OVoH-z939Bh
Setting format to lvm2
Setting device to 64880
Setting dev_size to 3749707776
Setting label_sector to 1
Setting ext_flags to 1
Setting ext_version to 2
Setting size to 1044480
Setting start to 4096
Setting ignore to 0
Setting id to XZsZfn-y2aH-MiNA-mo95-jpdQ-Jufp-eIgBga
Setting format to lvm2
Setting device to 64896
Setting dev_size to 3749707776
Setting label_sector to 1
Setting ext_flags to 1
Setting ext_version to 2
Setting size to 1044480
Setting start to 4096
Setting ignore to 0
Setting response to OK
Setting response to OK
/dev/vdb: size is 3749707776 sectors
/dev/vdc: size is 3749707776 sectors
/dev/vdd: size is 3749707776 sectors
/dev/vde: size is 3749707776 sectors
/dev/vdf: size is 3749707776 sectors
/dev/vdg: size is 3749707776 sectors
/dev/vdh: size is 3749707776 sectors
/dev/vdi: size is 3749707776 sectors
Adding vgdata01/lv01 to the list of LVs to be processed.
Adding vgdata01/lv02 to the list of LVs to be processed.
Adding vgdata01/lv03 to the list of LVs to be processed.
Adding vgdata01/lv04 to the list of LVs to be processed.
Processing LV lv01 in VG vgdata01.
--- Logical volume ---
global/lvdisplay_shows_full_device_path not found in config: defaulting to 0
LV Path /dev/vgdata01/lv01
LV Name lv01
VG Name vgdata01
LV UUID GtVTn9-mWcL-sTJA-QyRq-VocV-eu1s-374mkU
LV Write Access read/write
LV Creation host, time pg11-test, 2018-08-24 20:44:30 +0800
LV Status available
# open 1
LV Size 4.00 TiB
Current LE 32768
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 252:0
Processing LV lv02 in VG vgdata01.
--- Logical volume ---
global/lvdisplay_shows_full_device_path not found in config: defaulting to 0
LV Path /dev/vgdata01/lv02
LV Name lv02
VG Name vgdata01
LV UUID 17VdCH-KVNZ-FF3a-g7ic-IY4y-qav3-jdX3CJ
LV Write Access read/write
LV Creation host, time pg11-test, 2018-08-24 20:44:37 +0800
LV Status available
# open 1
LV Size 4.00 TiB
Current LE 32768
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 252:1
Processing LV lv03 in VG vgdata01.
--- Logical volume ---
global/lvdisplay_shows_full_device_path not found in config: defaulting to 0
LV Path /dev/vgdata01/lv03
LV Name lv03
VG Name vgdata01
LV UUID XY3M0w-EJdu-rx4z-Jn9n-QigT-mAVi-zps4te
LV Write Access read/write
LV Creation host, time pg11-test, 2018-08-24 20:44:57 +0800
LV Status available
# open 1
LV Size 4.00 TiB
Current LE 32768
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 252:2
Processing LV lv04 in VG vgdata01.
--- Logical volume ---
global/lvdisplay_shows_full_device_path not found in config: defaulting to 0
LV Path /dev/vgdata01/lv04
LV Name lv04
VG Name vgdata01
LV UUID vWtHPq-ycHf-n8AO-3E0V-R5F6-WTXc-LocpJ8
LV Write Access read/write
LV Creation host, time pg11-test, 2018-09-28 10:08:27 +0800
LV Status available
# open 1
LV Size <1.97 TiB
Current LE 16120
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 252:3
Unlocking /run/lock/lvm/V_vgdata01
Setting global/notify_dbus to 1

./tpcc.lua --pgsql-host=/tmp --pgsql-port=4801 --pgsql-user=postgres --pgsql-db=postgres --threads=96 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=60 --report-interval=5 --enable_purge=yes run

SQL statistics:
    queries performed:
        read:  12443189
        write: 12830786
        other: 1992972
        total: 27266947
    transactions: 994038 (16549.36 per sec.)
    queries: 27266947 (453957.01 per sec.)
    ignored errors: 4229 (70.41 per sec.)
    reconnects: 0 (0.00 per sec.)

General statistics:
    total time: 60.0634s
    total number of events: 994038

Latency (ms):
    min: 0.36
    avg: 5.79
    max: 138.96
    95th percentile: 16.41
    sum: 5757585.45

Threads fairness:
    events (avg/stddev): 10354.5625/127.26
    execution time (avg/stddev): 59.9748/0.01
```
Data load speed: 89.5 MB/s

tpmc total: ~993,000 (16549.36 tps * 60)
tpmc neworder: ~414,000
Result for 1,000 warehouses:
tpmc total: 36
tpmc neworder: 15
Bottleneck: I/O

Performance Summary

1. PG 12 (self-managed, single instance) on ECS 16c128g + 2 x 1.8T local SSD:
1,000 warehouses, 64 threads, tpmc total: ~260,000
10,000 warehouses, 64 threads, tpmc total: ~130,000

2. PG 12 (self-managed, single instance) on ECS 64c512g + 8 x 1.8T local SSD:
1,000 warehouses, 64 threads, tpmc total: ~990,000
10,000 warehouses, 64 threads, tpmc total: ~410,000

3. PG 12 (self-managed, single instance) on 104c 768g, 32 TB ESSD, huge pages, shared_buffers=600GB:
unlogged tables:
1,000 warehouses, 208 threads, tpmc total: ~1,840,000
logged tables:
1,000 warehouses, 104 threads, tpmc total: ~1,680,000

References

《PostgreSQL 11 tpcc 测试(103万tpmC on ECS) - use sysbench-tpcc by Percona-Lab》

https://github.com/digoal/sysbench-tpcc/blob/master/README.md