Greenplum 2+4測試集羣搭建
概述:
本文檔用於Centos7.6系統下搭建Grennplum 5.25版本“2+4節點”的測試集羣。集羣規劃如下:
主機名 | 角色 | 硬件配置 | 網卡配置 |
---|---|---|---|
master | master節點 | 4core+8G+2*30G | 192.168.0.137;192.168.0.138 |
standby | standby節點 | 4core+8G+2*30G | 192.168.0.139;192.168.0.140 |
seg-1 | 計算節點1 | 4core+8G+2*30G | 192.168.0.142;192.168.0.143 |
seg-2 | 計算節點2 | 4core+8G+2*30G | 192.168.0.145;192.168.0.146 |
seg-3 | 計算節點3 | 4core+8G+2*30G | 192.168.0.151;192.168.0.152 |
seg-4 | 計算節點4 | 4core+8G+2*30G | 192.168.0.153;192.168.0.154 |
磁盤空間劃分:
名稱 | 磁盤 | 大小 |
---|---|---|
/boot | sda | 200M |
swap | sda | 8G |
/ | sda | 21.8G |
/data | sdb | 30G |
一、Linux服務調整(所有服務器)
1、關閉防火牆
systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld
2、關閉Selinux
setenforce 0
vim /etc/selinux/config
3、關閉不必要的服務
系統服務名 | 摘要 | 建議 |
---|---|---|
abrt-oops | 內核嚴重異常信息收集工具 | off |
abrt-ccpp | C/C++程序問題收集工具 | off |
lvm2-monitor | lvm的mirror監控進程 | off |
NetworkManager | 網絡自動管理進程 | off |
Postfix | 電子郵件服務 | off |
Rhnsd | 操作系統自動更新進程 | off |
Rhsmcertd | 在線更新訂閱狀態信息 | off |
Rpcbind | RPC服務相關 | off |
Rpcgssd | 針對nfsv4客戶端管理RPCSEC GSS contexts進程 | off |
Rpcidmapd | RPC名字與UID/GID映射進程 | off |
4、修改sshd服務的參數
修改/etc/ssh/sshd_config文件:
vim /etc/ssh/sshd_config
#MaxStartups 10:30:100
MaxStartups 5000:30:10000
二、系統參數配置
1、修改內核參數
vim /etc/sysctl.conf
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 500 1024000 200 4096
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1 net.ipv4.conf.default.accept_source_route = 0 net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096 net.ipv4.conf.all.arp_filter = 1 net.ipv4.ip_local_port_range = 1025 65535 net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152 net.core.wmem_max = 2097152
vm.overcommit_memory = 2
net.ipv4.ip_forward = 0
sysctl -p //使修改生效
2、配置用戶資源限額
vim /etc/security/limits.conf
* soft nofile 1048576
* hard nofile 1048576
* soft nproc 131072
* hard nproc 131072
vim /etc/security/limits.d/20-nproc.conf
* soft nofile 1048576
* hard nofile 1048576
* soft nproc 131072
* hard nproc 131072
3、配置數據盤掛載參數
vim /etc/fstab
將defaults替換爲‘nodev,noatime,nobarrier,inode64,allocsize=16m’
4、配置磁盤預讀取
echo "blockdev --setra 16384 /dev/sda" >> /etc/rc.d/rc.local
echo "blockdev --setra 16384 /dev/sdb" >> /etc/rc.d/rc.local
chmod +X /etc/rc.d/rc.local //爲文件添加執行權限
5、修改磁盤調度算法(elevator=deadline)和磁盤IO參數(hugepage)
cat /sys/block/sda/queue/scheduler
grubby --update-kernel=ALL --args="elevator=deadline"
grubby --update-kernel=ALL --args="transparent_hugepage=never"
重啓生效
reboot
檢查相關配置是否生效
blockdev --getra /dev/sda
blockdev --getra /dev/sdb
ulimit -n
ulimit -u
cat /sys/block/*/queue/scheduler
cat /sys/kernel/mm/transparent_hugepage/enabled
getenforce
6、配置hosts解析
vim /etc/hosts
##
#ens160:
192.168.0.137 master
192.168.0.139 standby
192.168.0.142 seg-1
192.168.0.145 seg-2
192.168.0.151 seg-3
192.168.0.153 seg-4
##
#ens192:
192.168.0.138 eth2-master
192.168.0.140 eth2-standby
192.168.0.143 eth2-seg-1
192.168.0.146 eth2-seg-2
192.168.0.152 eth2-seg-3
192.168.0.154 eth2-seg-4
將/etc/hosts文件scp到其他主機上:
scp /etc/hosts ip:/etc/hosts
三、Greenplum軟件安裝
1、Greenplum軟件下載與解壓
[root@master ~]# ll
total 276820
-rw-------. 1 root root 2095 Mar 30 14:53 anaconda-ks.cfg
-rwxr-xr-x 1 root root 283451930 Mar 31 15:43 greenplum-db-5.25.0-rhel7-x86_64.bin
-rw-r--r--. 1 root root 2126 Mar 30 16:14 initial-setup-ks.cfg
2、依賴檢查
rpm -qa | grep zlib
rpm -qa | grep bash
rpm -qa | grep zip
rpm -qa | grep tar
rpm -qa | grep sed
rpm -qa | grep perl
3、安裝
./greenplum-db-5.25.0-rhel7-x86_64.bin
[root@master ~]# ll /usr/local/
total 0
drwxr-xr-x. 2 root root 6 Apr 11 2018 bin
drwxr-xr-x. 2 root root 6 Apr 11 2018 etc
drwxr-xr-x. 2 root root 6 Apr 11 2018 games
lrwxrwxrwx 1 root root 21 Mar 31 16:07 greenplum-db -> ./greenplum-db-5.25.0
drwxr-xr-x 11 root root 138 Mar 31 16:07 greenplum-db-5.25.0
drwxr-xr-x. 2 root root 6 Apr 11 2018 include
drwxr-xr-x. 2 root root 6 Apr 11 2018 lib
drwxr-xr-x. 2 root root 6 Apr 11 2018 lib64
drwxr-xr-x. 2 root root 6 Apr 11 2018 libexec
drwxr-xr-x. 2 root root 6 Apr 11 2018 sbin
drwxr-xr-x. 5 root root 49 Mar 30 14:40 share
drwxr-xr-x. 2 root root 6 Apr 11 2018 src
使GP的環境變量生效
source /usr/local/greenplum-db/greenplum_path.sh
切換窗口或用戶會使環境變量失效,需要重新source。
4、配置主機互信
1、創建host file文件
[root@master greenplum-db]# echo "master
> standby
> seg-1
> seg-2
> seg-3
> seg-4
> eth2-master
> eth2-standby
> eth2-seg-1
> eth2-seg-2
> eth2-seg-3
> eth2-seg-4">hostfile_exkeys
2、通過gpssh-exkeys程序配置主機間互信
gpssh-exkeys -f hostfile_exkeys
5、對其他主機安裝Greenplum軟件
在master使用gpseginstall程序對所有主機安裝Greenplum軟件,軟件安裝過程中會自動創建gpadmin用戶組與用戶。
這一步一定要在GP安裝目錄下完成
gpseginstall -f hostfile_exkeys
6、創建數據目錄並授權
創建master、standby的數據目錄並賦權
touch hostfile_masters //創建只包含master、standby的hostfile
echo "master>standby">hostfile_masters
gpssh -f hostfile_masters -e 'mkdir /data/master'
gpssh -f hostfile_masters -e 'chown -R gpadmin:gpadmin /data/master'
創建segments的數據目錄並賦權
[root@master greenplum-db]# echo "seg-1
> seg-2
> seg-3
> seg-4">hostfile_segments
[root@master greenplum-db]# gpssh -f hostfile_segments -e 'mkdir /data/primary'
[root@master greenplum-db]# gpssh -f hostfile_segments -e 'mkdir /data/mirror'
[root@master greenplum-db]# gpssh -f hostfile_segments -e 'chown -R gpadmin:gpadmin /data/mirror'
[root@master greenplum-db]# gpssh -f hostfile_segments -e 'chown -R gpadmin:gpadmin /data/primary'
7、配置時間同步服務
啓動ntpd服務
gpssh -f hostfile_exkeys
systemctl start ntpd
systemctl enable ntpd
所有主機設置定時計劃,同步時間
crontab -e
輸入:
0 1 * * * /usr/sbin/ntpdate -u ntp.api.bz
crantab -l //查看定時計劃
四、系統測試
1、網絡測試
使用gpcheckperf測試
創建只包含第一個網卡對應主機名的hostfile_net1文件和只包含第二張網卡對應主機名的hostfile_net2文件:
[root@master greenplum-db]# cat hostfile_net1
master
standby
seg-1
seg-2
seg-3
seg-4
[root@master greenplum-db]# cat hostfile_net2
eth2-master
eth2-standby
eth2-seg-1
eth2-seg-2
eth2-seg-3
eth2-seg-4
測試第一個網卡所在網段的網絡情況
[root@master greenplum-db]# gpcheckperf -f hostfile_net1 -r N -d /tmp > net1.out
[root@master greenplum-db]# cat net1.out
/usr/local/greenplum-db/./bin/gpcheckperf -f hostfile_net1 -r N -d /tmp
-------------------
-- NETPERF TEST
-------------------
[Warning] retrying with port 23012
====================
== RESULT
====================
Netperf bisection bandwidth test
master -> standby = 920.450000
seg-1 -> seg-2 = 837.850000
seg-3 -> seg-4 = 738.410000
standby -> master = 912.270000
seg-2 -> seg-1 = 900.320000
seg-4 -> seg-3 = 730.650000
Summary:
sum = 5039.95 MB/sec
min = 730.65 MB/sec
max = 920.45 MB/sec
avg = 839.99 MB/sec
median = 900.32 MB/sec
[Warning] connection between seg-3 and seg-4 is no good
[Warning] connection between seg-4 and seg-3 is no good
測試第二個網卡所在網段的網絡情況
gpcheckperf -f hostfile_net2 -r N -d /tmp > net2.out
注意:生產環境下要求網絡速率在 1GM 以上,速率越低性能越低。
2、I/O測試
使用gpcheckperf測試集羣主機的I/O性能。
注意:生產環境要求每臺機器的 I/O 性能均達到 1500M/S 以上的速率。
gpcheckperf -f hostfile_exkeys -r ds -D -d /data > diskio.out
五、創建Greenplum數據
1、創建數據庫初始化所需的hostfile
[root@master greenplum-db]# su - gpadmin
Last login: Tue Mar 31 17:33:38 CST 2020 on pts/0
[gpadmin@master ~]$ mkdir /home/gpadmin/gpconfigs
[gpadmin@master ~]$ cp /usr/local/greenplum-db/hostfile_* /home/gpadmin/gpconfigs/
[gpadmin@master ~]$ cd ./gpconfigs/
[gpadmin@master gpconfigs]$ ll
total 20
-rw-r--r-- 1 gpadmin gpadmin 108 Apr 1 13:41 hostfile_exkeys
-rw-r--r-- 1 gpadmin gpadmin 15 Apr 1 13:41 hostfile_masters
-rw-r--r-- 1 gpadmin gpadmin 39 Apr 1 13:41 hostfile_net1
-rw-r--r-- 1 gpadmin gpadmin 69 Apr 1 13:41 hostfile_net2
-rw-r--r-- 1 gpadmin gpadmin 24 Apr 1 13:41 hostfile_segments
[gpadmin@master gpconfigs]$ mv hostfile_exkeys hostfile_allnode
[gpadmin@master gpconfigs]$ ll
total 20
-rw-r--r-- 1 gpadmin gpadmin 108 Apr 1 13:41 hostfile_allnode
-rw-r--r-- 1 gpadmin gpadmin 15 Apr 1 13:41 hostfile_masters
-rw-r--r-- 1 gpadmin gpadmin 39 Apr 1 13:41 hostfile_net1
-rw-r--r-- 1 gpadmin gpadmin 69 Apr 1 13:41 hostfile_net2
-rw-r--r-- 1 gpadmin gpadmin 24 Apr 1 13:41 hostfile_segments
2、創建Greenplum數據庫配置文件
[gpadmin@master gpconfigs]$ source /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@master gpconfigs]$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/
[gpadmin@master gpconfigs]$ ll
total 24
-rwxr-xr-x 1 gpadmin gpadmin 2663 Apr 1 13:48 gpinitsystem_config
-rw-r--r-- 1 gpadmin gpadmin 108 Apr 1 13:41 hostfile_allnode
-rw-r--r-- 1 gpadmin gpadmin 15 Apr 1 13:41 hostfile_masters
-rw-r--r-- 1 gpadmin gpadmin 39 Apr 1 13:41 hostfile_net1
-rw-r--r-- 1 gpadmin gpadmin 69 Apr 1 13:41 hostfile_net2
-rw-r--r-- 1 gpadmin gpadmin 24 Apr 1 13:41 hostfile_segments
配置gpinitsystem_config文件內容
[gpadmin@master gpconfigs]$ cat gpinitsystem_config
# FILE NAME: gpinitsystem_config
# Configuration file needed by the gpinitsystem
################################################
#### REQUIRED PARAMETERS
################################################
#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="Greenplum Data Platform"
#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg
#### Base number by which primary segment port numbers
#### are calculated.
PORT_BASE=6001
#### File system location(s) where primary segment data directories
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data/primary /data/primary)
#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=master
#### File system location where the master data directory
#### will be created.
MASTER_DIRECTORY=/data/master
#### Port number for the master instance.
MASTER_PORT=5432
#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh
#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8
#### Default server-side character set encoding.
ENCODING=UNICODE
################################################
#### OPTIONAL MIRROR PARAMETERS
################################################
#### Base number by which mirror segment port numbers
#### are calculated.
MIRROR_PORT_BASE=21050
#### Base number by which primary file replication port
#### numbers are calculated.
REPLICATION_PORT_BASE=22050
#### Base number by which mirror file replication port
#### numbers are calculated.
MIRROR_REPLICATION_PORT_BASE=23050
#### File system location(s) where mirror segment data directories
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the
#### DATA_DIRECTORY parameter.
declare -a MIRROR_DATA_DIRECTORY=(/data/mirror /data/mirror)
################################################
#### OTHER OPTIONAL PARAMETERS
################################################
#### Create a database of this name after initialization.
DATABASE_NAME=mdb
#### Specify the location of the host address file here instead of
#### with the the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
3、gp初始化
[gpadmin@master gpconfigs]$ gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config -h /home/gpadmin/gpconfigs/hostfile_segments -s standby -S
初始化完成後設置gp環境變量
master、standby
[gpadmin@master ~]$ cat /home/gpadmin/.bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=
# User specific aliases and functions
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
segments:
echo "source /usr/local/greenplum-db/greenplum_path.sh">> /home/gpadmin/.bashrc
4、註冊PL/python
操作系統默認已安裝了 Python,只需要在 gpadmin 用戶下執行以下命令,就可以在指 定的數據庫中使用 SQL 命令註冊 PL/Python 語言。
[gpadmin@master ~]$ createlang plpythonu -d mdb
如果要卸載已註冊的 PL/Python 語言,則執行下面的命令:
[gpadmin@master ~]$ droplang plpythonu -d mdb
5、安裝Oracle兼容函數
在 gpadmin 用戶下執行以下命令:
psql -d mdb -f /usr/local/greenplum-db/share/postgresql/contrib/orafunc.sql
如果要卸載該函數,則執行下面的命令:
psql -d mdb -f /usr/local/greenplum-db/share/postgresql/contrib/uninstall_orafunc.sql