Greenplum 6.14 and GPCC 6.4 Installation Guide

I have recently been upgrading and reorganizing our Greenplum (GP) deployment, so I am writing up what was done. This article covers installation based on GP 6.14 and walks through installing and initializing GP, GPCC, and PXF. Written by the cnblogs author 一寸HUI; personal blog: https://www.cnblogs.com/zsql/

1. Pre-installation Preparation

1.1 Hardware Selection

Reference: https://cn.greenplum.org/wp-content/uploads/2020/08/%E7%AC%AC%E4%B8%80%E8%8A%82%E8%AF%BE-%E8%85%BE%E8%AE%AF%E4%BA%91.pptx.pdf

1.2 Network Parameter Tuning

echo "10000 65535" > /proc/sys/net/ipv4/ip_local_port_range #net.ipv4.ip_local_port_range,定義了地tcp/udp的端口範圍。可以理解爲系統中的程序會選擇這個範圍內的端口來連接到目的端口。
echo 1024 > /proc/sys/net/core/somaxconn #net.core.somaxconn,服務端所能accept即處理數據的最大客戶端數量,即完成連接上限。默認值是128,建議修改成1024。
echo 16777216 > /proc/sys/net/core/rmem_max #net.core.rmem_max,接收套接字緩衝區大小的最大值。默認值是229376,建議修改成16777216。
echo 16777216 > /proc/sys/net/core/wmem_max #net.core.wmem_max,發送套接字緩衝區大小的最大值(以字節爲單位)。默認值是229376,建議修改成16777216。
echo "4096 87380 16777216" > /proc/sys/net/ipv4/tcp_rmem #net.ipv4.tcp_rmem,配置讀緩衝的大小,三個值,第一個是這個讀緩衝的最小值,第三個是最大值,中間的是默認值。默認值是"4096 87380 6291456",建議修改成"4096 87380 16777216"echo "4096 65536 16777216" > /proc/sys/net/ipv4/tcp_wmem #net.ipv4.tcp_wmem,配置寫緩衝的大小,三個值,第一個是這個寫緩衝的最小值,第三個是最大值,中間的是默認值。默認值是"4096 16384 4194304",建議修改成"4096 65536 16777216"echo 360000 > /proc/sys/net/ipv4/tcp_max_tw_buckets #net.ipv4.max_tw_buckets,表示系統同時保持TIME_WAIT套接字的最大數量。默認值是2048,建議修改成360000

Reference: https://support.huaweicloud.com/tngg-kunpengdbs/kunpenggreenplum_05_0011.html

1.3 Disk and I/O Tuning

mount -o rw,nodev,noatime,nobarrier,inode64 /dev/dfa /data  # mount the data disk
/sbin/blockdev --setra 16384 /dev/dfa # set read-ahead to reduce disk seeks and I/O wait time and improve read performance
echo deadline > /sys/block/dfa/queue/scheduler # the deadline I/O scheduler suits Greenplum workloads better
grubby --update-kernel=ALL --args="elevator=deadline"

vim /etc/security/limits.conf # configure file descriptor limits
# add the following lines

* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

Edit /etc/security/limits.d/20-nproc.conf:

# add the following

*          soft    nproc     131072
root       soft    nproc     unlimited

Reference: http://docs-cn.greenplum.org/v6/best_practices/sysconfig.html

1.4 Kernel Parameter Tuning

Edit /etc/sysctl.conf, then reload the parameters with sysctl -p. Note: adjust the values below according to the actual amount of memory on your hosts.

# kernel.shmall = _PHYS_PAGES / 2 # See Shared Memory Pages  # shared memory
kernel.shmall = 4000000000
# kernel.shmmax = kernel.shmall * PAGE_SIZE                  # shared memory
kernel.shmmax = 500000000
kernel.shmmni = 4096
vm.overcommit_memory = 2 # See Segment Host Memory           # host memory
vm.overcommit_ratio = 95 # See Segment Host Memory           # host memory

net.ipv4.ip_local_port_range = 10000 65535 # See Port Settings # port settings
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 0 # See System Memory       # system memory
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296

Explanations and calculations for some of the parameters (again, compute them for your own hardware):

kernel.shmall = _PHYS_PAGES / 2   #echo $(expr $(getconf _PHYS_PAGES) / 2)
kernel.shmmax = kernel.shmall * PAGE_SIZE  #echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
vm.overcommit_memory: the kernel uses this to decide how much memory may be allocated to processes. For Greenplum it should be set to 2.
vm.overcommit_ratio: the percentage of RAM that may be allocated to processes; the rest is reserved for the OS. The Red Hat default is 50; 95 is recommended. #vm.overcommit_ratio = (RAM - 0.026 * gp_vmem) / RAM
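A minimal sketch (assumes bash and getconf are available on each host) for computing the two shared-memory values and reloading the settings:

echo "kernel.shmall = $(expr $(getconf _PHYS_PAGES) / 2)"                          # append this line to /etc/sysctl.conf
echo "kernel.shmmax = $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))"  # append this line to /etc/sysctl.conf
sysctl -p                                                                          # reload after editing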

To avoid port conflicts with other applications during Greenplum initialization, note the port range specified by net.ipv4.ip_local_port_range.
When initializing Greenplum with gpinitsystem, do not specify Greenplum database ports inside that range.
For example, if net.ipv4.ip_local_port_range = 10000 65535, set the Greenplum base port numbers to values outside it, such as:
PORT_BASE = 6000
MIRROR_PORT_BASE = 7000

For systems with more than 64 GB of memory, the following settings are recommended:
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736 # 1.5GB
vm.dirty_bytes = 4294967296 # 4GB

1.5 Java Installation

mkdir /usr/local/java
tar -zxvf jdk-8u181-linux-x64.tar.gz -C /usr/local/java
cd /usr/local/java
ln -s jdk1.8.0_181/ jdk

Configure environment variables: vim /etc/profile # add the following

export JAVA_HOME=/usr/local/java/jdk
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH

Then run source /etc/profile and check the installation with the java and javac commands.
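For example, both of these should report the 1.8.0_181 JDK installed above:

java -version
javac -version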

1.6 Firewall and SELinux Checks

Disable the firewall:

systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld

Disable SELinux:

Edit /etc/selinux/config # change the following setting
SELINUX=disabled
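To apply the change without waiting for a reboot and to verify it (standard RHEL/CentOS 7 commands):

setenforce 0   # switch to permissive mode immediately
sestatus       # verify; reports disabled after the next reboot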

1.7 Other Settings

Disable THP: grubby --update-kernel=ALL --args="transparent_hugepage=never" # reboot after adding the kernel argument
Set RemoveIPC:
vim /etc/systemd/logind.conf
RemoveIPC=no
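A quick way to verify and apply these settings (standard CentOS 7 paths; a reboot works just as well):

cat /sys/kernel/mm/transparent_hugepage/enabled   # should show [never] after the reboot
systemctl restart systemd-logind                  # make logind re-read RemoveIPC=no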

1.8 Configure the hosts File

vim /etc/hosts  # add the IP and hostname of every node
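For example, for the three hosts used in this article (10.18.43.86 is the master IP from the init log in section 2.8; the lgh2/lgh3 addresses are placeholders, use your own):

10.18.43.86  lgh1
10.18.43.87  lgh2
10.18.43.88  lgh3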

1.9 Create the gpadmin User and Group and Set Up Passwordless SSH

Create the user and group (on every node):

groupdel gpadmin
userdel gpadmin
groupadd gpadmin
useradd  -g gpadmin  -d /apps/gpadmin gpadmin

passwd gpadmin
# or: echo 'ow@R99d7' | passwd gpadmin --stdin

Set up passwordless SSH:

ssh-keygen -t rsa

ssh-copy-id lgh1
ssh-copy-id lgh2
ssh-copy-id lgh3

 

2. Greenplum Installation

2.1 Installation

Download: https://network.pivotal.io/products/pivotal-gpdb#/releases/837931  # version 6.14.1

Install: yum install -y greenplum-db-6.14.1-rhel7-x86_64.rpm  # installs under /usr/local by default; Greenplum has many dependencies, but installing via yum pulls them in automatically.

After installation the directory /usr/local/greenplum-db is created.

chown -R gpadmin:gpadmin /usr/local/greenplum*

Inspect the installed directory:

[root@lgh1 local]# cd /usr/local/greenplum-db
[root@lgh1 greenplum-db]# ll
total 5548
drwxr-xr-x 7 root root    4096 Mar 30 16:44 bin
drwxr-xr-x 3 root root      21 Mar 30 16:44 docs
drwxr-xr-x 2 root root      58 Mar 30 16:44 etc
drwxr-xr-x 3 root root      19 Mar 30 16:44 ext
-rwxr-xr-x 1 root root     724 Mar 30 16:44 greenplum_path.sh
drwxr-xr-x 5 root root    4096 Mar 30 16:44 include
drwxr-xr-x 6 root root    4096 Mar 30 16:44 lib
drwxr-xr-x 2 root root      20 Mar 30 16:44 libexec
-rw-r--r-- 1 root root 5133247 Feb 23 03:17 open_source_licenses.txt
-rw-r--r-- 1 root root  519852 Feb 23 03:17 open_source_license_VMware_Tanzu_Greenplum_Streaming_Server_1.5.0_GA.txt
drwxr-xr-x 7 root root     135 Mar 30 16:44 pxf
drwxr-xr-x 2 root root    4096 Mar 30 16:44 sbin
drwxr-xr-x 5 root root      49 Mar 30 16:44 share

2.2 Create Cluster Host Files

Create two host files (all_hosts, seg_hosts) to be used as the host-list argument for utilities such as gpssh and gpscp; in this article they live under /apps/gpadmin/conf (i.e. ~/conf for gpadmin).
all_hosts: the hostnames or IPs of every host in the cluster, including the master, segments, and standby.
seg_hosts: the hostnames or IPs of the segment hosts only.
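In this article lgh1 is the master and lgh2/lgh3 are the segment hosts (see the gpinitsystem output in section 2.8), so the two files look like this:

# all_hosts
lgh1
lgh2
lgh3

# seg_hosts
lgh2
lgh3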

2.3 Configure Environment Variables

vim  ~/.bashrc
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/apps/data1/master/gpseg-1
export PGPORT=5432
# then apply the changes
source ~/.bashrc

2.4 Use gpssh-exkeys to Set Up n-to-n Passwordless SSH

#gpssh-exkeys -f all_hosts

2.5 Verify gpssh

#gpssh -f all_hosts

2.6 Sync the Master Installation to Every Host

This step syncs the /usr/local/greenplum-db directory to the other hosts. Whatever method you use, the end result on every machine should look the same as on the master: ll /usr/local

You can use scp, or the Greenplum utilities as shown below (which needs the root account). Note that file permissions are involved, so the simplest route is to do it as root and chown everything back to gpadmin at the end; the approach below only works as gpadmin if /usr/local on every host is writable by gpadmin.

Package: cd /usr/local && tar -cf /usr/local/gp6.tar greenplum-db-6.14.1
Distribute: gpscp -f /apps/gpadmin/conf/all_hosts /usr/local/gp6.tar =:/usr/local
Then unpack on every host and create the symlink:
gpssh -f /apps/gpadmin/conf/all_hosts
cd /usr/local
tar -xf gp6.tar
ln -s greenplum-db-6.14.1 greenplum-db
chown -R gpadmin:gpadmin /usr/local/greenplum*

2.7 Create the Master and Segment Directories

# create the master directory (on the master host); the paths match the gpinitsystem_config used in section 2.8
mkdir -p /apps/data1/master
chown -R gpadmin:gpadmin /apps/data1/master

# create the segment directories (on every segment host)
gpssh -f /apps/gpadmin/conf/seg_hosts -e 'mkdir -p /apps/data1/primary'
gpssh -f /apps/gpadmin/conf/seg_hosts -e 'mkdir -p /apps/data1/mirror'
gpssh -f /apps/gpadmin/conf/seg_hosts -e 'chown -R gpadmin:gpadmin /apps/data1/'

Whatever method you use (the directories can also be batch-created with a tool like SecureCRT), just make sure the paths are identical on every host and owned by gpadmin:gpadmin.
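An optional sanity check that the directories exist with the right owner everywhere (uses the host file from section 2.2):

ls -ld /apps/data1/master                                                                  # on the master host
gpssh -f /apps/gpadmin/conf/seg_hosts -e 'ls -ld /apps/data1/primary /apps/data1/mirror'   # on the segment hosts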

2.8 Initialize the Cluster

Create the config directory and copy the configuration template:

mkdir -p ~/gpconfigs
cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config ~/gpconfigs/gpinitsystem_config

Then edit the ~/gpconfigs/gpinitsystem_config file:

[gpadmin@lgh1 gpconfigs]$ cat gpinitsystem_config  | egrep -v '^$|#'
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/apps/data1/primary)
MASTER_HOSTNAME=master_hostname
MASTER_DIRECTORY=/apps/data1/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
MIRROR_PORT_BASE=7000
declare -a MIRROR_DATA_DIRECTORY=(/apps/data1/mirror)
DATABASE_NAME=gp_test

Initialize the cluster with:

gpinitsystem -c ~/gpconfigs/gpinitsystem_config -h  ~/conf/seg_hosts

The output looks like this:

[gpadmin@lgh1 ~]$ gpinitsystem -c ~/gpconfigs/gpinitsystem_config -h  ~/conf/seg_hosts
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Reading Greenplum configuration file /apps/gpadmin/gpconfigs/gpinitsystem_config
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Locale has not been set in /apps/gpadmin/gpconfigs/gpinitsystem_config, will set to default value
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Locale set to en_US.utf8
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Checking configuration parameters, Completed
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
..
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Configuring build for standard array
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20210330:17:37:56:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Building primary segment instance array, please wait...
..
20210330:17:37:57:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Building group mirror array type , please wait...
..
20210330:17:37:58:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Checking Master host
20210330:17:37:58:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Checking new segment hosts, please wait...
....
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Checking new segment hosts, Completed
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:---------------------------------------
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master Configuration
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:---------------------------------------
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master instance name       = Greenplum Data Platform
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master hostname            = lgh1
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master port                = 5432
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master instance dir        = /apps/data1/master/gpseg-1
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master LOCALE              = en_US.utf8
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Greenplum segment prefix   = gpseg
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master Database            = gpebusiness
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master connections         = 250
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master buffers             = 128000kB
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Segment connections        = 750
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Segment buffers            = 128000kB
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Checkpoint segments        = 8
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Encoding                   = UNICODE
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Postgres param file        = Off
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db-6.14.1/bin/initdb
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db-6.14.1/lib
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-HEAP_CHECKSUM is           = on
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-HBA_HOSTNAMES is           = 0
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Ulimit check               = Passed
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Array host connect type    = Single hostname per node
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Master IP address [1]      = 10.18.43.86
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Standby Master             = Not Configured
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Number of primary segments = 1
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Total Database segments    = 2
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Trusted shell              = ssh
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Number segment hosts       = 2
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Mirror port base           = 7000
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Number of mirror segments  = 1
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Mirroring config           = ON
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Mirroring type             = Group
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:----------------------------------------
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:----------------------------------------
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-lgh2       6000    lgh2       /apps/data1/primary/gpseg0      2
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-lgh3       6000    lgh3       /apps/data1/primary/gpseg1      3
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:---------------------------------------
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Greenplum Mirror Segment Configuration
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:---------------------------------------
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-lgh3       7000    lgh3       /apps/data1/mirror/gpseg0       4
20210330:17:38:02:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-lgh2       7000    lgh2       /apps/data1/mirror/gpseg1       5

Continue with Greenplum creation Yy|Nn (default=N):
> y
20210330:17:38:05:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Building the Master instance database, please wait...
20210330:17:38:09:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Starting the Master in admin mode
20210330:17:38:10:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20210330:17:38:10:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
..
20210330:17:38:10:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
.........
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:------------------------------------------------
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Parallel process exit status
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:------------------------------------------------
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Total processes marked as completed           = 2
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Total processes marked as killed              = 0
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Total processes marked as failed              = 0
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:------------------------------------------------
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Deleting distributed backout files
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Removing back out file
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-No errors generated from parallel processes
20210330:17:38:19:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20210330:17:38:20:013281 gpstop:lgh1:gpadmin-[INFO]:-Starting gpstop with args: -a -l /apps/gpadmin/gpAdminLogs -m -d /apps/data1/master/gpseg-1
20210330:17:38:20:013281 gpstop:lgh1:gpadmin-[INFO]:-Gathering information and validating the environment...
20210330:17:38:20:013281 gpstop:lgh1:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20210330:17:38:20:013281 gpstop:lgh1:gpadmin-[INFO]:-Obtaining Segment details from master...
20210330:17:38:20:013281 gpstop:lgh1:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 6.14.1 build commit:5ef30dd4c9878abadc0124e0761e4b988455a4bd'
20210330:17:38:20:013281 gpstop:lgh1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart'
20210330:17:38:20:013281 gpstop:lgh1:gpadmin-[INFO]:-Master segment instance directory=/apps/data1/master/gpseg-1
20210330:17:38:20:013281 gpstop:lgh1:gpadmin-[INFO]:-Stopping master segment and waiting for user connections to finish ...
server shutting down
20210330:17:38:21:013281 gpstop:lgh1:gpadmin-[INFO]:-Attempting forceful termination of any leftover master process
20210330:17:38:21:013281 gpstop:lgh1:gpadmin-[INFO]:-Terminating processes for segment /apps/data1/master/gpseg-1
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Starting gpstart with args: -a -l /apps/gpadmin/gpAdminLogs -d /apps/data1/master/gpseg-1
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Gathering information and validating the environment...
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.14.1 build commit:5ef30dd4c9878abadc0124e0761e4b988455a4bd'
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Greenplum Catalog Version: '301908232'
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Starting Master instance in admin mode
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Obtaining Segment details from master...
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Setting new master era
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Master Started...
20210330:17:38:21:013307 gpstart:lgh1:gpadmin-[INFO]:-Shutting down master
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-Process results...
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-----------------------------------------------------
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-   Successful segment starts                                            = 2
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-   Failed segment starts                                                = 0
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-----------------------------------------------------
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-----------------------------------------------------
20210330:17:38:22:013307 gpstart:lgh1:gpadmin-[INFO]:-Starting Master instance lgh1 directory /apps/data1/master/gpseg-1
20210330:17:38:23:013307 gpstart:lgh1:gpadmin-[INFO]:-Command pg_ctl reports Master lgh1 instance active
20210330:17:38:23:013307 gpstart:lgh1:gpadmin-[INFO]:-Connecting to dbname='template1' connect_timeout=15
20210330:17:38:23:013307 gpstart:lgh1:gpadmin-[INFO]:-No standby master configured.  skipping...
20210330:17:38:23:013307 gpstart:lgh1:gpadmin-[INFO]:-Database successfully started
20210330:17:38:23:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20210330:17:38:23:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Commencing parallel build of mirror segment instances
20210330:17:38:23:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
..
20210330:17:38:23:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
..
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:------------------------------------------------
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Parallel process exit status
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:------------------------------------------------
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Total processes marked as completed           = 2
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Total processes marked as killed              = 0
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Total processes marked as failed              = 0
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:------------------------------------------------
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[WARN]:-*******************************************************
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[WARN]:-were generated during the array creation
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Please review contents of log file
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-/apps/gpadmin/gpAdminLogs/gpinitsystem_20210330.log
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-To determine level of criticality
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-These messages could be from a previous run of the utility
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-that was called today!
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[WARN]:-*******************************************************
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Greenplum Database instance successfully created
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-------------------------------------------------------
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-To complete the environment configuration, please
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/apps/data1/master/gpseg-1"
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-   or, use -d /apps/data1/master/gpseg-1 option for the Greenplum scripts
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-   Example gpstate -d /apps/data1/master/gpseg-1
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Script log file = /apps/gpadmin/gpAdminLogs/gpinitsystem_20210330.log
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Review options for gpinitstandby
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-------------------------------------------------------
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-The Master /apps/data1/master/gpseg-1/pg_hba.conf post gpinitsystem
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-new array must be explicitly added to this file
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-located in the /usr/local/greenplum-db-6.14.1/docs directory
20210330:17:38:26:011255 gpinitsystem:lgh1:gpadmin-[INFO]:-------------------------------------------------------

At this point the installation is done. One thing worth noting is the mirror placement strategy: the default Group Mirroring was used here. Other placement strategies can be chosen at initialization time; they have slightly higher requirements and are a bit more complex. See: http://docs-cn.greenplum.org/v6/best_practices/ha.html

For how to specify the mirror placement strategy at initialization time, and how to set up the master standby during initialization, see the gpinitsystem reference: http://docs-cn.greenplum.org/v6/utility_guide/admin_utilities/gpinitsystem.html#topic1
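As a hedged sketch only (the --mirror-mode and -s flag names are taken from the GP6 gpinitsystem reference above; verify them against your version before running), spread mirroring plus a standby master could be requested at init time roughly like this:

gpinitsystem -c ~/gpconfigs/gpinitsystem_config -h ~/conf/seg_hosts \
    --mirror-mode=spread \
    -s lgh2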

2.9 Check the Cluster Status

#gpstate  # command reference: http://docs-cn.greenplum.org/v6/utility_guide/admin_utilities/gpstate.html
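A few gpstate options that come in handy afterwards (see the reference above for the full list):

gpstate -s   # detailed status of every primary and mirror segment
gpstate -m   # mirror segment status
gpstate -f   # standby master details
gpstate -e   # segments with mirror status issues, if any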

[gpadmin@lgh1 ~]$ gpstate
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-Starting gpstate with args:
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.14.1 build commit:5ef30dd4c9878abadc0124e0761e4b988455a4bd'
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.14.1 build commit:5ef30dd4c9878abadc0124e0761e4b988455a4bd) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Feb 22 2021 18:27:08'
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-Obtaining Segment details from master...
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-Gathering data from segments...
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-Greenplum instance status summary
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-----------------------------------------------------
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Master instance                                           = Active
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Master standby                                            = No master standby configured
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total segment instance count from metadata                = 4
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-----------------------------------------------------
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Primary Segment Status
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-----------------------------------------------------
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total primary segments                                    = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total primary segment valid (at master)                   = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total primary segment failures (at master)                = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number postmaster processes found                   = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-----------------------------------------------------
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Mirror Segment Status
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-----------------------------------------------------
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total mirror segments                                     = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total mirror segment valid (at master)                    = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total mirror segment failures (at master)                 = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number postmaster processes found                   = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number mirror segments acting as primary segments   = 0
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-   Total number mirror segments acting as mirror segments    = 2
20210330:17:41:32:013986 gpstate:lgh1:gpadmin-[INFO]:-----------------------------------------------------

2.10 Install the Master Standby

Segment mirrors and the master standby can both be set up during initialization, or added manually after the cluster is installed. For adding mirrors manually, see: http://docs-cn.greenplum.org/v6/utility_guide/admin_utilities/gpaddmirrors.html

Here is how to add the master standby. Reference: http://docs-cn.greenplum.org/v6/utility_guide/admin_utilities/gpinitstandby.html

Pick a host and create the directory:

mkdir /apps/data1/master
chown -R gpadmin:gpadmin /apps/data1/master

Install the master standby: gpinitstandby -s lgh2

[gpadmin@lgh1 ~]$ gpinitstandby -s lgh2
20210330:17:58:39:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20210330:17:58:39:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Checking for data directory /apps/data1/master/gpseg-1 on lgh2
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:------------------------------------------------------
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Greenplum standby master initialization parameters
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:------------------------------------------------------
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Greenplum master hostname               = lgh1
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Greenplum master data directory         = /apps/data1/master/gpseg-1
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Greenplum master port                   = 5432
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Greenplum standby master hostname       = lgh2
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Greenplum standby master port           = 5432
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Greenplum standby master data directory = /apps/data1/master/gpseg-1
20210330:17:58:40:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Greenplum update system catalog         = On
Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y
20210330:17:58:41:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
20210330:17:58:42:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-The packages on lgh2 are consistent.
20210330:17:58:42:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Adding standby master to catalog...
20210330:17:58:42:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Database catalog updated successfully.
20210330:17:58:42:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Updating pg_hba.conf file...
20210330:17:58:42:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
20210330:17:58:43:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Starting standby master
20210330:17:58:43:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Checking if standby master is running on host: lgh2  in directory: /apps/data1/master/gpseg-1
20210330:17:58:44:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
20210330:17:58:44:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20210330:17:58:44:014332 gpinitstandby:lgh1:gpadmin-[INFO]:-Successfully created standby master on lgh2

 

3. GPCC Installation

Download: https://network.pivotal.io/products/gpdb-command-center/#/releases/829440

Reference documentation: https://gpcc.docs.pivotal.io/640/welcome.html

 

3.1 Verify the Installation Environment

Database status: confirm that Greenplum is installed and running: gpstate
Environment variable: confirm MASTER_DATA_DIRECTORY is set and in effect: echo $MASTER_DATA_DIRECTORY
/usr/local/ permissions: confirm gpadmin can write to /usr/local/: ls -ltr /usr
Port 28080: 28080 is the default web client port; confirm it is free: netstat -aptn | grep 28080
Port 8899: 8899 is the RPC port; confirm it is free: netstat -aptn | grep 8899
Dependency apr-util: confirm the apr-util package is installed: rpm -q apr-util

GPCC installs under /usr/local by default, where gpadmin normally has no write permission. You can create a separate directory owned by gpadmin and point the installer at it; I took the lazier route below. Note: do the same on every machine, and if other users' directories also live under /usr/local you must restore their ownership afterwards.

chown -R gpadmin:gpadmin /usr/local
# after the installation completes, run:
chown -R root:root /usr/local
chown -R gpadmin:gpadmin /usr/local/greenplum*

3.2 Installation

Installation documentation: https://gpcc.docs.pivotal.io/640/topics/install.html

There are four installation methods in total; the interactive method is used here.

Unpack:

unzip greenplum-cc-web-6.4.0-gp6-rhel7-x86_64.zip
ln -s greenplum-cc-web-6.4.0-gp6-rhel7-x86_64 greenplum-cc
chown -R gpadmin:gpadmin greenplum-cc*

Switch to the gpadmin user and install with the -W option:

su - gpadmin
cd greenplum-cc
./gpccinstall-6.4.0  -W  # prompts you to set the password here

 

After entering the password, keep pressing the space bar until you get through the license agreement and accept it.

3.3 Configure Environment Variables

vim ~/.bashrc  # add the following, then source ~/.bashrc

source /usr/local/greenplum-cc/gpcc_path.sh

3.4 Verify the Installation

Test with: gpcc status

Then check for the configured password file: ll -a ~   # the hidden .pgpass file was not found, so create one manually

vim ~/.pgpass
# add the following line; the last field is the password entered when installing GPCC
*:5432:gpperfmon:gpmon:1qaz@WSX

Then set the ownership and permissions:

chown gpadmin:gpadmin ~/.pgpass
chmod 600 ~/.pgpass

You can also change the password later; see: https://gpcc.docs.pivotal.io/640/topics/gpmon.html

Note: the .pgpass file also needs to be copied to gpadmin's home directory on the master standby host.
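One way to do that (lgh2 is the master standby host configured in section 2.10):

scp ~/.pgpass lgh2:~/
ssh lgh2 'chmod 600 ~/.pgpass'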

Then check the status and start GPCC:
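The original screenshots are omitted here; the underlying commands (available once gpcc_path.sh has been sourced) are:

gpcc status   # show whether the GPCC web server and agents are running
gpcc start    # start GPCC; the output ends with the web URL (port 28080 by default)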

 

Then open the URL printed at the end of the output and log in with the gpmon user and password.

That completes the GPCC installation and configuration.

4. PXF Initialization and Basic Configuration

4.1 Verify the Environment

Use gpstate to confirm that Greenplum is running.
Confirm that Java is installed.
Confirm that gpadmin can create files under /usr/local (any other directory works too).

Here I use the same trick as before to give the gpadmin user permission to create files under /usr/local:

chown -R gpadmin:gpadmin /usr/local
# after the installation completes, run:
chown -R root:root /usr/local
chown -R gpadmin:gpadmin /usr/local/greenplum*

4.2 Initialization

Reference: http://docs-cn.greenplum.org/v6/pxf/init_pxf.html

PXF_CONF=/usr/local/greenplum-pxf $GPHOME/pxf/bin/pxf cluster init
[gpadmin@lgh1~]$ PXF_CONF=/usr/local/greenplum-pxf $GPHOME/pxf/bin/pxf cluster init
Initializing PXF on master host and 2 segment hosts...
PXF initialized successfully on 3 out of 3 hosts

4.3 Configure Environment Variables

vim ~/.bashrc  # add the following, then source it

export PXF_CONF=/usr/local/greenplum-pxf
export PXF_HOME=/usr/local/greenplum-db/pxf/
export PATH=$PXF_HOME/bin:$PATH

4.4 Verify the Installation

PXF is installed only on the segment hosts, so only two hosts show up below.

[gpadmin@lgh1 ~]$ pxf cluster status
Checking status of PXF servers on 2 segment hosts...
ERROR: PXF is not running on 2 out of 2 hosts
lgh2 ==> Checking if tomcat is up and running...
ERROR: PXF is down - tomcat is not running...
lgh3 ==> Checking if tomcat is up and running...
ERROR: PXF is down - tomcat is not running...
[gpadmin@lgh1 ~]$ pxf cluster start
Starting PXF on 2 segment hosts...
PXF started successfully on 2 out of 2 hosts
[gpadmin@lgh1 ~]$ pxf cluster status
Checking status of PXF servers on 2 segment hosts...
PXF is running on 2 out of 2 hosts

Full pxf command reference: http://docs-cn.greenplum.org/v6/pxf/ref/pxf-ref.html

4.5 Configure the Hadoop and Hive Connectors

Official reference: http://docs-cn.greenplum.org/v6/pxf/client_instcfg.html

The official doc is a bit long-winded; in short, you just copy the relevant configuration files over and check the result. I use Hadoop and Hive as the default connector, so all the configuration files simply go under the default server directory:

[gpadmin@lgh1~]$ cd /usr/local/greenplum-pxf/servers/default/
[gpadmin@lgh1 default]$ ll
total 40
-rw-r--r-- 1 gpadmin gpadmin 4421 Mar 19 17:03 core-site.xml
-rw-r--r-- 1 gpadmin gpadmin 2848 Mar 19 17:03 hdfs-site.xml
-rw-r--r-- 1 gpadmin gpadmin 5524 Mar 23 14:58 hive-site.xml
-rw-r--r-- 1 gpadmin gpadmin 5404 Mar 19 17:03 mapred-site.xml
-rw-r--r-- 1 gpadmin gpadmin 1460 Mar 19 17:03 pxf-site.xml
-rw-r--r-- 1 gpadmin gpadmin 5412 Mar 19 17:03 yarn-site.xml

The core-site.xml, hdfs-site.xml, hive-site.xml, mapred-site.xml, and yarn-site.xml files above are the standard Hadoop and Hive configuration files and should not be modified. Since this is a CDH cluster, they can all be found under /etc/hadoop/conf and /etc/hive/conf.
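A minimal sketch of the copy step, assuming the CDH default paths just mentioned:

cp /etc/hadoop/conf/{core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml} /usr/local/greenplum-pxf/servers/default/
cp /etc/hive/conf/hive-site.xml /usr/local/greenplum-pxf/servers/default/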

pxf-site.xml is mainly used to configure access permissions; it lets gpadmin access the HDFS directories. This approach is fairly simple (there are other options as well). Its contents are as follows:

[gpadmin@lgh1 default]$ cat pxf-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>pxf.service.kerberos.principal</name>
        <value>gpadmin/[email protected]</value>
        <description>Kerberos principal pxf service should use. _HOST is replaced automatically with hostnames FQDN</description>
    </property>
    <property>
        <name>pxf.service.kerberos.keytab</name>
        <value>${pxf.conf}/keytabs/pxf.service.keytab</value>
        <description>Kerberos path to keytab file owned by pxf service with permissions 0400</description>
    </property>
    <property>
        <name>pxf.service.user.impersonation</name>
        <value>false</value>
        <description>End-user identity impersonation, set to true to enable, false to disable</description>
    </property>
    <property>
        <name>pxf.service.user.name</name>
        <value>gpadmin</value>
        <description>
            The pxf.service.user.name property is used to specify the login
            identity when connecting to a remote system, typically an unsecured
            Hadoop cluster. By default, it is set to the user that started the
            PXF process. If PXF impersonation feature is used, this is the
            identity that needs to be granted Hadoop proxy user privileges.
            Change it ONLY if you would like to use another identity to login to
            an unsecured Hadoop cluster
        </description>
    </property>
</configuration>

The steps that produced the file above (see the commands after this list):

1. Copy the template /usr/local/greenplum-pxf/templates/pxf-site.xml to the target server path /usr/local/greenplum-pxf/servers/default.
2. Edit the pxf.service.user.name and pxf.service.user.impersonation settings.
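For example, the two steps boil down to:

cp /usr/local/greenplum-pxf/templates/pxf-site.xml /usr/local/greenplum-pxf/servers/default/pxf-site.xml
vim /usr/local/greenplum-pxf/servers/default/pxf-site.xml   # set pxf.service.user.name and pxf.service.user.impersonation as shown above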

Note: you may need to copy the Hadoop cluster's hosts entries into the /etc/hosts file on every Greenplum host.

There is also a user impersonation / proxy-user approach, which requires restarting the Hadoop cluster; see: http://docs-cn.greenplum.org/v6/pxf/pxfuserimpers.html

4.6 Sync and Restart PXF

[gpadmin@lgh1 ~]$ pxf cluster stop
Stopping PXF on 2 segment hosts...
PXF stopped successfully on 2 out of 2 hosts
[gpadmin@lgh1 ~]$ pxf cluster sync
Syncing PXF configuration files from master host to 2 segment hosts...
PXF configs synced successfully on 2 out of 2 hosts
[gpadmin@lgh1 ~]$ pxf cluster start
Starting PXF on 2 segment hosts...
PXF started successfully on 2 out of 2 hosts

I will not go through testing here, since it is fairly tedious and has already been covered elsewhere; for testing, see: http://docs-cn.greenplum.org/v6/pxf/access_hdfs.html

 
