Setting up a Hadoop/Spark platform

add user

useradd ITS-Hadoop
passwd ITS-Hadoop

Passwordless SSH access

ssh-keygen -t rsa -P ''
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
chmod 700 ~/.ssh

Switch to root

su
/root/script/scpfloder.sh /home/ITS-Hadoop/.ssh /home/ITS-Hadoop/
/root/script/runcommand.sh "chown -R ITS-Hadoop:ITS-Hadoop /home/ITS-Hadoop/.ssh"
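
The scpfloder.sh and runcommand.sh helpers are not shown in these notes. A minimal sketch of what a runcommand.sh-style helper might look like, assuming it simply loops over the other nodes and runs the command over SSH (run_on_all, the host arguments and the DRY_RUN switch are hypothetical names, not taken from the original scripts):

```shell
# Hypothetical helper: run one command on each listed host over SSH.
# usage: run_on_all "<command>" host1 [host2 ...]
# With DRY_RUN=1 the ssh command lines are printed instead of executed.
run_on_all() {
    local cmd="$1" host
    shift
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf 'ssh %s "%s"\n' "$host" "$cmd"
        else
            # BatchMode makes ssh fail instead of prompting when key login is not set up.
            ssh -o BatchMode=yes "$host" "$cmd"
        fi
    done
}
```

Running it with DRY_RUN=1 first, for example DRY_RUN=1 run_on_all "chown -R ITS-Hadoop:ITS-Hadoop /home/ITS-Hadoop/.ssh" hadoop5 hadoop6, shows what would be executed without touching any node.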

Configure the base software on the master node

hadoop

chown -R ITS-Hadoop:ITS-Hadoop /usr/local/hadoop-2.6.0

Edit the configuration files

cd /usr/local/hadoop-2.6.0/etc/hadoop/
ls

capacity-scheduler.xml      hadoop-policy.xml           mapred-env.cmd
configuration.xsl           hdfs-site.xml               mapred-env.sh
container-executor.cfg      httpfs-env.sh               mapred-queues.xml.template
core-site.xml               httpfs-log4j.properties     mapred-site.xml
.core-site.xml.swn          httpfs-signature.secret     mapred-site.xml.template
.core-site.xml.swo          httpfs-site.xml             slaves
.core-site.xml.swp          kms-acls.xml                ssl-client.xml.example
hadoop-env.cmd              kms-env.sh                  ssl-server.xml.example
hadoop-env.sh               kms-log4j.properties        yarn-env.cmd
hadoop-metrics2.properties  kms-site.xml                yarn-env.sh
hadoop-metrics.properties   log4j.properties            yarn-site.xml

The files that need to be configured are:

# core-site.xml hadoop-env.sh hdfs-site.xml yarn-env.sh yarn-site.xml slaves
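
The notes list the files but not their contents. As a rough illustration only, a minimal core-site.xml could be generated as follows. The property names fs.defaultFS and hadoop.tmp.dir are standard Hadoop 2.x settings, but the NameNode address hadoop7:9000, the temporary directory and the helper name write_core_site are assumptions, not the original configuration:

```shell
# Write a minimal core-site.xml to the given path.
# usage: write_core_site <output-file> [namenode-host:port] [tmp-dir]
# The default host/port and tmp dir are assumptions for illustration.
write_core_site() {
    local out="$1" namenode="${2:-hadoop7:9000}" tmpdir="${3:-/home/ITS-Hadoop/tmp}"
    cat > "$out" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://${namenode}</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>${tmpdir}</value>
  </property>
</configuration>
EOF
}
```

Writing to a scratch file first (write_core_site /tmp/core-site.xml) makes it easy to compare against the existing file before replacing it.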

zookeeper

chown -R ITS-Hadoop:ITS-Hadoop /usr/local/zookeeper-3.4.6

Edit the configuration files

cd /usr/local/zookeeper-3.4.6/
vi conf/zoo.cfg

    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial 
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between 
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just 
    # example sakes.
    dataDir=/usr/local/zookeeper-3.4.6/var/data
    dataLogDir=/usr/local/zookeeper-3.4.6/var/datalog
    # the port at which the clients will connect
    clientPort=2181
    server.1=hadoop5:2888:3888
    server.2=hadoop6:2888:3888
    server.3=hadoop7:2888:3888

vi var/data/myid

    3
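
The number in var/data/myid has to match the N of the server.N line for that host in zoo.cfg, so the value 3 above belongs to hadoop7. A small sketch that looks the id up instead of typing it by hand (zk_myid_for is a hypothetical helper name):

```shell
# Print the ZooKeeper id that zoo.cfg assigns to a host.
# usage: zk_myid_for <hostname> <path-to-zoo.cfg>
zk_myid_for() {
    local host="$1" cfg="$2"
    # Match lines such as "server.3=hadoop7:2888:3888" (leading spaces allowed)
    # and print only the number between "server." and "=".
    sed -n "s/^[[:space:]]*server\.\([0-9][0-9]*\)=${host}:.*/\1/p" "$cfg"
}
```

From /usr/local/zookeeper-3.4.6 on each node, something like zk_myid_for hadoop5 conf/zoo.cfg > var/data/myid would then write the matching id. The function prints nothing when the host is not listed, so check the output before relying on it.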

hbase

chown -R ITS-Hadoop:ITS-Hadoop /usr/local/hbase-1.1.4

Edit the configuration files

cd /usr/local/hbase-1.1.4/conf/
ls

hadoop-metrics2-hbase.properties  hbase-env.sh      hbase-site.xml    regionservers
hbase-env.cmd                     hbase-policy.xml  log4j.properties

The files that need to be configured are:

# hbase-env.sh hbase-site.xml regionservers

spark

chown -R ITS-Hadoop:ITS-Hadoop /usr/local/scala-2.10.4
chown -R ITS-Hadoop:ITS-Hadoop /usr/local/spark-1.4.1-bin-hadoop2.6

Edit the configuration files

cd /usr/local/spark-1.4.1-bin-hadoop2.6/conf/
ls

derby.log          log4j.properties             slaves
docker.properties  metrics.properties           spark-defaults.conf
fairscheduler.xml  metrics.properties.template  spark-env.sh

The files that need to be configured are:

# spark-defaults.conf spark-env.sh slaves

Edit the environment variable configuration file

vi /etc/profile

export JAVA_HOME=/usr/local/java/jdk1.7
export JRE_HOME=/usr/local/java/jdk1.7/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

export HADOOP_HOME=/usr/local/hadoop-2.6.0
export HADOOP_DEV_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

export ZOOKEEPER_HOME=/usr/local/zookeeper-3.4.6
export PATH=$ZOOKEEPER_HOME/bin:$PATH

export HBASE_HOME=/usr/local/hbase-1.1.4
export PATH=$HBASE_HOME/bin:$PATH

export SPARK_HOME=/usr/local/spark-1.4.1-bin-hadoop2.6
export PATH=$SPARK_HOME/bin:$PATH
export SPARK_EXAMPLES_JAR=$SPARK_HOME/lib/spark-examples-1.4.1-hadoop2.6.0.jar

export SCALA_HOME=/usr/local/scala-2.10.4
export PATH=$SCALA_HOME/bin:$PATH

export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib:$SPARK_HOME/lib:$HIVE_HOME/lib:$HBASE_HOME/lib:$SCALA_HOME/lib
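
A quick sanity check after running source /etc/profile, sketched under the assumption that every *_HOME variable should name an existing directory (check_homes is a hypothetical helper). The CLASSPATH line references $HIVE_HOME, which this excerpt never sets, so the check would flag it unless it is defined elsewhere:

```shell
# Report every named variable that is unset or does not point at a directory.
# usage: check_homes JAVA_HOME HADOOP_HOME ...
# Returns 0 when all variables are fine, 1 otherwise.
check_homes() {
    local var dir status=0
    for var in "$@"; do
        eval "dir=\${$var:-}"      # read the variable whose name is stored in $var
        if [ -z "$dir" ] || [ ! -d "$dir" ]; then
            echo "$var is unset or not a directory: '$dir'"
            status=1
        fi
    done
    return "$status"
}
```

For this profile the call would be check_homes JAVA_HOME JRE_HOME HADOOP_HOME ZOOKEEPER_HOME HBASE_HOME SPARK_HOME SCALA_HOME HIVE_HOME.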

Clean up the original installation files

Omitted.

Copy files (as the ITS-Hadoop user)

~/script/scpfloder.sh 
~/script/scpfloder.sh /usr/local/hadoop-2.6.0 /usr/local
~/script/runcommand.sh "chown -R ITS-Hadoop:ITS-Hadoop /usr/local/hadoop-2.6.0"
~/script/scpfloder.sh /usr/local/zookeeper-3.4.6 /usr/local
~/script/runcommand.sh "chown -R ITS-Hadoop:ITS-Hadoop /usr/local/zookeeper-3.4.6"
~/script/scpfloder.sh /usr/local/hbase-1.1.4 /usr/local
~/script/runcommand.sh "chown -R ITS-Hadoop:ITS-Hadoop /usr/local/hbase-1.1.4"
~/script/scpfloder.sh /usr/local/scala-2.10.4 /usr/local
~/script/runcommand.sh "chown -R ITS-Hadoop:ITS-Hadoop /usr/local/scala-2.10.4"
~/script/scpfloder.sh /usr/local/spark-1.4.1-bin-hadoop2.6 /usr/local
~/script/runcommand.sh "chown -R ITS-Hadoop:ITS-Hadoop /usr/local/spark-1.4.1-bin-hadoop2.6"

Update the ZooKeeper myid file on each node

Omitted.

Create the required directories

mkdir /home/ITS-Hadoop/hbase /home/ITS-Hadoop/hbase/logs /home/ITS-Hadoop/hbase/tmp
~/script/runcommand.sh "mkdir /home/ITS-Hadoop/hbase /home/ITS-Hadoop/hbase/logs /home/ITS-Hadoop/hbase/tmp"
mkdir /home/ITS-Hadoop/dfs /home/ITS-Hadoop/dfs/name /home/ITS-Hadoop/dfs/log /home/ITS-Hadoop/data /home/ITS-Hadoop/tmp
~/script/runcommand.sh "mkdir /home/ITS-Hadoop/dfs /home/ITS-Hadoop/dfs/name /home/ITS-Hadoop/dfs/log /home/ITS-Hadoop/dfs/data /home/ITS-Hadoop/dfs/tmp"

Format HDFS

/usr/local/hadoop-2.6.0/bin/hdfs namenode -format 

First kill all the processes listed by jps (and delete the files under /tmp)

hdfs dfs -mkdir /hbase
hdfs dfs -mkdir /sparkLog

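Install Thrift

The Thrift compiler is written in C++. Before installing it, make sure the operating system can compile C++ and install the required dependency packages, as follows: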

yum install automake libtool flex bison pkgconfig gcc-c++ boost-devel libevent-devel zlib-devel python-devel ruby-devel openssl-devel

Download the Thrift package, extract it, then build and install it

wget http://archive.apache.org/dist/thrift/0.9.1/thrift-0.9.1.tar.gz
tar xf thrift-0.9.1.tar.gz
cd thrift-0.9.1
./configure
make
make install
thrift --help

Generate the HBase Thrift interface

thrift --gen py /usr/local/hbase-1.1.4-src/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

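This creates a gen-py directory in the current directory.
Hbase.py defines the methods that an HBase client can call.
ttypes.py defines the data types that the HBase client sends and receives.
Copy the generated hbase directory into Python's package directory (the path depends on your PYTHONPATH).
On some systems it is /usr/lib/python2.7/site-packages/, on others /usr/local/python27/lib/python2.7/site-packages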

cp -r gen-py/hbase /usr/local/python27/lib/python2.7/site-packages  

Start the HBase and Thrift services

./bin/start-hbase.sh  
./bin/hbase-daemon.sh start thrift  

Install Redis

# yum install redis -y
wget http://download.redis.io/releases/redis-3.2.0.tar.gz
tar xf redis-3.2.0.tar.gz
cd redis-3.2.0
make
make install

After make install, the redis-server command is in /usr/local/bin; if that directory is not on your PATH, add it.

Install the Redis Python package

Install easy_install

( https://pypi.python.org/pypi/setuptools#downloads )
wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py
python ez_setup.py --insecure
Installing easy_install script to /usr/local/python27/bin

Create a symbolic link

ln -s /usr/local/python27/bin/easy_install /usr/local/bin/

Install the redis package

easy_install redis

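Install pip

Install Ganglia

Install the Ganglia services

# Install on all nodes
yum install ganglia-gmond -y
# Install on the master node. As with the gmond installation, if the local repository does not provide gmetad, EPEL must be installed first.
yum install ganglia-gmetad -y  

Install gweb (on the master node)

## First install Apache and PHP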
yum install httpd php -y
vim /etc/php.d/json.ini

    extension=json.so

Download and unpack gweb (on the master node)

wget http://downloads.sourceforge.net/project/ganglia/ganglia-web/3.7.1/ganglia-web-3.7.1.tar.gz
tar -xf ganglia-web-3.7.1.tar.gz
cd ganglia-web-3.7.1

Edit the Makefile

vim Makefile
    # Change the default settings:
    GDESTDIR = /var/www/html/ganglia2
    APACHE_USER = apache
    # Note: GDESTDIR and APACHE_USER must be consistent with the DocumentRoot and the apache user in the Apache configuration file (/etc/httpd/conf/httpd.conf)

make install

Django

tar xf Django-1.6.11.tar
cd Django-1.6.11
python setup.py install

Copy the project

scp -r project/gugou 10.2.15.107:~/project
scp -r project/BYSJ 10.2.15.107:~/project

Start the cluster

#start hadoop
/usr/local/hadoop-2.6.0/sbin/start-dfs.sh
/usr/local/hadoop-2.6.0/sbin/start-yarn.sh

# start hbase
## start zookeeper
/usr/local/zookeeper-3.4.6/bin/zkServer.sh start
ssh hadoop5 "/usr/local/zookeeper-3.4.6/bin/zkServer.sh start"
ssh hadoop6 "/usr/local/zookeeper-3.4.6/bin/zkServer.sh start"
## start hbase
/usr/local/hbase-1.1.4/bin/start-hbase.sh
## if the regionservers did not start, start them manually
~/script/runcommand.sh "source /etc/profile;/usr/local/hbase-1.1.4/bin/hbase-daemon.sh start regionserver"
## start thrift
/usr/local/hbase-1.1.4/bin/hbase-daemon.sh start thrift

#start spark
/usr/local/spark-1.4.1-bin-hadoop2.6/sbin/start-all.sh
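
After the start scripts finish, jps on each node should list the expected daemons. A sketch of a check that reads jps output and prints the names that are missing (missing_daemons is a hypothetical helper; which daemon names to pass depends on the role of the node):

```shell
# Print each expected daemon that does not appear in `jps` output read from stdin.
# usage: jps | missing_daemons NameNode ResourceManager HMaster
missing_daemons() {
    local running name
    running="$(awk '{ print $2 }')"      # jps prints "<pid> <main class>"
    for name in "$@"; do
        # -x matches whole lines only, so NameNode is not satisfied by SecondaryNameNode.
        printf '%s\n' "$running" | grep -qx "$name" || echo "$name"
    done
}
```

On the master node a call such as jps | missing_daemons NameNode ResourceManager QuorumPeerMain HMaster prints nothing when all four are up.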

Start the services

# gmond must be started on every node
service gmond start
# Run on the master node
service gmetad start
# /usr/bin/nc localhost 8651 &

service mysqld start & 

redis-server &

Install the Python thrift package

easy_install thrift

Install the MySQLdb module

wget --no-check-certificate -O MySQL-python-1.2.3.tar.gz https://sourceforge.net/projects/mysql-python/files/mysql-python/1.2.3/MySQL-python-1.2.3.tar.gz/download
tar xf MySQL-python-1.2.3.tar.gz
cd MySQL-python-1.2.3
python setup.py install

Install the requests module

easy_install requests

Install the happybase module

easy_install happybase

Install JPype

wget --no-check-certificate https://pypi.python.org/packages/3c/94/b620c0e0143c864141ea572a7ad831d8233d84d5702cef692bc039f1c9c1/JPype1-0.6.1.tar.gz
tar xf JPype1-0.6.1.tar.gz 
cd JPype1-0.6.1
python setup.py install
# 
pip install JayDeBeApi

Install rrdtool

yum install cairo-devel libxml2-devel pango-devel pango libpng-devel freetype freetype-devel libart_lgpl-devel
wget http://oss.oetiker.ch/rrdtool/pub/rrdtool-1.3.1.tar.gz
tar xf rrdtool-1.3.1.tar.gz
cd rrdtool-1.3.1
./configure --prefix=/usr/local/rrdtool && make && make install
ln -s /usr/local/rrdtool/bin/* /usr/bin/

Install python-rrdtool

wget --no-check-certificate https://pypi.python.org/packages/99/af/bf46df3104d78591f942278467a1016d056a887c808ed1127207a4e1ebaf/python-rrdtool-1.4.7.tar.gz
tar xf python-rrdtool-1.4.7.tar.gz
cd python-rrdtool-1.4.7
python setup.py install

Import data into the database

Omitted.

Start the application services

cd /home/ITS-Hadoop/project/BYSJ/wrapper
python ETLWrapperServer.py &

service iptables stop
python manage.py runserver 0.0.0.0:9999


Stop the cluster

# stop spark
    /usr/local/spark-1.4.1-bin-hadoop2.6/sbin/stop-all.sh

# stop hbase
    ## stop thrift
    /usr/local/hbase-1.1.4/bin/hbase-daemon.sh stop thrift
    ## stop regionservers
    ~/script/runcommand.sh "source /etc/profile;/usr/local/hbase-1.1.4/bin/hbase-daemon.sh stop regionserver"
    ## stop hbase
    /usr/local/hbase-1.1.4/bin/stop-hbase.sh
    ## stop zookeeper
    /usr/local/zookeeper-3.4.6/bin/zkServer.sh stop
    ssh hadoop5 "/usr/local/zookeeper-3.4.6/bin/zkServer.sh stop"
    ssh hadoop6 "/usr/local/zookeeper-3.4.6/bin/zkServer.sh stop"

# stop hadoop
    /usr/local/hadoop-2.6.0/sbin/stop-yarn.sh
    /usr/local/hadoop-2.6.0/sbin/stop-dfs.sh

Troubleshooting

RECEIVED SIGNAL 15: SIGTERM

2016-05-13 22:16:56,908 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: RECEIVED SIGNAL 15: SIGTERM
2016-05-13 22:16:56,912 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop7/127.0.0.1
************************************************************/
2016-05-16 09:46:47,963 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = hadoop7/127.0.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.6.0

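Note the line "host = hadoop7/127.0.0.1": the address here should be the LAN address, not 127.0.0.1. The cause is the /etc/hosts file.
Change the hostname mapped to ::1 so that it is no longer hadoop7.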
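
To spot this condition, one can list the loopback lines of /etc/hosts that mention the node's hostname. A minimal sketch (loopback_entries is a hypothetical helper; it only inspects the hosts file and does not query DNS):

```shell
# Print every loopback line (127.x.x.x or ::1) of a hosts file that names the given host.
# usage: loopback_entries <hostname> [hosts-file]
loopback_entries() {
    local host="$1" file="${2:-/etc/hosts}"
    awk -v h="$host" '
        /^[[:space:]]*#/ { next }                      # ignore comment lines
        $1 ~ /^127\./ || $1 == "::1" {                 # loopback addresses only
            for (i = 2; i <= NF; i++)
                if ($i == h) { print; break }
        }
    ' "$file"
}
```

On a correctly configured node, loopback_entries hadoop7 prints nothing; any line it does print is a candidate for the fix described above.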

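Python upgrade

After upgrading Python, yum still requires Python 2.6, so yum has to be pointed back at the 2.6 interpreter.
After upgrading Python, some module paths no longer resolve correctly; add symbolic links to fix them.

RegionServer starts and then stops again

The node clocks are probably out of sync; enabling time synchronization fixes it.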
vi /etc/crontab

0-59/10 * * * * root /usr/sbin/ntpdate ITS-Hadoop10

/home/ITS-Hadoop/script/./scpfile.sh /etc/crontab /etc

Kill a Hadoop job

/usr/local/hadoop-2.6.0/bin/yarn application -kill application_1464678840184_0026

Ganglia permission problem

There was an error collecting ganglia data (127.0.0.1:8652): fsockopen error: Permission denied
Reference: http://www.songyawei.cn/content/2064
Switching SELinux to permissive mode (until the next reboot) resolves it:
setenforce 0

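Ganglia cannot retrieve hardware information

Cause: the newly installed gmetad treats hostnames case-insensitively, so the directories holding the generated RRD files are lowercase, whereas we look them up using uppercase hostnames and therefore cannot find them. See the following configuration file excerpt for details.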

vi /etc/ganglia/gmetad.conf

# In earlier versions of gmetad, hostnames were handled in a case
# sensitive manner
# If your hostname directories have been renamed to lower case,
# set this option to 0 to disable backward compatibility.
# From version 3.2, backwards compatibility will be disabled by default.
# default: 1   (for gmetad < 3.2)
# default: 0   (for gmetad >= 3.2)
case_sensitive_hostnames 1

Download and install zlib

Official site: http://www.zlib.net/
http://prdownloads.sourceforge.net/libpng/zlib-1.2.8.tar.gz?download

wget -O zlib-1.2.8.tar.gz "http://prdownloads.sourceforge.net/libpng/zlib-1.2.8.tar.gz?download"
tar -xvzf zlib-1.2.8.tar.gz
cd zlib-1.2.8
./configure
make
sudo make install