Chapter 5: Hands-on Environment Setup

5-1 Course Outline

Hands-on environment setup:

JDK installation
ZooKeeper installation
Scala installation
HBase installation
Maven installation
Spark installation
Hadoop installation
IDEA + Maven + Spark Streaming

JDK and ZooKeeper were already installed in an earlier course, so they are not covered again here.

5-2 Scala Installation

1. Download

wget https://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz

2. Extract

tar -zxvf scala-2.11.8.tgz -C /home/hadoop/app/

3. Configure the environment variables

vi ~/.bash_profile

export SCALA_HOME=/home/hadoop/app/scala-2.11.8

export PATH=$SCALA_HOME/bin:$PATH

source ~/.bash_profile

4. Verify the installation

Run scala; if the Scala REPL prompt appears, the installation succeeded.
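You can also check the version number directly; for Scala 2.11.8 the output should look roughly like this (the exact copyright line may vary):

scala -version
Scala code runner version 2.11.8 -- Copyright 2002-2016, LAMP/EPFL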

5-3 Maven Installation

1. Download

wget http://mirror.bit.edu.cn/apache/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz

2. Extract

tar -zxvf apache-maven-3.5.4-bin.tar.gz -C /home/hadoop/app/

3. Configure the environment variables

vi ~/.bash_profile

export MAVEN_HOME=/home/hadoop/app/apache-maven-3.5.4

export PATH=$MAVEN_HOME/bin:$PATH

source ~/.bash_profile

4. Verify the installation

mvn -v
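If the PATH is set up correctly, mvn -v prints the Maven version and the JDK it runs on, roughly like this (paths and exact build info will differ):

Apache Maven 3.5.4
Maven home: /home/hadoop/app/apache-maven-3.5.4
Java version: 1.8.0_161, vendor: Oracle Corporation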

5. Change the default local repository path

vi /home/hadoop/app/apache-maven-3.5.4/conf/settings.xml

<localRepository>/path/to/local/repo</localRepository>
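For example, to keep downloaded artifacts under the hadoop user's home directory (the path below is only an illustration; any writable directory works):

<localRepository>/home/hadoop/maven_repo</localRepository>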

 

5-4 Hadoop Environment Setup

 

I. Hadoop environment setup

1. Set up passwordless SSH login before installing

Passwordless SSH login (this step can be skipped, but then you will have to type the password by hand every time the Hadoop processes are restarted):

ssh-keygen -t rsa

After running it, check the generated keys: cd ~/.ssh/

cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
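A quick way to confirm the setup works (some sshd configurations also require restrictive permissions on the key file):

chmod 600 ~/.ssh/authorized_keys   # required by some sshd configurations
ssh localhost                      # should log in without a password prompt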

2. Download

wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz

3. Extract

tar -zxvf hadoop-2.6.0-cdh5.7.0.tar.gz -C /home/hadoop/app/

4. Configure the parameters

1) hadoop-env.sh

Check JAVA_HOME: echo $JAVA_HOME

Add the JDK configuration:

export JAVA_HOME=/root/java/jdk1.8.0_161

2)core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/app/tmp</value>
</property>

3)hdfs-site.xml

<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>

For details, see the official guide: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html

5. Configure the environment variables (vi ~/.bash_profile)

export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0

export PATH=$HADOOP_HOME/bin:$PATH

source ~/.bash_profile

6. Format HDFS

Note: this step is performed only once, before the first use; reformatting an instance that already holds data wipes its metadata.

1) Format the filesystem:

$ bin/hdfs namenode -format

Equivalently, from the bin directory run ./hdfs namenode -format

2) Start HDFS

Start NameNode daemon and DataNode daemon:

$ sbin/start-dfs.sh

 

7. Verify the installation

jps
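For this single-node setup, jps should now list the HDFS daemons in addition to Jps itself:

NameNode
DataNode
SecondaryNameNode

The NameNode also serves a web UI, by default at http://localhost:50070 in Hadoop 2.x.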


II. YARN environment setup


Reference documentation:

http://hadoop.apache.org/docs/r3.0.3/hadoop-project-dist/hadoop-common/SingleCluster.html

1. Configuration

YARN on a Single Node

1) Configure parameters as follows:

etc/hadoop/mapred-site.xml:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

etc/hadoop/yarn-site.xml:

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

2. Start

Start ResourceManager daemon and NodeManager daemon:

$ sbin/start-yarn.sh

3. Verify

jps should now additionally show:

ResourceManager
NodeManager

Web UI:

Browse the web interface for the ResourceManager; by default it is available at http://localhost:8088/

4. Stop YARN:

$ sbin/stop-yarn.sh

5-5 HBase Installation

1. Download

wget http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.15.0.tar.gz

2. Extract

tar -zxvf hbase-1.2.0-cdh5.15.0.tar.gz -C /home/hadoop/app/

3. Configure the environment variables

vi ~/.bash_profile

export HBASE_HOME=/home/hadoop/app/hbase-1.2.0-cdh5.15.0

export PATH=$HBASE_HOME/bin:$PATH

source ~/.bash_profile

Check the variable:

echo $HBASE_HOME

4. Configuration file: vi hbase-env.sh

1) Export the JDK path:

export JAVA_HOME=/root/java/jdk1.8.0_161

2) Set export HBASE_MANAGES_ZK=false (so HBase uses the standalone ZooKeeper installed earlier instead of managing its own)

3) Configure: vim hbase-site.xml

<property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop000:8020/hbase</value>
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop000:2181</value>
</property>

4) Configure vim regionservers: list the region server hostnames, one per line (here, hadoop000)

5) Start ZooKeeper (./zkServer.sh start), then start HBase: ./start-hbase.sh

6) Verify the startup

1) jps

Two new processes should appear:

HMaster
HRegionServer

2) In a browser, open hadoop000:60010
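A quick functional check can also be done from the HBase shell (status and list are built-in shell commands; $HBASE_HOME/bin is already on the PATH):

hbase shell
hbase(main):001:0> status
hbase(main):002:0> list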

 

5-6 Spark Environment Setup

 

1. Download from the official site (source release, to be built from source)

 

http://spark.apache.org/downloads.html

wget https://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0.tgz

 

Build steps

http://spark.apache.org/docs/latest/building-spark.html

Prerequisites

1)The Maven-based build is the build of reference for Apache Spark. Building Spark using Maven requires Maven 3.3.9 or newer and Java 8+. Note that support for Java 7 was removed as of Spark 2.2.0.

2) export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

Maven build command

./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package

Prerequisite: this assumes some familiarity with Maven.

./build/mvn -Pyarn -Phive -Phive-thriftserver -DskipTests clean package

Building Spark from source

There are two approaches: the plain mvn build shown above, or make-distribution.sh, which builds and packages a deployable distribution.
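A plausible make-distribution.sh invocation (flags per the Spark build documentation) that would produce the CDH-flavoured tarball extracted in step 2 below; adjust profiles and versions to your environment:

./dev/make-distribution.sh --name 2.6.0-cdh5.7.0 --tgz \
  -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver \
  -Dhadoop.version=2.6.0-cdh5.7.0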

 

2. Extract

 

tar -zxvf spark-2.2.0-bin-2.6.0-cdh5.7.0.tgz -C /home/hadoop/app/

 

3. Configure the environment variables (vi ~/.bash_profile)

export SPARK_HOME=/home/hadoop/app/spark-2.2.0-bin-2.6.0-cdh5.7.0

export PATH=$SPARK_HOME/bin:$PATH

source ~/.bash_profile

4. Verify the installation

./spark-shell --master local[2]
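The command above is run from $SPARK_HOME/bin (or simply spark-shell from anywhere, since bin is on the PATH). Inside the shell, a one-liner confirms that Spark can actually schedule work; the sum of 1..100 is exactly 5050:

scala> sc.parallelize(1 to 100).sum
res0: Double = 5050.0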

 

5-7 Development Environment Setup

 

Use IDEA together with Maven to set up the Spark Streaming development environment.

 

 

Add the corresponding dependencies to pom.xml:

 

<properties>
    <scala.version>2.11.8</scala.version>
    <kafka.version>0.9.0.0</kafka.version>
    <spark.version>2.2.0</spark.version>
    <hadoop.version>2.6.0-cdh5.7.0</hadoop.version>
    <hbase.version>1.2.0-cdh5.7.0</hbase.version>
</properties>

<!-- add the cloudera repository -->
<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
    </repository>
</repositories>

<!-- Hadoop dependency -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>

<!-- HBase dependencies -->
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>${hbase.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>${hbase.version}</version>
</dependency>

<!-- Spark Streaming dependency -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>${spark.version}</version>
</dependency>
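Once the dependencies resolve, a minimal Spark Streaming application is a good end-to-end check of the IDEA + Maven setup. The sketch below is a socket word count; the object name, host, and port are arbitrary choices for this illustration:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Minimal word count over a socket stream, used only to verify the setup.
object NetworkWordCount {
  def main(args: Array[String]): Unit = {
    // local[2]: one thread receives data, the other processes it
    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Read lines from a socket; feed it with: nc -lk 9999
    val lines = ssc.socketTextStream("localhost", 9999)
    val counts = lines.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Run nc -lk 9999 in a terminal, start the object from IDEA, and each 5-second batch prints the word counts of whatever was typed into the nc session.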
