Setting up a Hive runtime environment

Hive is a data warehouse infrastructure built on top of Hadoop; it depends on the Hadoop client and the HDFS file system. Setting up a Hive runtime environment therefore only requires the following steps on a single machine:
1. Install the Hadoop client (assuming the Hadoop cluster itself is already up) and edit the hadoop-site.xml configuration file.
2. Set up Hive's metastore database, i.e. create the database in MySQL.
3. Install the Hive client and edit the hive-site.xml file.



Step 1. Download the Hadoop client and edit the hadoop-site.xml configuration file, then update the environment variables in .bash_profile.
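
A minimal sketch of this step, assuming the client ships as a tarball (the archive name is a placeholder; the target paths match the HADOOP_HOME set in the .bash_profile at the end of this article):

# Unpack the Hadoop client under the directory that HADOOP_HOME will point to
tar -xzf hadoop-client.tar.gz -C /home/work/soft/
# Edit the client-side cluster configuration (a full sample is given below)
vi /home/work/soft/hadoop-client/hadoop/conf/hadoop-site.xml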

Step 2. From an existing Hive metastore database, export the database and table creation statements as a SQL dump file, then import that dump file directly into the new MySQL database.
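
A minimal sketch of the export and import using mysqldump; the source host and database name are placeholders, while the target host, port, database and user are taken from the JDBC URL in the hive-site.xml sample below:

# Dump the schema and data of the existing Hive metastore into a SQL file
mysqldump -h old-metastore-host -P 3306 -u crm -p hive_meta_old > hive_meta.sql
# Create the new metastore database and load the dump into it
mysql -h cp01-ce.epc.XXX.com -P 7028 -u crm -p -e "CREATE DATABASE IF NOT EXISTS sz__crm"
mysql -h cp01-ce.epc.XXX.com -P 7028 -u crm -p sz__crm < hive_meta.sql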


Step 3. Configure hive-site.xml and modify .bash_profile.
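
As with the Hadoop client, a minimal sketch of installing the Hive client, assuming a tarball distribution (the archive name is a placeholder; the target path matches the HIVE_HOME set in .bash_profile):

# Unpack the Hive client under the directory that HIVE_HOME will point to
tar -xzf hive-2.3.33.tar.gz -C /home/work/soft/
# Edit the Hive configuration (a full sample is given below)
vi /home/work/soft/hive-2.3.33/conf/hive-site.xml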



A sample hadoop-site.xml is given below:


<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>


<configuration>


<!-- NEEDED TO CHANGE -->


<property>
  <name>hadoop.job.ugi</name>
  <value>user,passwd</value>
  <description>username, password used by client</description>
</property>


<property>
  <name>fs.default.name</name>
  <value>hdfs://nmg01-xxx.com:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>


<property>
  <name>mapred.job.tracker</name>
  <value>nmg01-khan-job.dmop.xxxx.com:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>


<property>
 <name>dfs.http.address</name>
 <value>nmg01-khan-hdfs.dmop.xxxx.com:8070</value>
 <description>
   The address and the base port where the dfs namenode web ui will listen on.
   If the port is 0 then the server will start on a free port.
 </description>
</property>


<property>
 <name>mapred.job.queue.name</name>
 <value>xxxx</value>
  <description>The queue this job is submitted to. If not set, the "default" queue is used.</description>
</property>


</configuration>



A sample hive-site.xml is given below:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>


<configuration>


<property>
  <name>hive.default.database.uri</name>
  <value>hdfs://nmg01-khan-hdfs.dmop.XXX.com:54310/app/ecom/aaa/hive</value>
  <description>location of default database for the warehouse</description>
</property>


<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/app/ecom/aaa/hive</value>
  <description>location of default database for the warehouse</description>
</property>


<property>
  <name>hive.exec.scratchdir</name>
  <value>hdfs://nmg01-khan-hdfs.dmop.XXX.com:54310/app/ecom/aaa/hive/scratch</value>
  <description>Scratch space for Hive jobs</description>
</property>


<property>
        <name>hive.stats.autogather</name>
        <value>false</value>
</property>






<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://cp01-ce.epc.XXX.com:7028/sz__crm?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=utf8</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>crm</value>
  <description>username to use against metastore database</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>12</value>
  <description>password to use against metastore database</description>
</property>


<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>


<property>
  <name>hive.querylog.location</name>
  <value>/home/work/soft/hive_log</value>
  <description></description>
</property>


</configuration>


Finally, the .bash_profile configuration file:



# .bash_profile


# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi


# User specific environment and startup programs


export HADOOP_HOME=/home/work/soft/hadoop-client/hadoop
export JAVA_HOME=/home/work/soft/hadoop-client/java6
export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib:.
export DQUERY_HOME=/home/work/hadoop/sdk
export TASK_HOME=/home/work/local/etljet
export PYTHONPATH=$TASK_HOME:$PYTHONPATH
export MYSQL_HOME=/home/work/local/mysql/bin
export DDBS_HOME=/home/work/local/ddbstool
export HIVE_HOME=/home/work/soft/hive-2.3.33








PATH=/home/work/local/Python-2.7.8/bin:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin:$HADOOP_HOME/bin:$TASK_HOME/bin:$MYSQL_HOME:$DDBS_HOME/bin:$DQUERY_HOME:$HIVE_HOME/bin:$SVN_HOME/bin:$ETLTEST_HOME/bin
export PYTHONPATH=$PYTHONPATH:$ETLTEST_HOME/src


LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/work/local/mysql/lib/mysql:/home/work/soft/hadoop-client/hadoop/lib/native/Linux-amd64-64
#####PATH=$PATH:$LIBRARY_PATH
export LANG=zh_CN.UTF-8
#export LANG=en_US


export LD_LIBRARY_PATH
alias menu=/home/work/fenglei02/bin/menu.sh
alias lst=/home/work/fenglei02/bin/list.sh
export PATH
unset USERNAME




export LANG=zh_CN.UTF8
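

After all three files are in place, a quick sanity check can be run; the commands below are only an illustration, and the output depends on the cluster and on the metastore imported in Step 2:

# Reload the environment variables
source ~/.bash_profile
# Verify that the Hadoop client can reach HDFS
hadoop fs -ls /
# Verify that Hive can reach both HDFS and the MySQL metastore
hive -e "show databases;"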

       




