Big Data Platform Learning Path (4): Compiling and Using Hue

1. Background

The previous post covered installing Hive and compiling Spark, but writing SQL in the spark-sql shell or hive shell is not very convenient. This post therefore introduces Hue: how to compile it and put it to use.

2. Introduction to Hue

GitHub repository: https://github.com/cloudera/hue

Hue is an open-source SQL workbench for browsing, querying, and visualizing data in data warehouses: gethue.com

For more background, see these two posts by other authors:

https://blog.csdn.net/ywheel1989/article/details/51560312

https://blog.csdn.net/liangyihuai/article/details/54137163

3. Prepare the Files

https://codeload.github.com/cloudera/hue/zip/master

4. Install the Required Dependencies (per the official docs)

http://cloudera.github.io/hue/latest/administrator/installation/dependencies/

Run the following on the node where you intend to install Hue:

sudo apt-get install git ant gcc g++ libffi-dev libkrb5-dev libmysqlclient-dev libsasl2-dev libsasl2-modules-gssapi-mit libsqlite3-dev libssl-dev libxml2-dev libxslt-dev make maven libldap2-dev python-dev python-setuptools libgmp3-dev

Install Node.js:

sudo apt install curl
curl -sL https://deb.nodesource.com/setup_8.x | sudo bash -
sudo apt-get install -y nodejs
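
You can verify the installation afterwards; the major Node.js version should match the setup_8.x script used above:

node -v
npm -v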

5. Compile and Install Hue

Run the commands below. Cloning can be very slow over a poor connection; alternatively, download the zip from GitHub and copy it into the virtual machine.

git clone https://github.com/cloudera/hue.git
cd hue
make apps

Edit the settings file to switch the interface to Chinese (you can also leave the defaults and change the language later in the user settings):

gedit /home/hadoop/hue-master/desktop/core/src/desktop/settings.py

Change the language code as shown below, then rebuild the locales and install:
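
A minimal sketch of the change, assuming the stock settings.py where the default is LANGUAGE_CODE = 'en-us' (the exact line may vary by Hue version):

LANGUAGE_CODE = 'zh_CN'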

$ make locales
$ make install

The install path afterwards is /usr/local/hue.
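
If you prefer a different location, the Hue Makefile accepts a PREFIX variable and installs into $PREFIX/hue; for example:

$ PREFIX=/opt make install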

While the commands above are running, you can install MySQL in parallel:

$ sudo apt-get install mysql-server mysql-client

Create the databases in MySQL and grant privileges. The hue database will hold Hue's own metadata; huedb is an extra database exposed through the [librdbms] section later.

create database hue charset utf8;
create database huedb charset utf8;

GRANT ALL PRIVILEGES ON hue.* TO 'hue'@'localhost' IDENTIFIED BY 'hue';
GRANT ALL PRIVILEGES ON hue.* TO 'hue'@'192.168.0.12' IDENTIFIED BY 'hue';
GRANT ALL PRIVILEGES ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
GRANT ALL PRIVILEGES ON huedb.* TO 'hue'@'localhost' IDENTIFIED BY 'hue';
GRANT ALL PRIVILEGES ON huedb.* TO 'hue'@'192.168.0.12' IDENTIFIED BY 'hue';
GRANT ALL PRIVILEGES ON huedb.* TO 'hue'@'%' IDENTIFIED BY 'hue';
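
Before moving on, it is worth confirming that the hue account can log in and see both databases (a quick check, assuming the credentials above):

$ mysql -u hue -phue -e "SHOW DATABASES;"
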
Next, edit Hue's configuration file and set the following (adjust hosts and paths to match your cluster):

$ vim /usr/local/hue/desktop/conf/pseudo-distributed.ini
[desktop]
  secret_key='7c07c6b8fbb5048d06d1ff6150f67efcc1cb921f'
  http_host=192.168.0.12
  http_port=8888
  time_zone=Asia/Shanghai
  server_user=hue
  server_group=hadoop
  default_user=hue
  default_hdfs_superuser=hadoop
  [[database]]
    engine=mysql
    host=localhost
    port=3306
    user=hue
    password=hue
    name=hue
[notebook]
  show_notebooks=true
  enable_external_statements=true
  enable_batch_execute=true
  enable_sql_indexer=false
  enable_presentation=true
  enable_query_builder=true
  enable_query_scheduling=false
  [[interpreters]]
    [[[mysql]]]
      name = MySQL
      interface=sqlalchemy
      ## https://docs.sqlalchemy.org/en/latest/dialects/mysql.html
      ## options='{"url": "mysql://root:root@localhost:3306/hue"}'
      options='{"url": "mysql://hue:hue@localhost:3306/hue"}'
    [[[sparksql]]]
      name=SparkSql
      interface=hiveserver2
    #[[[sparksql]]]
      ## name=SparkSql
      ##interface=sqlalchemy
      ##options='{"url": "hive://hive@data1:10000/mdw"}'
[dashboard]
  is_enabled=true
  has_sql_enabled=true
  [[engines]]
    analytics=true
    nesting=false
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://master:9000
      logical_name=hadoop
      webhdfs_url=http://master:50070/webhdfs/v1
      hadoop_hdfs_home=/usr/local/hadoop-2.8.5
      hadoop_conf_dir=/usr/local/hadoop-2.8.5/etc/hadoop
      hadoop_bin=/usr/local/hadoop-2.8.5/bin
  [[yarn_clusters]]
    [[[default]]]
      resourcemanager_host=master
      resourcemanager_port=8032
      submit_to=true
      resourcemanager_api_url=http://master:8088
      proxy_api_url=http://master:8088
      history_server_api_url=http://master:19888
      spark_history_server_url=http://master:18088
[beeswax]
  hive_server_host=192.168.0.11
  hive_server_port=10000
  hive_metastore_host=192.168.0.11
  hive_metastore_port=9083
  hive_conf_dir=/usr/local/spark-2.3.3/conf
  server_conn_timeout=120
[metastore]
  enable_new_create_table=true
  force_hs2_metadata=true
[spark]
  livy_server_url=http://master:8998
  livy_server_host=192.168.0.10
  livy_server_session_kind=yarn
  csrf_enabled=false
  sql_server_host=192.168.0.11
  sql_server_port=10000
[jobbrowser]
  disable_killing_jobs=false
  enable_v2=true
  enable_query_browser=true
[librdbms]
  [[databases]]
    [[[mysql]]]
      nice_name="My SQL DB"
      name=huedb
      engine=mysql
      host=localhost
      port=3306
      user=hue
      password=hue
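
Note that the [[hdfs_clusters]] section relies on WebHDFS, so dfs.webhdfs.enabled must be set to true in hdfs-site.xml on the NameNode. A quick sanity check against the webhdfs_url configured above:

$ curl "http://master:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hadoop"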

Add Hue's environment variables:

$ vim ~/.bashrc

export HUE_ENV=/usr/local/hue/build/env
export PATH=$PATH:$HUE_ENV/bin
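
Reload the shell configuration so the new PATH takes effect:

$ source ~/.bashrc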

Initialize the database:

$ hue syncdb
$ hue migrate
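
If the initialization succeeded, the hue database now contains Hue's tables; you can confirm with the MySQL client (assuming the credentials configured above):

$ mysql -u hue -phue hue -e "SHOW TABLES;"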

Then start Hue; the first login at http://192.168.0.12:8888 will prompt you to create the admin account and password:

$ supervisor

To run Hive or Spark SQL queries, you need Hadoop, the Hive metastore, and spark-thriftserver.sh running first (do not use HiveServer2; it is not compatible with this setup), as sketched below.
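
A sketch of the required services, assuming the Hadoop, Hive, and Spark install paths used in the configuration above:

# HDFS and YARN
/usr/local/hadoop-2.8.5/sbin/start-dfs.sh
/usr/local/hadoop-2.8.5/sbin/start-yarn.sh
# Hive metastore, running in the background
nohup hive --service metastore &
# Spark Thrift Server (in place of HiveServer2)
/usr/local/spark-2.3.3/sbin/start-thriftserver.sh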

To run Scala, Spark, or PySpark code, however, you additionally need the Livy service.

The next step is to configure livy-server.

6. Configure livy-server

Download Livy (it only needs to be installed on the master node):

 http://archive.apache.org/dist/incubator/livy/0.5.0-incubating/livy-0.5.0-incubating-bin.zip

Unzip it into the /usr/local directory, as sketched below.
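
A minimal sketch, assuming the zip was downloaded to the current directory; the extracted folder is renamed to match the livy-0.5.0 paths used below:

$ sudo unzip livy-0.5.0-incubating-bin.zip -d /usr/local/
$ sudo mv /usr/local/livy-0.5.0-incubating-bin /usr/local/livy-0.5.0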

Edit livy.conf (in Livy's conf directory; if only livy.conf.template exists, copy it to livy.conf first):


livy.server.session.factory = yarn
# What port to start the server on.
livy.server.port = 8998
livy.repl.enable-hive-context = true
livy.server.csrf_protection.enabled = false
# What spark master Livy sessions should use.
livy.spark.master = yarn-client

# What spark deploy mode Livy sessions should use.
livy.spark.deploy-mode = client

Edit livy-env.sh and add the following:

export LIVY_HOME=/usr/local/livy-0.5.0
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_CONF_DIR=/usr/local/hadoop-2.8.5/etc/hadoop
export SPARK_HOME=/usr/local/spark-2.3.3
export SPARK_CONF_DIR=/usr/local/spark-2.3.3/conf
export SCALA_HOME=/usr/local/scala-2.11.8
export LIVY_LOG_DIR=${LIVY_HOME}/logs
export LIVY_PID_DIR=${LIVY_HOME}/run
export PYSPARK_ALLOW_INSECURE_GATEWAY=1

Configure the environment variables (in ~/.bashrc) by adding the following:

export PATH=$PATH:$NPM_HOME/bin
export LIVY_HOME=/usr/local/livy-0.5.0
export PATH=$PATH:$LIVY_HOME/bin
export PYSPARK_ALLOW_INSECURE_GATEWAY=1

Apply the updated environment variables:

source ~/.bashrc

You also need to add the following to spark-env.sh:

export PYSPARK_ALLOW_INSECURE_GATEWAY=1

Start Livy:

livy-server start
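
To verify that Livy is up, query its REST API (assuming the port configured above); a fresh server should return an empty session list:

$ curl http://master:8998/sessions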

You can now use Scala and PySpark from Hue.

To stop Livy:

livy-server stop

7. Hue Configuration Complete

You are welcome to get in touch so we can learn from each other. My configuration files are shared on Weiyun for reference; if you run into problems, leave a comment or email me at [email protected] and I will get back to you as soon as I can.

Configuration files: https://share.weiyun.com/5nfMu7B (password: 74bx47)
