Walter's Drill Notes, Part 2: Installation and Deployment

Installation

System environment

Linux version: Red Hat 6

JDK: JDK 1.7

1. Local installation and testing

1.1 Installation

1.1.1 Download the Drill M1 binary release

http://people.apache.org/~jacques/apache-drill-1.0.0-m1.rc3/apache-drill-1.0.0-m1-binary-release.tar.gz

1.1.2 Extract apache-drill-1.0.0-m1-binary-release.tar.gz and create a symlink

tar -zxf apache-drill-1.0.0-m1-binary-release.tar.gz

Create the symlink:

ln -s apache-drill-1.0.0-m1 drill
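The extract-and-link step can be wrapped in a small helper so that running it a second time is harmless. This is only a sketch under a few assumptions: the function name install_drill and its two arguments are made up for illustration, and the tarball is expected to unpack into a single top-level directory, as apache-drill-1.0.0-m1 does above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Drill release): unpack a Drill tarball
# and point a stable "drill" symlink at the unpacked directory.
# Usage: install_drill <tarball> <target-dir>
install_drill() {
  local tarball="$1" target="$2" top
  mkdir -p "$target" || return 1
  # The first path component of the first archive entry is the top-level directory.
  top="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)"
  [ -n "$top" ] || return 1
  tar -zxf "$tarball" -C "$target" || return 1
  # -n replaces an existing "drill" symlink instead of creating a link inside it.
  ln -sfn "$top" "$target/drill"
}
```

For example, install_drill apache-drill-1.0.0-m1-binary-release.tar.gz "$HOME" would leave $HOME/drill pointing at the unpacked release.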

1.1.3 Configure environment variables

export DRILL_HOME=/home/{username}/drill

export PATH=$PATH:$DRILL_HOME/bin
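To make these two exports survive a new login, they are usually appended to a profile file such as .bashrc. A minimal sketch follows; the helper name add_drill_env is hypothetical, and it assumes the profile is a plain shell file that may be appended to. It writes the lines only once, so calling it again does not duplicate them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append the two Drill exports to a profile file
# only if DRILL_HOME is not already exported there.
# Usage: add_drill_env <profile-file> <drill-home>
add_drill_env() {
  local profile="$1" home="$2"
  touch "$profile" || return 1
  if ! grep -q '^export DRILL_HOME=' "$profile"; then
    {
      printf 'export DRILL_HOME=%s\n' "$home"
      # Single quotes keep $PATH and $DRILL_HOME unexpanded until the profile is sourced.
      printf '%s\n' 'export PATH=$PATH:$DRILL_HOME/bin'
    } >> "$profile"
  fi
}
```

A typical call would be add_drill_env "$HOME/.bashrc" "/home/{username}/drill", followed by opening a new shell or sourcing the file.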

1.2 Testing

1.2.1 Connect

[sudo] sqlline -u jdbc:drill:schema=parquet-local -n admin -p admin

Explanation: five schemas are defined out of the box:

parquet-local (local Parquet), parquet-cp (classpath Parquet), jsonl (local JSON), parquet (classpath-parquet), parquet

For the exact definitions, see conf/storage-engines.json.

1.2.2 Exit

jdbc:drill:schema=parquet-local> !q

1.2.3 Run a query

select * from "sample-data/region.parquet";

Query references

https://developers.google.com/bigquery/query-reference

https://cwiki.apache.org/confluence/display/DRILL/Running+Queries

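2. Distributed installation and testing

2.1 Installation

2.1.1 Install Hadoop

The Hadoop version that Drill currently supports natively is 1.2. For installation steps, see:

http://litongbupt.iteye.com/blog/1473179

http://litongbupt.iteye.com/blog/1473265

Start Hadoop.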

2.1.2 Install ZooKeeper

The official site recommends ZooKeeper 3.4.3; in my own testing, 3.4.5 also works.

Deploy and start ZooKeeper:

http://litongbupt.iteye.com/admin/blogs/1987737

2.1.3 Deploy Drill in distributed mode

  •  Edit conf/drill-override.conf: zk:connect:"{ZooKeeper address}:2181"
  •  Edit conf/storage-engines.json:

       "parquet" :

      {

        "type":"parquet",

        "dfsName" : "hdfs://{Hadoop NameNode address}:9000"

      },

    "json" :

      {

        "type":"json",

        "dfsName" : "hdfs://{Hadoop NameNode address}:9000"

      }

  •  Copy the drill directory to the other nodes
  •  Copy .bashrc to the other nodes
  •  Start Drill on every node:   sudo drillbit.sh start
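The last three steps repeat once per node, so it can help to generate the commands first and review them before running anything. The sketch below only prints commands. The function name print_deploy_commands, the use of scp and ssh, and the node names are all assumptions for illustration; it also calls drillbit.sh by its full path under the drill directory, on the assumption that a non-interactive ssh session may not have $DRILL_HOME/bin on its PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the per-node copy and start commands.
# Usage: print_deploy_commands <drill-dir> <node> [<node> ...]
print_deploy_commands() {
  local drill_dir="$1" node
  shift
  for node in "$@"; do
    # Copy the drill directory into the same parent directory on the remote node.
    printf 'scp -r %s %s:%s\n' "$drill_dir" "$node" "$(dirname "$drill_dir")"
    # Copy .bashrc into the remote home directory.
    printf 'scp %s/.bashrc %s:\n' "$HOME" "$node"
    # -t allocates a terminal so that sudo can prompt for a password.
    printf "ssh -t %s 'sudo %s/bin/drillbit.sh start'\n" "$node" "$drill_dir"
  done
}
```

For example, print_deploy_commands /home/{username}/drill node2 node3 prints six lines, three per node, which can be inspected and then pasted into a shell.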

2.2 Testing

2.2.1 Check that the Drill cluster started successfully

zkCli.sh -server {ZooKeeper address}:2181

get /drill/drillbits1

cZxid = 0x100000003

ctime = Tue Dec 10 10:18:42 CST 2013

mZxid = 0x100000003

mtime = Tue Dec 10 10:18:42 CST 2013

pZxid = 0x10000001c

cversion = 12

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 0

numChildren = 4

numChildren = 4 shows that four nodes were registered in this test.
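Reading numChildren by eye works for a handful of nodes; for a scripted check, the value can be pulled out of the same output. The following is a sketch, with the function names count_drillbits and check_drillbits invented here. It assumes the output contains a line of the form "numChildren = N", as in the listing above.

```shell
#!/usr/bin/env bash
# Read ZooKeeper node metadata on stdin and print the numChildren value.
# Exits non-zero if no numChildren line is present.
count_drillbits() {
  awk -F' *= *' '$1 == "numChildren" { print $2; found = 1 } END { exit !found }'
}

# Succeed only if the registered count equals the expected node count.
# Usage: <zkCli output> | check_drillbits <expected>
check_drillbits() {
  local expected="$1" actual
  actual="$(count_drillbits)" || { echo "numChildren not found" >&2; return 2; }
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $actual drillbits registered"
  else
    echo "MISMATCH: expected $expected, found $actual" >&2
    return 1
  fi
}
```

If your zkCli.sh accepts a command after the connection options, something like zkCli.sh -server {ZooKeeper address}:2181 get /drill/drillbits1 | check_drillbits 4 would do the whole check in one line; otherwise, save the output to a file and pipe that in.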

2.2.2 Test queries

Put the data on HDFS:   hadoop fs -put sample-data /

Connect to the cluster:   sqlline -u jdbc:drill:schema=parquet

SELECT _MAP['R_REGIONKEY'] as region_key, _MAP['R_NAME'] AS name, _MAP['R_COMMENT'] AS comment FROM "/sample-data/region.parquet";

SELECT count(distinct _MAP['N_REGIONKEY']) FROM "/sample-data/nation.parquet";

SELECT _MAP['N_REGIONKEY'] as regionKey, _MAP['N_NAME'] as name FROM "/sample-data/nation.parquet" WHERE cast(_MAP['N_NAME'] as varchar) < 'M';

2.3 Shutting down the cluster

2.3.1 Stop the Drill cluster

On every node, run: sudo drillbit.sh stop

2.3.2 Stop ZooKeeper

On every node, run: sudo zkServer.sh stop

2.3.3 Stop Hadoop (run on the NameNode)

sudo stop-all.sh
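The order matters here: Drill goes down first, then ZooKeeper, then Hadoop. As a sketch, the sequence can be printed for a given set of hosts and reviewed before it is run. The function name print_shutdown_commands and the host names are hypothetical, and the printed commands assume the three scripts are on the PATH of the remote shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the shutdown commands in order:
# all drillbits, then all ZooKeeper servers, then Hadoop on the NameNode.
# Usage: print_shutdown_commands <namenode> <node> [<node> ...]
print_shutdown_commands() {
  local namenode="$1" node
  shift
  for node in "$@"; do
    printf "ssh -t %s 'sudo drillbit.sh stop'\n" "$node"
  done
  for node in "$@"; do
    printf "ssh -t %s 'sudo zkServer.sh stop'\n" "$node"
  done
  printf "ssh -t %s 'sudo stop-all.sh'\n" "$namenode"
}
```

Calling print_shutdown_commands nn1 node1 node2 prints five lines, ending with the stop-all.sh call on nn1.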
