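Installation
System environment
Linux distribution: Red Hat 6
JDK: JDK 1.7
1. Local installation and testing
1.1 Installation
1.1.1 Download the Drill M1 binary release
1.1.2 Extract apache-drill-1.0.0-m1-binary-release.tar.gz and create a symbolic link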
tar -zxf apache-drill-1.0.0-m1-binary-release.tar.gz
Create the symbolic link:
ln -s apache-drill-1.0.0-m1 drill
1.1.3 Configure environment variables
export DRILL_HOME=/home/{username}/drill
export PATH=$PATH:$DRILL_HOME/bin
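To keep these variables across logins, they can be appended to a shell startup file. The following is a minimal sketch, assuming bash and that ~/.bashrc is the startup file in use (section 2.1.3 copies .bashrc to the other nodes). The function name add_drill_env is hypothetical and not part of Drill.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Drill): append the Drill variables
# to a startup file, but only once.
#   $1 = startup file (for example ~/.bashrc)
#   $2 = Drill install path (the "drill" symlink created above)
add_drill_env() {
    local rc_file="$1" drill_home="$2"
    # Skip if DRILL_HOME is already exported there, so reruns are harmless.
    if [ -f "$rc_file" ] && grep -q '^export DRILL_HOME=' "$rc_file"; then
        return 0
    fi
    {
        printf 'export DRILL_HOME=%s\n' "$drill_home"
        # Single quotes keep $PATH and $DRILL_HOME literal in the file.
        printf 'export PATH=$PATH:$DRILL_HOME/bin\n'
    } >> "$rc_file"
}

# Example (not executed here):
#   add_drill_env "$HOME/.bashrc" "$HOME/drill" && . "$HOME/.bashrc"
```

The guard on the existing export line is a design choice: it lets the helper run again after an upgrade without duplicating entries.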
1.2 Testing
1.2.1 Connect
[sudo] sqlline -u jdbc:drill:schema=parquet-local -n admin -p admin
Note: five schemas are defined out of the box:
parquet-local (local Parquet), parquet-cp (classpath Parquet), jsonl (local JSON), parquet (classpath-parquet), parquet
For the exact definitions, see conf/storage-engines.json.
1.2.2 Exit
jdbc:drill:schema=parquet-local> !q
1.2.3 Run a query
select * from "sample-data/region.parquet";
Query syntax references
https://developers.google.com/bigquery/query-reference
https://cwiki.apache.org/confluence/display/DRILL/Running+Queries
2. Distributed installation and testing
2.1 Installation
2.1.1 Install Hadoop
Drill currently supports Hadoop 1.2 natively. For Hadoop installation, see:
http://litongbupt.iteye.com/blog/1473179
http://litongbupt.iteye.com/blog/1473265
Start Hadoop.
2.1.2 Install ZooKeeper
The official site recommends ZooKeeper 3.4.3; in the author's tests, 3.4.5 also works.
Deploy and start ZooKeeper:
http://litongbupt.iteye.com/admin/blogs/1987737
2.1.3 Deploy Drill in distributed mode
- Edit conf/drill-override.conf and set zk:connect:"{ZooKeeper address}:2181"
- Edit conf/storage-engines.json
"parquet" :
{
"type":"parquet",
"dfsName" : "hdfs://{Hadoop NameNode address}:9000"
},
"json" :
{
"type":"json",
"dfsName" : "hdfs://{Hadoop NameNode address}:9000"
}
- Copy the drill directory to the other nodes
- Copy .bashrc to the other nodes
- Start Drill on every node: sudo drillbit.sh start
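The last three steps can be scripted. The following is a rough sketch, not a tested deployment tool. It assumes passwordless ssh/scp to each node, that DRILL_HOME is set as in section 1.1.3, and that each node uses the same home directory layout. The helper names run_cmd and deploy_drill and the node names are hypothetical. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Drill). DRY_RUN=1 (the default) prints
# each command instead of executing it; set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# deploy_drill NODE...: copy Drill and .bashrc to each node, then start a drillbit.
deploy_drill() {
    local node
    for node in "$@"; do
        # Copy the Drill directory into the remote home directory.
        run_cmd scp -r "$DRILL_HOME" "$node:"
        # Copy the environment settings.
        run_cmd scp "$HOME/.bashrc" "$node:.bashrc"
        # Start the drillbit; -t gives sudo a terminal for its password prompt.
        # The relative path assumes the copy above landed in ~/drill.
        run_cmd ssh -t "$node" "sudo drill/bin/drillbit.sh start"
    done
}

# Example (not executed here; node names are placeholders):
#   deploy_drill node2 node3 node4
```

Reviewing the printed commands before switching to DRY_RUN=0 is a cheap way to catch a wrong path or host name.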
2.2 Testing
2.2.1 Verify that the Drill cluster started successfully
zkCli.sh -server {ZooKeeper address}:2181
get /drill/drillbits1
cZxid = 0x100000003
ctime = Tue Dec 10 10:18:42 CST 2013
mZxid = 0x100000003
mtime = Tue Dec 10 10:18:42 CST 2013
pZxid = 0x10000001c
cversion = 12
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 4
numChildren = 4 shows that four nodes were registered in this test.
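The same check can be done without reading the output by eye. This is a minimal sketch that assumes the "name = value" output format shown above. The function name count_drillbits is hypothetical, and passing the get command directly on the zkCli.sh command line is an assumption about the installed ZooKeeper version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read zkCli.sh "get" output on stdin and print
# the value of the numChildren field.
count_drillbits() {
    # Split each line on " = " and match the field name exactly.
    awk -F' = ' '$1 == "numChildren" { print $2 }'
}

# Example (not executed here):
#   zkCli.sh -server {ZooKeeper address}:2181 get /drill/drillbits1 2>/dev/null | count_drillbits
```

Comparing the printed number with the number of nodes on which drillbit.sh was started shows whether every drillbit registered.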
2.2.2 Test queries
Upload the data to HDFS: hadoop fs -put sample-data /
Connect to the cluster: sqlline -u jdbc:drill:schema=parquet
SELECT _MAP['R_REGIONKEY'] as region_key, _MAP['R_NAME'] AS name, _MAP['R_COMMENT'] AS comment FROM "/sample-data/region.parquet";
SELECT count(distinct _MAP['N_REGIONKEY']) FROM "/sample-data/nation.parquet";
SELECT _MAP['N_REGIONKEY'] as regionKey, _MAP['N_NAME'] as name FROM "/sample-data/nation.parquet" WHERE cast(_MAP['N_NAME'] as varchar) < 'M';
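These queries share one pattern: each column is read through _MAP['COLUMN'] and given an alias, and the file path is quoted with plain double quotes. The sketch below generates such a statement from a path and a list of column names. The function name map_select and the choice of the lowercased column name as alias are illustrative assumptions, not Drill conventions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a SELECT statement in the _MAP['COL'] style.
#   $1 = file path as seen by the schema, remaining args = column names
map_select() {
    local path="$1"
    shift
    local cols="" col alias
    for col in "$@"; do
        # Alias each column with its lowercased name (an arbitrary choice).
        alias=$(printf '%s' "$col" | tr '[:upper:]' '[:lower:]')
        # Prefix with ", " only when the list is already non-empty.
        cols="${cols:+$cols, }_MAP['$col'] AS $alias"
    done
    printf 'SELECT %s FROM "%s";\n' "$cols" "$path"
}

# Example:
#   map_select /sample-data/region.parquet R_REGIONKEY R_NAME
```

The generated text can be pasted at the sqlline prompt. Producing the quotes in a script avoids the typographic quotes that editors sometimes substitute and that the parser rejects.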
2.3 Shut down the cluster
2.3.1 Stop the Drill cluster
On every node, run: sudo drillbit.sh stop
2.3.2 Stop ZooKeeper
On every node, run: sudo zkServer.sh stop
2.3.3 Stop Hadoop by running the following on the NameNode
sudo stop-all.sh
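The shutdown order in section 2.3 (Drill first, then ZooKeeper, then Hadoop) can be captured in one script. This is only a sketch under the same assumptions as the manual steps: passwordless ssh, and drillbit.sh, zkServer.sh, and stop-all.sh available on the PATH that sudo uses on each node. The helper names and host names are hypothetical, and by default the commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers. DRY_RUN=1 (the default) prints each command
# instead of executing it; set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# stop_cluster NAMENODE NODE...: stop services in the order of section 2.3.
stop_cluster() {
    local namenode="$1"
    shift
    local node
    # 2.3.1: stop every drillbit first, while ZooKeeper is still up.
    for node in "$@"; do
        run_cmd ssh -t "$node" "sudo drillbit.sh stop"
    done
    # 2.3.2: then stop ZooKeeper on every node.
    for node in "$@"; do
        run_cmd ssh -t "$node" "sudo zkServer.sh stop"
    done
    # 2.3.3: finally stop Hadoop from the NameNode.
    run_cmd ssh -t "$namenode" "sudo stop-all.sh"
}

# Example (not executed here; host names are placeholders):
#   stop_cluster namenode1 node1 node2 node3 node4
```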