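Chinese documentation: https://drill.smartloli.org/ (incomplete; reading the Chinese documentation is not recommended)
Core modules:
Execution flow diagram:
Execution flow description: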
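- RPC endpoint (accepts the SQL query): the client-facing endpoint. Clients are advised to connect through ZooKeeper (zk), which manages the distributed Drill cluster.
- SQL parser (SQL to logical operator syntax): Drill uses Apache Calcite to parse the SQL. Calcite is used to optimize the order of the SQL operators in the statement, and custom rules convert specific SQL operators into the corresponding logical operator syntax.
- Logical plan (logical operator syntax to logical plan): the logical plan describes the work required to produce the query result and defines which data sources and operations to apply.
- Optimizer (logical plan to the best physical plan): the optimizer applies various types of rules to rearrange operators and functions into an optimal plan, converting the logical plan into a physical plan that describes how to execute the query.
- The physical plan is split into multiple phases, called major fragments and minor fragments. These fragments form a multi-level execution tree that rewrites the query, executes it in parallel against the configured data sources, and sends the results back to the client or application.
- Major fragments: a major fragment does not itself execute any query task. Each major fragment is divided into one or more minor fragments (described next), which perform the operations needed to complete the query and return the results to the client. Major fragments are visible in the query profile in the Drill Web UI, and their details can also be viewed directly with the EXPLAIN PLAN FOR command. For example, to perform a hash aggregation of two files, Drill may create a plan with two major phases (major fragments): the first scans the two files and the second aggregates the data.
- Minor fragments: each major fragment is parallelized into minor fragments. A minor fragment is a logical unit of work that runs inside a thread; a logical unit of work in Drill is also called a slice. The execution plan that Drill creates is composed of minor fragments, and Drill assigns each one a MinorFragmentID. A major fragment is broken into as many minor fragments as can usefully run at the same time on the cluster. The number of minor fragments can be seen in the Web UI, and some configuration options change minor fragment behavior, such as the maximum number of slices.
- Storage plugin interface: metadata storage
- Query execution: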
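To make the major/minor fragment split concrete, here is a toy Python model of the two-file hash aggregation example. It is not Drill code: the names scan_fragment, aggregate_fragment, run_query and the in-memory FILES data are all invented for illustration. Each call to scan_fragment stands in for one minor fragment of the scan phase running in its own thread, and aggregate_fragment stands in for the second phase.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input: two "files", each a list of (key, value) rows.
FILES = {
    "a.json": [("x", 1), ("y", 2)],
    "b.json": [("x", 3), ("z", 4)],
}

def scan_fragment(name):
    """Stand-in for one minor fragment of the scan phase: read one file."""
    return list(FILES[name])

def aggregate_fragment(row_batches):
    """Stand-in for the aggregation phase: hash-aggregate all scanned rows."""
    totals = defaultdict(int)
    for batch in row_batches:
        for key, value in batch:
            totals[key] += value
    return dict(totals)

def run_query(max_width=2):
    # max_width caps how many scan fragments run at once. It is an
    # illustrative parameter, not the name of a real Drill option.
    with ThreadPoolExecutor(max_workers=max_width) as pool:
        batches = list(pool.map(scan_fragment, sorted(FILES)))
    return aggregate_fragment(batches)
```

Calling run_query() scans both files in parallel threads and then sums the values per key. In real Drill the fragments run on Drillbits across the cluster and exchange record batches over the network, which this sketch leaves out.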
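Installation and usage
1. Install JDK 8 or later
2. Set the JAVA_HOME and PATH environment variables
3. Install ZooKeeper: download zookeeper-3.3.4 from https://archive.apache.org/dist/zookeeper/,
extract it to D:\zookeeper-3.3.4,
copy D:\zookeeper-3.3.4\conf\zoo_sample.cfg to zoo.cfg,
create the folder D:\zookeeper-3.3.4\data,
and set dataDir=D:/zookeeper-3.3.4/data in zoo.cfg (use forward slashes or doubled backslashes: zoo.cfg is read as a Java properties file, where a single backslash is an escape character)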
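The copy-and-edit step can also be scripted. The sketch below is one way to do it in Python, assuming the install path used in these notes. make_zoo_cfg and write_zoo_cfg are hypothetical helpers, not part of ZooKeeper. The dataDir value is written with forward slashes because zoo.cfg is parsed as a Java properties file, in which a single backslash acts as an escape character.

```python
from pathlib import Path

def make_zoo_cfg(sample_text, data_dir):
    """Return zoo.cfg content: the sample text with dataDir set to data_dir."""
    value = str(data_dir).replace("\\", "/")  # properties-safe path
    lines, replaced = [], False
    for line in sample_text.splitlines():
        if line.strip().startswith("dataDir="):
            lines.append("dataDir=" + value)
            replaced = True
        else:
            lines.append(line)
    if not replaced:
        # The sample had no dataDir line, so append one.
        lines.append("dataDir=" + value)
    return "\n".join(lines) + "\n"

def write_zoo_cfg(home):
    """Create <home>/data and write <home>/conf/zoo.cfg from zoo_sample.cfg."""
    home = Path(home)
    (home / "data").mkdir(parents=True, exist_ok=True)
    sample = (home / "conf" / "zoo_sample.cfg").read_text()
    (home / "conf" / "zoo.cfg").write_text(make_zoo_cfg(sample, home / "data"))
```

On the Windows machine from these notes the call would be write_zoo_cfg(r"D:\zookeeper-3.3.4").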
4. Start ZooKeeper by running D:\zookeeper-3.3.4\bin\zkServer.cmd
5. Download Apache Drill: apache-drill-1.16.0.zip
6. Extract it to drive D: D:\apache-drill-1.16.0
7. Open CMD as administrator and change to D:\apache-drill-1.16.0\bin
8. Start Apache Drill: sqlline.bat -u "jdbc:drill:zk=local" or drill-embedded.bat
(Supported in Drill 1.16 and later.)
Startup fails with the following error:
WARNING: All illegal access operations will be denied in a future release
Error: Failure in starting embedded Drillbit: org.apache.drill.common.exceptions.DrillRuntimeException: Error during udf area creation [/C:/Users/Administrator/drill/udf/registry] on file system [file:///] (state=,code=0)
Fix: run the following commands in CMD:
mkdir "%userprofile%\drill"
mkdir "%userprofile%\drill\udf"
mkdir "%userprofile%\drill\udf\registry"
mkdir "%userprofile%\drill\udf\tmp"
mkdir "%userprofile%\drill\udf\staging"
takeown /R /F "%userprofile%\drill"
Then start Drill again: sqlline.bat -u "jdbc:drill:zk=local" or drill-embedded.bat
9. Verify the installation: enter !tables to list all the default system tables.
You can also run a test query to verify, for example:
apache drill>use cp;
+------+--------------------------------+
| ok | summary |
+------+--------------------------------+
| true | Default schema changed to [cp] |
+------+--------------------------------+
//Query the employee.json file in the classpath.
apache drill (cp)>SELECT * FROM cp.`employee.json` LIMIT 1;
+-------------+--------------+------------+-----------+-------------+----------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
| employee_id | full_name | first_name | last_name | position_id | position_title | store_id | department_id | birth_date | hire_date | salary | supervisor_id | education_level | marital_status | gender | management_role |
+-------------+--------------+------------+-----------+-------------+----------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
| 1 | Sheri Nowmer | Sheri | Nowmer | 1 | President | 0 | 1 | 1961-08-26 | 1994-12-01 00:00:00.0 | 80000.0 | 0 | Graduate Degree | S | F | Senior Management |
+-------------+--------------+------------+-----------+-------------+----------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
10. Open the Drill Web UI: http://localhost:8047/
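The same port also serves Drill's REST API, so the test query can be sent from a script instead of the shell. The following is a minimal sketch, assuming the Web UI address above and that user authentication has not been enabled. build_query_request and run_query are hypothetical helper names; the /query.json endpoint and the queryType and query fields come from Drill's REST API.

```python
import json
import urllib.request

DRILL_URL = "http://localhost:8047"  # Web UI address from the step above

def build_query_request(sql, base_url=DRILL_URL):
    """Build a POST request for Drill's /query.json REST endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def run_query(sql, base_url=DRILL_URL, timeout=30):
    """Send the query to a running Drillbit and return the decoded JSON."""
    request = build_query_request(sql, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With Drill running, run_query("SELECT * FROM cp.`employee.json` LIMIT 1") should return a dictionary whose result rows sit under the "rows" key. Building the request separately from sending it keeps the payload easy to inspect without a live server.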
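11. Set a login account and password
See https://blog.csdn.net/fwx02/article/details/101059736
12. Add MongoDB support
13. Add MySQL support
Official documentation: http://drill.apache.org/docs/using-the-jdbc-driver/
Put the JAR that provides com.mysql.jdbc.Driver into D:\apache-drill-1.16.0\jars\jdbc-driver (the official RDBMS storage plugin documentation copies third-party JDBC drivers into jars\3rdparty instead; use that directory if the driver class is not found)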
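Once the driver JAR is in place, Drill still needs a storage plugin definition that points at the MySQL server. The sketch below builds one and wraps it in a REST request. The host, port, credentials and the plugin name "mysql" are placeholder assumptions, and mysql_plugin_config and build_plugin_request are hypothetical helpers. The configuration keys follow Drill's JDBC (RDBMS) storage plugin format, and /storage/<name>.json is the REST endpoint that creates or updates a plugin.

```python
import json
import urllib.request

def mysql_plugin_config(host, port, username, password):
    """Return a JDBC storage plugin definition for a MySQL server."""
    return {
        "type": "jdbc",
        "driver": "com.mysql.jdbc.Driver",
        "url": "jdbc:mysql://%s:%d" % (host, port),
        "username": username,
        "password": password,
        "enabled": True,
    }

def build_plugin_request(name, config, base_url="http://localhost:8047"):
    """Build a POST to /storage/<name>.json that creates or updates the plugin."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        "%s/storage/%s.json" % (base_url, name),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To register the plugin on a running Drillbit, pass the result of build_plugin_request("mysql", mysql_plugin_config("localhost", 3306, "root", "secret")) to urllib.request.urlopen. The same JSON configuration can instead be pasted into the Storage tab of the Web UI.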
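14. Quit
!quit
15. Install Apache Drill on Linux
The steps are the same as on Windows, except that you start Drill with bin/drillbit.sh start rather than the bin/drill-embedded command shown on the official site. Running bin/drill-embedded fails with: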
[kduser@mdw apache-drill-1.16.0]$ bin/drill-embedded
Exception in thread "main" java.io.IOError: java.lang.UnsupportedOperationException
    at org.jline.utils.Curses.tputs(Curses.java:62)
    at org.jline.utils.Curses.tputs(Curses.java:45)
    at org.jline.keymap.KeyMap.key(KeyMap.java:243)
    at org.jline.reader.impl.LineReaderImpl.key(LineReaderImpl.java:5666)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at org.jline.reader.impl.LineReaderImpl.bindKeys(LineReaderImpl.java:5674)
    at org.jline.reader.impl.LineReaderImpl.emacs(LineReaderImpl.java:5387)
    at org.jline.reader.impl.LineReaderImpl.defaultKeyMaps(LineReaderImpl.java:5363)
    at org.jline.reader.impl.LineReaderImpl.<init>(LineReaderImpl.java:266)
    at org.jline.reader.LineReaderBuilder.build(LineReaderBuilder.java:115)
    at sqlline.SqlLine.getConsoleReader(SqlLine.java:626)
    at sqlline.SqlLine.begin(SqlLine.java:527)
    at sqlline.SqlLine.start(SqlLine.java:270)
    at sqlline.SqlLine.main(SqlLine.java:201)
Caused by: java.lang.UnsupportedOperationException
    at org.jline.utils.Curses.doTputs(Curses.java:78)
    at org.jline.utils.Curses.tputs(Curses.java:60)
    ... 21 more
If startup fails, check the configuration:
drill-override.conf
drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:2181"
}
Set the memory sizes:
drill-env.sh
export DRILL_MAX_DIRECT_MEMORY="2G"
export DRILL_HEAP="1G"
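16. MongoDB installation and configuration
16.1. RPM package installation (the RPM package is not recommended; the TGZ package installation steps are described below). Download: https://www.mongodb.com/download-center/community?jmp=nav
RPM package: mongodb-org-server-4.2.0-1.el6.x86_64.rpm
rpm -ivh mongodb-org-server-4.2.0-1.el6.x86_64.rpm
Edit dbPath and the systemLog path in /etc/mongod.conf, and comment out bindIp: 127.0.0.1 (since MongoDB 3.6 mongod binds to localhost by default, so set bindIp: 0.0.0.0 instead if remote clients must connect)
Create the dbPath and systemLog path directories with mkdir
Start: service mongod start
Stop: service mongod stop
Restart: service mongod restart
If you get an error saying the pid file already exists, delete the pid file and start again
Uninstall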
yum erase $(rpm -qa | grep mongodb-org)
rm -r /var/log/mongodb
rm -r /var/lib/mongo
16.2. TGZ package installation:
1. sudo vi /etc/profile
Insert the following lines:
export MONGODB_HOME=/apps/db/mongodb
export PATH=$PATH:$MONGODB_HOME/bin
Apply the change immediately:
source /etc/profile
2. Create the data and log directories
sudo mkdir -p /apps/db/mongodb/data/logs
sudo touch /apps/db/mongodb/data/logs/mongodb.log
sudo chmod -R 777 /apps/db/mongodb/data
3. Configure
sudo touch /apps/db/mongodb/mongodb.conf
sudo vim /apps/db/mongodb/mongodb.conf
# Data directory
dbpath=/apps/db/mongodb/data
# Log file
logpath=/apps/db/mongodb/data/logs/mongodb.log
# Port
port=27017
# Run as a daemon, i.e. in the background
fork=true
# nohttpinterface was removed in MongoDB 3.6; keep the next line only on older versions
nohttpinterface=true
4. Start
cd /apps/db/mongodb/bin
./mongod -f /apps/db/mongodb/mongodb.conf
5. Test the connection: curl localhost:27017
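curl only shows whether something answers on the port. The same check can be done from a script with a plain TCP connection. The sketch below uses only the Python standard library; port_is_open is a hypothetical helper name, and the default host and port are the ones configured above. It confirms that mongod is listening, not that authentication or queries work.

```python
import socket

def port_is_open(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

A False result right after startup usually means mongod has not finished starting or failed to start; the log file configured in logpath is the place to look.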
6. Install a MongoDB client and set up user, password, and permission control; Studio 3T is recommended