Druid Single-Machine Deployment
Plenty of articles already introduce Druid and real-time big-data analytics, so this post skips the background. It focuses on standing up a Druid environment: Imply ships a complete distribution that bundles the dependencies, Druid itself, a graphical data-exploration UI, a SQL query component, and more. It also covers configuring Tranquility Server for push ingestion.
1. Prerequisites
- Java 8: https://download.oracle.com/otn-pub/java/jdk/8u191-b12/2787e4a523244c269598db4e85c51e0c/jdk-8u191-linux-x64.tar.gz
- Node.js 4.5.x
- Linux or Mac OS X (Windows is not supported)
- At least 4GB of RAM
2. Install Java 8
- Create a Java directory: mkdir /usr/local/java
- Unpack the JDK: tar -zxvf jdk-8u191-linux-x64.tar.gz
- Configure the environment variables in /etc/profile:
# JAVA_HOME
export JAVA_HOME=/usr/local/java/jdk1.8.0_191
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
- Reload the profile for the variables to take effect: source /etc/profile
- Verify the JDK: java -version
3. Install Node.js
1. Download the build that matches your system from the official site:
English site: https://nodejs.org/en/download/
Chinese site: http://nodejs.cn/download/
2. Upload the tar file to the server, unpack it, and expose it globally via symlinks:
1) Upload it to any path you like; here it lives in /usr/local/software
2) Unpack it: tar -xvf node-v10.13.0-linux-x64.tar.xz
3) Create the symlinks to make node and npm global:
- ln -s /usr/local/software/node-v10.13.0-linux-x64/bin/npm /usr/local/bin/
- ln -s /usr/local/software/node-v10.13.0-linux-x64/bin/node /usr/local/bin/
4) Finally, check that Node.js is available globally: node -v printing a version confirms the install.
4. Download and Install Imply
- Download the latest release from https://imply.io/get-started
- tar -zxvf imply-2.7.12.tar.gz
- cd imply-2.7.12
- Start the stack: nohup bin/supervise -c conf/supervise/quickstart.conf > quickstart.log &
- If startup fails with a Perl-related error, install perl first (on CentOS 7: yum install perl) and start again.
Verifying the installation
**Import test data.** The package ships with sample data and predefined ingestion specs you can run directly.
# Import the data: run the following from the imply-2.7.12 directory
[root@strom imply-2.7.12]# bin/post-index-task --file quickstart/wikipedia-index.json
Beginning indexing data for wikipedia
Task started: index_wikipedia_2018-11-22T07:39:13.068Z
Task log: http://localhost:8090/druid/indexer/v1/task/index_wikipedia_2018-11-22T07:39:13.068Z/log
Task status: http://localhost:8090/druid/indexer/v1/task/index_wikipedia_2018-11-22T07:39:13.068Z/status
Task index_wikipedia_2018-11-22T07:39:13.068Z still running...
...
Task finished with status: SUCCESS
Completed indexing data for wikipedia. Now loading indexed data onto the cluster...
wikipedia is 0.0% finished loading...
...
wikipedia loading complete! You may now query your data
[root@strom imply-2.7.12]#
5. Web Consoles
- Overlord console: http://192.168.164.136:8090/console.html
- Druid cluster console: http://192.168.164.136:8081
- Data visualization UI: http://192.168.164.136:9095
- Data queries: can be run from the visualization UI
6. Druid Data Ingestion
Druid data ingestion falls into two broad categories:
1. Real-time ingestion, in two flavors: pull and push.
- Pull: requires starting a Realtime Node, which ingests different kinds of sources through different Firehoses.
- Push: requires starting Tranquility or the Kafka indexing service; data is pushed in over HTTP.
2. Real-time data ingestion
2.1 Pull
Since Realtime Nodes offer neither high availability nor scalability, Tranquility Server or the Tranquility Kafka indexing service is recommended for important workloads.
2.2 Push
Data ingestion through Tranquility comes in two forms:
Tranquility Server: senders push data to Druid through the HTTP API exposed by Tranquility Server.
Tranquility Kafka: senders first write data to Kafka; Tranquility Kafka consumes it according to its configuration and writes it into Druid.
2.2.1 Configuring Tranquility Server
Enable Tranquility Server: on the data node, edit conf/supervise/quickstart.conf and uncomment the Tranquility Server line.
[root@strom imply-2.7.12]# cd conf/supervise/
[root@strom supervise]# ls
data.conf master-no-zk.conf master-with-zk.conf query.conf quickstart.conf
[root@strom supervise]# vi quickstart.conf
:verify bin/verify-java
:verify bin/verify-default-ports
:verify bin/verify-version-check
:kill-timeout 10
!p10 zk bin/run-zk conf-quickstart
coordinator bin/run-druid coordinator conf-quickstart
broker bin/run-druid broker conf-quickstart
historical bin/run-druid historical conf-quickstart
!p80 overlord bin/run-druid overlord conf-quickstart
!p90 middleManager bin/run-druid middleManager conf-quickstart
imply-ui bin/run-imply-ui-quickstart conf-quickstart
# Uncomment to use Tranquility Server   (comment removed on the next line)
!p95 tranquility-server bin/tranquility server -configFile conf-quickstart/tranquility/server.json
# Uncomment to use Tranquility Kafka
#!p95 tranquility-kafka bin/tranquility kafka -configFile conf-quickstart/tranquility/kafka.json
# Uncomment to use Tranquility Clarity metrics server
#!p95 tranquility-metrics-server java -Xms2g -Xmx2g -cp "dist/tranquility/lib/*:dist/tranquility/conf" com.metamx.tranquility.distribution.DistributionMain server -configFile conf-quickstart/tranquility/server-for-metrics.yaml
:wq!
2.2.2 Review conf-quickstart/tranquility/server.json
{
"dataSources" : [
{
"spec" : {
"dataSchema" : {
"dataSource" : "tutorial-tranquility-server",
"parser" : {
"type" : "string",
"parseSpec" : {
"timestampSpec" : {
"column" : "timestamp",
"format" : "auto"
},
"dimensionsSpec" : {
"dimensions" : [],
"dimensionExclusions" : [
"timestamp",
"value"
]
},
"format" : "json"
}
},
"granularitySpec" : {
"type" : "uniform",
"segmentGranularity" : "hour",
"queryGranularity" : "none"
},
"metricsSpec" : [
{
"type" : "count",
"name" : "count"
},
{
"name" : "value_sum",
"type" : "doubleSum",
"fieldName" : "value"
},
{
"fieldName" : "value",
"name" : "value_min",
"type" : "doubleMin"
},
{
"type" : "doubleMax",
"name" : "value_max",
"fieldName" : "value"
}
]
},
"ioConfig" : {
"type" : "realtime"
},
"tuningConfig" : {
"type" : "realtime",
"maxRowsInMemory" : "50000",
"intermediatePersistPeriod" : "PT10M",
"windowPeriod" : "PT10M"
}
},
"properties" : {
"task.partitions" : "1",
"task.replicants" : "1"
}
}
],
"properties" : {
"zookeeper.connect" : "localhost",
"druid.discovery.curator.path" : "/druid/discovery",
"druid.selectors.indexing.serviceName" : "druid/overlord",
"http.port" : "8200",
"http.threads" : "40",
"serialization.format" : "smile",
"druidBeam.taskLocator": "overlord"
}
}
- "dataSource" : "tutorial-tranquility-server" can be changed to any dataSource name you need; the ingest endpoint then becomes /v1/post/<your-dataSource>.
2.2.3 Restart the stack; first bring down the previous run:
[root@strom imply-2.7.12]# bin/service --down
[root@strom imply-2.7.12]# nohup bin/supervise -c conf/supervise/quickstart.conf > quickstart.log &
Output like the following indicates a successful start:
[root@strom imply-2.7.12]# tail -f quickstart.log
[Thu Nov 22 16:05:20 2018] Running command[zk], logging to[/usr/local/druid/imply-2.7.12/var/sv/zk.log]: bin/run-zk conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[coordinator], logging to[/usr/local/druid/imply-2.7.12/var/sv/coordinator.log]: bin/run-druid coordinator conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[broker], logging to[/usr/local/druid/imply-2.7.12/var/sv/broker.log]: bin/run-druid broker conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[historical], logging to[/usr/local/druid/imply-2.7.12/var/sv/historical.log]: bin/run-druid historical conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[overlord], logging to[/usr/local/druid/imply-2.7.12/var/sv/overlord.log]: bin/run-druid overlord conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[middleManager], logging to[/usr/local/druid/imply-2.7.12/var/sv/middleManager.log]: bin/run-druid middleManager conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[imply-ui], logging to[/usr/local/druid/imply-2.7.12/var/sv/imply-ui.log]: bin/run-imply-ui-quickstart conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[tranquility-server], logging to[/usr/local/druid/imply-2.7.12/var/sv/tranquility-server.log]: bin/tranquility server -configFile conf-quickstart/tranquility/server.json
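Once the services are back up, you can smoke-test the Tranquility HTTP endpoint before building the full multithreaded client in the next step. A minimal sketch using only the JDK, assuming the quickstart config above (Tranquility Server on localhost:8200, dataSource tutorial-tranquility-server); the event carries only a timestamp and a value, matching the server.json schema:
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
public class TranquilitySmokeTest {
    public static void main(String[] args) throws Exception {
        // The event timestamp must fall within windowPeriod (PT10M above) of now,
        // otherwise Tranquility drops the event instead of indexing it.
        String event = "{\"timestamp\":\"" + Instant.now() + "\",\"value\":42}";
        URL url = new URL("http://localhost:8200/v1/post/tutorial-tranquility-server");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(event.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
        try (java.util.Scanner s = new java.util.Scanner(conn.getInputStream(), "UTF-8")) {
            // On success, Tranquility reports received/sent counts in the JSON reply.
            System.out.println(s.useDelimiter("\\A").next());
        }
    }
}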
2.2.4 Writing a test client
// HttpUtil.java: a general-purpose HTTP client helper
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.security.GeneralSecurityException;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLSession;
import javax.net.ssl.SSLSocket;
import org.apache.commons.io.IOUtils;
import org.apache.commons.lang.StringUtils;
import org.apache.http.Consts;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.HttpClient;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.config.RequestConfig.Builder;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.conn.ConnectTimeoutException;
import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
import org.apache.http.conn.ssl.SSLContextBuilder;
import org.apache.http.conn.ssl.TrustStrategy;
import org.apache.http.conn.ssl.X509HostnameVerifier;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.apache.http.message.BasicNameValuePair;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
 * General-purpose HTTP utility: pooled client, GET/POST, and HTTPS support.
 * @author DARUI LI
 * @version 1.0.0
 * @date 2018-11-22 16:52
 */
public class HttpUtil {
public static final int connTimeout = 5000;
public static final int readTimeout = 5000;
public static final String charset = "UTF-8";
private static HttpClient client = null;
private static Logger logger = LoggerFactory.getLogger(HttpUtil.class);
static {
PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
cm.setMaxTotal(128);
cm.setDefaultMaxPerRoute(128);
client = HttpClients.custom().setConnectionManager(cm).build();
}
public static String postJson(String url, String json) throws Exception {
return post(url, json, "application/json", charset, connTimeout, readTimeout);
}
public static String postParameters(String url, String parameterStr) throws Exception {
return post(url, parameterStr, "application/x-www-form-urlencoded", charset, connTimeout, readTimeout);
}
public static String postParameters(String url, String parameterStr, String charset, Integer connTimeout, Integer readTimeout) throws Exception {
return post(url, parameterStr, "application/x-www-form-urlencoded", charset, connTimeout, readTimeout);
}
public static String postParameters(String url, Map<String, String> params) throws
Exception {
return postForm(url, params, null, connTimeout, readTimeout);
}
public static String postParameters(String url, Map<String, String> params, Integer connTimeout, Integer readTimeout) throws
Exception {
return postForm(url, params, null, connTimeout, readTimeout);
}
public static String get(String url) throws Exception {
return get(url, charset, null, null);
}
public static String get(String url, String charset) throws Exception {
return get(url, charset, connTimeout, readTimeout);
}
/**
 * Send a POST request using the specified character set.
 *
 * @param url         target URL
 * @param body        the request body
 * @param mimeType    e.g. "application/xml"; for "application/x-www-form-urlencoded" pass a=1&b=2&c=3
 * @param charset     character encoding
 * @param connTimeout connection timeout, in milliseconds
 * @param readTimeout response timeout, in milliseconds
 * @return the response body, decoded with the given charset
 * @throws ConnectTimeoutException if establishing the connection times out
 * @throws SocketTimeoutException if the response times out
 * @throws Exception
 */
public static String post(String url, String body, String mimeType, String charset, Integer connTimeout, Integer readTimeout)
throws Exception {
long startTime = System.currentTimeMillis();
HttpClient client = null;
HttpPost post = new HttpPost(url);
String result = "";
try {
if (StringUtils.isNotBlank(body)) {
HttpEntity entity = new StringEntity(body, ContentType.create(mimeType));
// HttpEntity entity = new StringEntity(body, ContentType.create(mimeType, charset));
post.setEntity(entity);
}
// apply timeout settings
Builder customReqConf = RequestConfig.custom();
if (connTimeout != null) {
customReqConf.setConnectTimeout(connTimeout);
}
if (readTimeout != null) {
customReqConf.setSocketTimeout(readTimeout);
}
post.setConfig(customReqConf.build());
HttpResponse res;
if (url.startsWith("https")) {
client = createSSLInsecureClient();
res = client.execute(post);
} else {
client = HttpUtil.client;
res = client.execute(post);
}
result = IOUtils.toString(res.getEntity().getContent(), charset);
long endTime = System.currentTimeMillis();
logger.info("HttpClient Method:[post] ,URL[" + url + "] ,Time:[" + (endTime - startTime) + "ms] ,result:" + result);
} finally {
post.releaseConnection();
if (url.startsWith("https") && client != null && client instanceof CloseableHttpClient) {
((CloseableHttpClient) client).close();
}
}
return result;
}
/**
 * Submit a form (application/x-www-form-urlencoded).
 *
 * @param url         target URL
 * @param params      form parameters
 * @param headers     extra request headers, may be null
 * @param connTimeout connection timeout, in milliseconds
 * @param readTimeout response timeout, in milliseconds
 * @return the response body
 * @throws ConnectTimeoutException if establishing the connection times out
 * @throws SocketTimeoutException if the response times out
 * @throws Exception
 */
public static String postForm(String url, Map<String, String> params, Map<String, String> headers, Integer connTimeout, Integer readTimeout) throws
Exception {
long startTime = System.currentTimeMillis();
HttpClient client = null;
HttpPost post = new HttpPost(url);
String result = "";
try {
if (params != null && !params.isEmpty()) {
List<NameValuePair> formParams = new ArrayList<NameValuePair>();
Set<Entry<String, String>> entrySet = params.entrySet();
for (Entry<String, String> entry : entrySet) {
formParams.add(new BasicNameValuePair(entry.getKey(), entry.getValue()));
}
UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formParams, Consts.UTF_8);
post.setEntity(entity);
}
if (headers != null && !headers.isEmpty()) {
for (Entry<String, String> entry : headers.entrySet()) {
post.addHeader(entry.getKey(), entry.getValue());
}
}
// apply timeout settings
Builder customReqConf = RequestConfig.custom();
if (connTimeout != null) {
customReqConf.setConnectTimeout(connTimeout);
}
if (readTimeout != null) {
customReqConf.setSocketTimeout(readTimeout);
}
post.setConfig(customReqConf.build());
HttpResponse res = null;
if (url.startsWith("https")) {
// execute an HTTPS request
client = createSSLInsecureClient();
res = client.execute(post);
} else {
// execute a plain HTTP request
client = HttpUtil.client;
res = client.execute(post);
}
result = IOUtils.toString(res.getEntity().getContent(), charset);
long endTime = System.currentTimeMillis();
logger.info("HttpClient Method:[postForm] ,URL[" + url + "] ,Time:[" + (endTime - startTime) + "ms] ,result:" + result);
} finally {
post.releaseConnection();
if (url.startsWith("https") && client != null && client instanceof CloseableHttpClient) {
((CloseableHttpClient) client).close();
}
}
return result;
}
/**
 * Send a GET request.
 *
 * @param url         target URL
 * @param charset     character encoding
 * @param connTimeout connection timeout, in milliseconds
 * @param readTimeout response timeout, in milliseconds
 * @return the response body
 * @throws ConnectTimeoutException if establishing the connection times out
 * @throws SocketTimeoutException if the response times out
 * @throws Exception
 */
public static String get(String url, String charset, Integer connTimeout, Integer readTimeout)
throws Exception {
long startTime = System.currentTimeMillis();
HttpClient client = null;
HttpGet get = new HttpGet(url);
String result = "";
try {
// apply timeout settings
Builder customReqConf = RequestConfig.custom();
if (connTimeout != null) {
customReqConf.setConnectTimeout(connTimeout);
}
if (readTimeout != null) {
customReqConf.setSocketTimeout(readTimeout);
}
get.setConfig(customReqConf.build());
HttpResponse res = null;
if (url.startsWith("https")) {
// execute an HTTPS request
client = createSSLInsecureClient();
res = client.execute(get);
} else {
// execute a plain HTTP request
client = HttpUtil.client;
res = client.execute(get);
}
result = IOUtils.toString(res.getEntity().getContent(), charset);
long endTime = System.currentTimeMillis();
logger.info("HttpClient Method:[postForm] ,URL[" + url + "] ,Time:[ " + (endTime - startTime) + "ms ] ,result:" + result);
} finally {
get.releaseConnection();
if (url.startsWith("https") && client != null && client instanceof CloseableHttpClient) {
((CloseableHttpClient) client).close();
}
}
return result;
}
/**
 * Extract the charset from a response's Content-Type header.
 *
 * @param response the HTTP response
 * @return the charset name, or null if none is present
 */
@SuppressWarnings("unused")
private static String getCharsetFromResponse(HttpResponse response) {
if (response.getEntity() != null && response.getEntity().getContentType() != null && response.getEntity().getContentType().getValue() != null) {
String contentType = response.getEntity().getContentType().getValue();
if (contentType.contains("charset=")) {
return contentType.substring(contentType.indexOf("charset=") + 8);
}
}
return null;
}
/**
 * Create an HTTPS client that trusts all certificates and host names
 * (insecure; for testing only).
 *
 * @return a CloseableHttpClient
 * @throws GeneralSecurityException
 */
private static CloseableHttpClient createSSLInsecureClient() throws GeneralSecurityException {
try {
SSLContext sslContext = new SSLContextBuilder().loadTrustMaterial(null, new TrustStrategy() {
public boolean isTrusted(X509Certificate[] chain, String authType) throws CertificateException {
return true;
}
}).build();
SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(sslContext, new X509HostnameVerifier() {
public boolean verify(String arg0, SSLSession arg1) {
return true;
}
public void verify(String host, SSLSocket ssl) throws IOException {
}
public void verify(String host, X509Certificate cert) throws SSLException {
}
public void verify(String host, String[] cns, String[] subjectAlts) throws SSLException {
}
});
return HttpClients.custom().setSSLSocketFactory(sslsf).build();
} catch (GeneralSecurityException e) {
throw e;
}
}
}
/**
 * Multithreaded HTTP demo posting to Druid via Tranquility Server.
 * @author DARUI LI
 * @version 1.0.0
 * @date 2018-11-22 16:55
 */
public class DruidThreadTest {
private static final int THREADNUM = 10; // number of worker threads
public static void main(String[] args) {
// start the worker threads
int threadmax = THREADNUM;
for (int i = 0; i < threadmax; i++) {
ThreadMode thread = new ThreadMode();
thread.getThread().start();
}
}
}
import java.util.Map;
import org.joda.time.DateTime;
import com.alibaba.fastjson.JSON;
import com.bitup.strom.uitl.HttpUtil;
import com.google.common.collect.ImmutableMap;
/**
 * Worker thread that posts events to Tranquility Server in a loop.
 * @author DARUI LI
 * @version 1.0.0
 * @date 2018-11-22 16:57
 */
public class ThreadMode {
public Thread getThread() {
Thread thread = new Thread(new Runnable() {
@Override
public void run() {
long start = System.currentTimeMillis();
for (int i = 0; i < 10; i++) {
System.out.print("\nout:" + i);
final Map<String, Object> obj = ImmutableMap.<String, Object>of("timestamp", new DateTime().toString(),"test5",i);
try {
String postJson = HttpUtil.postJson("http://192.168.162.136:8200/v1/post/tutorial-tranquility-server", JSON.toJSONString(obj));
System.err.println(postJson);
} catch (Exception e) {
e.printStackTrace();
}
}
long end = System.currentTimeMillis();
System.out.println("start time:" + start+ "; end time:" + end+ "; Run Time:" + (end - start) + "(ms)");
}
});
return thread;
}
}
2.2.5 Configuring Tranquility Kafka
Enable Tranquility Kafka: on the data node, edit conf/supervise/quickstart.conf and uncomment the Tranquility Kafka line.
[root@strom imply-2.7.12]# cd conf/supervise/
[root@strom supervise]# ls
data.conf master-no-zk.conf master-with-zk.conf query.conf quickstart.conf
[root@strom supervise]# vi quickstart.conf
:verify bin/verify-java
:verify bin/verify-default-ports
:verify bin/verify-version-check
:kill-timeout 10
!p10 zk bin/run-zk conf-quickstart
coordinator bin/run-druid coordinator conf-quickstart
broker bin/run-druid broker conf-quickstart
historical bin/run-druid historical conf-quickstart
!p80 overlord bin/run-druid overlord conf-quickstart
!p90 middleManager bin/run-druid middleManager conf-quickstart
imply-ui bin/run-imply-ui-quickstart conf-quickstart
# Uncomment to use Tranquility Server
#!p95 tranquility-server bin/tranquility server -configFile conf-quickstart/tranquility/server.json
# Uncomment to use Tranquility Kafka   (comment removed on the next line)
!p95 tranquility-kafka bin/tranquility kafka -configFile conf-quickstart/tranquility/kafka.json
# Uncomment to use Tranquility Clarity metrics server
#!p95 tranquility-metrics-server java -Xms2g -Xmx2g -cp "dist/tranquility/lib/*:dist/tranquility/conf" com.metamx.tranquility.distribution.DistributionMain server -configFile conf-quickstart/tranquility/server-for-metrics.yaml
:wq!
2.2.6 Detailed configuration references:
http://druid.io/docs/0.10.1/tutorials/tutorial-kafka.html
Configuration references:
General configuration: https://github.com/druid-io/tranquility/blob/master/docs/configuration.md
Common ingestion configuration: http://druid.io/docs/latest/ingestion/index.html
Tranquility Kafka: https://github.com/druid-io/tranquility/blob/master/docs/kafka.md
Querying Druid
1. Basic queries
Using Druid's query API
Druid's query interface is HTTP REST: you POST a JSON query to a queryable node (Broker, Historical, or Realtime), and every node type exposes the same REST endpoint:
curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -d @<query_json_file>
queryable_host is the Broker node's IP; port is the Broker's port, 8082 by default. For example:
curl -L -H'Content-Type: application/json' -XPOST --data-binary @quickstart/aa.json http://10.20.23.41:8082/druid/v2/?pretty
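The same POST can be issued from Java by reusing the HttpUtil class from section 2.2.4. A minimal sketch, assuming the wikipedia datasource imported during verification and a Broker at localhost:8082 (host, datasource, and interval are illustrative; adjust them to your setup):
public class DruidQueryTest {
    public static void main(String[] args) throws Exception {
        // A count-over-everything timeseries query against the wikipedia datasource.
        String query = "{"
                + "\"queryType\":\"timeseries\","
                + "\"dataSource\":\"wikipedia\","
                + "\"granularity\":\"all\","
                + "\"intervals\":\"2015-09-12/2015-09-13\","
                + "\"aggregations\":[{\"type\":\"count\",\"name\":\"count\"}]"
                + "}";
        String result = HttpUtil.postJson("http://localhost:8082/druid/v2/?pretty", query);
        System.out.println(result);
    }
}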
The query types are:
1. Timeseries
2. TopN
3. GroupBy
4. Time Boundary
5. Segment Metadata
6. Datasource Metadata
7. Search
8. Select
Timeseries, TopN, and GroupBy are aggregation queries; Time Boundary, Segment Metadata, and Datasource Metadata are metadata queries; Search is a search query.
1. Timeseries
For summary statistics over a period, or rollups at a specified time granularity, use a Timeseries query.
A timeseries query has the following fields:
Field | Description | Required |
---|---|---|
queryType | Query type; must be "timeseries" | Yes |
dataSource | The datasource to query | Yes |
descending | Whether to sort descending | No |
intervals | Time range to query, ISO-8601 format | Yes |
granularity | Time granularity used to bucket results | Yes |
filter | Filter conditions | No |
aggregations | Aggregations | Yes |
postAggregations | Post-aggregations | No |
context | Additional query parameters | No |
A timeseries query outputs, per time bucket, statistics for the rows matching the filter; aggregations and postAggregations define how values are combined.
Timeseries cannot output dimension values. granularity supports all, none, second, minute, hour, day, week, month, year, and so on.
all: a single rolled-up row. none: not recommended.
Any other value: one row of statistics per bucket at that granularity.
The query JSON:
{
"aggregations": [
{
"type": "count",
"name": "count"
}
],
"intervals": "1917-08-25T08:35:20+00:00/2017-08-25T08:35:20+00:00",
"dataSource": "app_auto_prem_qd_pp3",
"granularity": "all",
"postAggregations": [],
"queryType": "timeseries"
}
Equivalent SQL: select count(1) from app_auto_prem_qd_pp3
2. TopN
TopN returns an ordered top-n list over one dimension, ranked by a specified metric: the top N records by that metric.
{
"metric": "sum__total_standard_premium",
"aggregations": [
{
"type": "doubleSum",
"fieldName": "total_standard_premium",
"name": "sum__total_standard_premium"
}
],
"dimension": "is_new_car",
"intervals": "1917-08-29T20:05:10+00:00/2017-08-29T20:05:10+00:00",
"dataSource": "app_auto_prem_qd_pp3",
"granularity": "all",
"threshold": 50000,
"postAggregations": [],
"queryType": "topN"
}
Field | Description | Required |
---|---|---|
queryType | Must be "topN" | Yes |
dataSource | The datasource to query | Yes |
intervals | Time range to query, ISO-8601 format | Yes |
filter | Filter conditions | No |
aggregations | Aggregations | Yes |
postAggregations | Post-aggregations | No |
dimension | The dimension for the TopN; a TopN query takes exactly one dimension | Yes |
threshold | The N in TopN | Yes |
metric | The metric to aggregate and rank by | Yes |
context | Additional query parameters | No |
metric is specific to TopN and can be written in several ways:
"metric":"<metric_name>" (a bare string is equivalent to the numeric form and sorts descending)
"metric" : {
"type" : "numeric", // rank by a numeric metric, descending
"metric" : "<metric_name>"
}
"metric" : {
"type" : "inverted", // invert the underlying order, i.e. ascending
"metric" : "<metric_name>"
}
"metric" : {
"type" : "lexicographic", // order dimension values lexicographically
"metric" : "<metric_name>"
}
"metric" : {
"type" : "alphaNumeric", // order dimension values alpha-numerically
"metric" : "<metric_name>"
}
Note that TopN is an approximate algorithm: each segment returns its top 1000 entries, which are merged into the final result. If the dimension's cardinality is at most 1000 the result is exact; beyond 1000 it is an approximation.
3. GroupBy
groupBy is analogous to SQL's GROUP BY: it groups on multiple dimensions, can order the output and limit the number of rows, and supports having clauses.
{
"dimensions": [
"is_new_car",
"status"
],
"aggregations": [
{
"type": "doubleSum",
"fieldName": "total_standard_premium",
"name": "sum__total_standard_premium"
}
],
"having": {
"type": "greaterThan",
"aggregation": "sum__total_standard_premium",
"value": "484000"
},
"intervals": "1917-08-29T20:26:52+00:00/2017-08-29T20:26:52+00:00",
"limitSpec": {
"limit": 2,
"type": "default",
"columns": [
{
"direction": "descending",
"dimension": "sum__total_standard_premium"
}
]
},
"granularity": "all",
"postAggregations": [],
"queryType": "groupBy",
"dataSource": "app_auto_prem_qd_pp3"
}
Equivalent SQL: select is_new_car, status, sum(total_standard_premium) from app_auto_prem_qd_pp3 group by is_new_car, status having sum(total_standard_premium) > 484000 order by sum(total_standard_premium) desc limit 2. The result:
{
"version" : "v1",
"timestamp" : "1917-08-30T04:26:52.000+08:00",
"event" : {
"sum__total_standard_premium" : 8.726074368E9,
"is_new_car" : "是",
"status" : null
}
}, {
"version" : "v1",
"timestamp" : "1917-08-30T04:26:52.000+08:00",
"event" : {
"sum__total_standard_premium" : 615152.0,
"is_new_car" : "否",
"status" : null
}
}
Field | Description | Required |
---|---|---|
queryType | Must be "groupBy" | Yes |
dataSource | The datasource to query | Yes |
dimensions | The set of dimensions to group by | Yes |
limitSpec | Ordering and row limit for the results | No |
having | Filter on the aggregated results | No |
granularity | Time granularity used to bucket results | Yes |
postAggregations | Post-aggregations | No |
intervals | Time range to query, ISO-8601 format | Yes |
context | Additional query parameters | No |
The GroupBy-specific fields are limitSpec and having.
limitSpec specifies the ordering rules and the maximum number of rows:
{
"type" : "default",
"limit":<integer_value>,
"columns":[list of OrderByColumnSpec]
}
columns is an array and may list several sort keys; each key may be a dimension or a metric, written as:
{
"dimension" :"<Any dimension or metric name>",
"direction" : <"ascending"|"descending">
}
"limitSpec": {
"limit": 2,
"type": "default",
"columns": [
{
"direction": "descending",
"dimension": "sum__total_standard_premium"
},
{
"direction": "ascending",
"dimension": "is_new_car"
}
]
}
having works like SQL's HAVING clause.
4. Select
Select is analogous to SQL's SELECT: it reads the stored rows from Druid, returning the chosen dimensions and metrics for a given filter and time range. It can control ordering via the descending field and supports paged reads, but it does not support aggregations or postAggregations.
An example query:
{
"dimensions": [
"status",
"is_new_car"
],
"pagingSpec":{
"pagingIdentifiers":{},
"threshold":3
},
"intervals": "1917-08-25T08:35:20+00:00/2017-08-25T08:35:20+00:00",
"dataSource": "app_auto_prem_qd_pp3",
"granularity": "all",
"context" : {
"skipEmptyBuckets" : "true"
},
"queryType": "select"
}
This is equivalent to the SQL select status, is_new_car from app_auto_prem_qd_pp3 limit 3 and returns:
[ {
"timestamp" : "2017-08-22T14:00:00.000Z",
"result" : {
"pagingIdentifiers" : {
"app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00" : 2
},
"dimensions" : [ "is_new_car", "status" ],
"metrics" : [ "total_actual_premium", "count", "total_standard_premium" ],
"events" : [ {
"segmentId" : "app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00",
"offset" : 0,
"event" : {
"timestamp" : "2017-08-22T22:00:00.000+08:00",
"status" : null,
"is_new_car" : "是",
"total_actual_premium" : 1012.5399780273438,
"count" : 1,
"total_standard_premium" : 1250.050048828125
}
}, {
"segmentId" : "app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00",
"offset" : 1,
"event" : {
"timestamp" : "2017-08-22T22:00:00.000+08:00",
"status" : null,
"is_new_car" : "是",
"total_actual_premium" : 708.780029296875,
"count" : 1,
"total_standard_premium" : 1250.050048828125
}
}, {
"segmentId" : "app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00",
"offset" : 2,
"event" : {
"timestamp" : "2017-08-22T22:00:00.000+08:00",
"status" : null,
"is_new_car" : "是",
"total_actual_premium" : 1165.489990234375,
"count" : 1,
"total_standard_premium" : 1692.800048828125
}
} ]
}
} ]
pagingSpec sets the paging offsets and the page size; each response returns the pagingIdentifiers to pass on the next pull. Setting fromNext makes Druid compute the next page's offsets for you:
"pagingSpec":{
"pagingIdentifiers":{},
"threshold":3,
"fromNext" :true
}
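Putting the paging flow into Java, a pull loop might look like the sketch below (reusing HttpUtil and fastjson, both already used in section 2.2.4; the Broker address is illustrative). With fromNext set, each response's pagingIdentifiers are passed back verbatim on the next request:
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
public class SelectPagingTest {
    public static void main(String[] args) throws Exception {
        JSONObject paging = new JSONObject(); // empty pagingIdentifiers on the first page
        for (int page = 0; page < 3; page++) {
            JSONObject pagingSpec = new JSONObject();
            pagingSpec.put("pagingIdentifiers", paging);
            pagingSpec.put("threshold", 3);
            pagingSpec.put("fromNext", true); // returned offsets point at the next page
            JSONObject query = new JSONObject();
            query.put("queryType", "select");
            query.put("dataSource", "app_auto_prem_qd_pp3");
            query.put("granularity", "all");
            query.put("intervals", "1917-08-25T08:35:20+00:00/2017-08-25T08:35:20+00:00");
            query.put("pagingSpec", pagingSpec);
            String resp = HttpUtil.postJson("http://localhost:8082/druid/v2/?pretty", query.toJSONString());
            JSONArray results = JSON.parseArray(resp);
            if (results.isEmpty()) break; // no more rows in the interval
            JSONObject result = results.getJSONObject(0).getJSONObject("result");
            System.out.println(result.getJSONArray("events"));
            paging = result.getJSONObject("pagingIdentifiers"); // feed into the next request
        }
    }
}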
5. Search
A Search query returns the dimension values that match a pattern, much like a TopN over dimension values, but it supports more matching operators.
Example JSON:
{
"queryType": "search",
"dataSource": "app_auto_prem_qd_pp3",
"granularity": "all",
"limit": 2,
"searchDimensions": [
"data_source",
"department_code"
],
"query": {
"type": "insensitive_contains",
"value": "1"
},
"sort" : {
"type": "lexicographic"
},
"intervals": [
"1917-08-25T08:35:20+00:00/2017-08-25T08:35:20+00:00"
]
}
searchDimensions lists the dimensions to search over. The fields are:
Field | Description | Required |
---|---|---|
queryType | Must be "search" | Yes |
dataSource | The datasource to query | Yes |
searchDimensions | Dimensions to run the search over | Yes |
limit | Cap on the number of results | No (default 1000) |
granularity | Time granularity used to bucket results | Yes |
intervals | Time range to query, ISO-8601 format | Yes |
sort | Ordering of the search results | No |
query | The match operation | Yes |
context | Additional query parameters | No |
filter | Filter conditions | No |
Note that Search only returns the matching dimension values and supports no other aggregation. To use a search condition in a TopN, GroupBy, or Timeseries query, express it in that query's filter field; filters also support regular-expression matching.
The result looks like:
[ {
"timestamp" : "2017-08-22T08:00:00.000+08:00",
"result" : [ {
"dimension" : "data_source",
"value" : "226931204023",
"count" : 2
}, {
"dimension" : "data_source",
"value" : "226931204055",
"count" : 7
} ]
} ]
Choosing a query type
1. Where possible, prefer Timeseries and TopN over GroupBy. GroupBy is the most flexible query but also the worst performing: Timeseries is faster than GroupBy whenever no dimension grouping is needed, and for grouping and sorting on a single dimension, TopN is far better optimized than GroupBy.
groupBy, for multi-dimension grouping:
{
"queryType": "groupBy",
"dataSource": "bitup",
"dimensions": ["sample_time"],
"granularity": "all",
"filter": {
"type": "and",
"fields": [
{
"type": "selector",
"dimension": "symbol",
"value": "xbtusd"
}
]
},
"intervals": [
"2018-11-28T03:40:00/2018-11-28T03:54:30"
],
"limitSpec": {
"columns": [
{
"dimension": "sample_time",
"direction": "descending",
"dimensionOrder": "numeric"
}
],
"limit": 36000,
"type": "default"
}
}
topN, for single-dimension grouping:
{
"queryType": "topN",
"dataSource": "bitup",
"dimension": "sample_time",
"threshold": 36000,
"metric": "count",
"granularity": "all",
"aggregations": [
{
"type": "count",
"name": "count"
}
],
"intervals": [
"2018-11-28T03:40:00/2018-11-28T03:54:30"
]
}
Scan is similar to Select but does not support paging; if you have no paging requirement, prefer Scan, which performs better than Select:
{
"queryType": "scan",
"dataSource": "bitup",
"resultFormat": "list",
"columns":["symbol","kline_type","sample_time","open","close","high","low","vol","coin_vol","vwap"],
"intervals": [
"2018-11-28T03:40:00/2018-11-28T03:54:30"
],
"batchSize":20480,
"limit":36000
}