Big Data: Deploying Druid, with a Push Ingestion Example

Single-Node Druid Deployment

Many articles already introduce Druid and real-time big-data analytics, so I will not repeat that here. This post focuses on setting up a Druid environment. Imply ships a complete distribution: the dependencies, Druid itself, a graphical data-visualization UI, and a SQL query component. It also covers configuring Tranquility Server for push ingestion.

1. Prerequisites

  1. Java 8: https://download.oracle.com/otn-pub/java/jdk/8u191-b12/2787e4a523244c269598db4e85c51e0c/jdk-8u191-linux-x64.tar.gz
  2. Node.js 4.5.x or later
  3. Linux or Mac OS X (Windows is not supported)
  4. At least 4 GB of RAM

2. Installing Java 8

  1. Create a Java directory: mkdir /usr/local/java
  2. Extract the JDK: tar -zxvf jdk-8u191-linux-x64.tar.gz
  3. Configure the environment variables:
# JAVA_HOME
export JAVA_HOME=/usr/local/java/jdk1.8.0_191
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
  4. Reload the profile so the variables take effect: source /etc/profile
  5. Verify the JDK: java -version

3. Installing Node.js

1. Download the build that matches your system from the official site:

English: https://nodejs.org/en/download/

Chinese: http://nodejs.cn/download/

2. Upload the downloaded tar file to the server, extract it, and make it global with symlinks:

1) It can go under any path you like; mine is at /usr/local/software

2) Extract it: tar -xvf node-v10.13.0-linux-x64.tar.xz

3) Create the symlinks to make it global:

  • ln -s /usr/local/software/node-v10.13.0-linux-x64/bin/npm /usr/local/bin/
  • ln -s /usr/local/software/node-v10.13.0-linux-x64/bin/node /usr/local/bin/

4) Finally, verify that Node.js is now global: node -v. If it prints a version, the installation succeeded.

4. Downloading and Installing Imply

  1. Download the latest release from https://imply.io/get-started
  2. tar -zxvf imply-2.7.12.tar.gz
  3. cd imply-2.7.12
  4. Start the services: nohup bin/supervise -c conf/supervise/quickstart.conf > quickstart.log &
  5. If startup fails with a perl-related error, install perl first; on CentOS 7 run: yum install perl
  6. Start again and it should come up cleanly.

Verifying the Installation

**Loading the test data:** the package ships with some sample data, which can be imported by running a predefined ingestion spec.

# Load the data: run the following from inside imply-2.7.12
[root@strom imply-2.7.12]# bin/post-index-task --file quickstart/wikipedia-index.json 
Beginning indexing data for wikipedia
Task started: index_wikipedia_2018-11-22T07:39:13.068Z
Task log:     http://localhost:8090/druid/indexer/v1/task/index_wikipedia_2018-11-22T07:39:13.068Z/log
Task status:  http://localhost:8090/druid/indexer/v1/task/index_wikipedia_2018-11-22T07:39:13.068Z/status
Task index_wikipedia_2018-11-22T07:39:13.068Z still running...
...
Task finished with status: SUCCESS
Completed indexing data for wikipedia. Now loading indexed data onto the cluster...
wikipedia is 0.0% finished loading...
...
wikipedia loading complete! You may now query your data
[root@strom imply-2.7.12]# 

5. The Visual Console
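With the services up, the graphical console (data visualization plus SQL) is served by the imply-ui process started above. In the quickstart configuration it should be reachable in a browser at the default Imply UI port, as far as I recall:

http://localhost:9095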

6. Druid Data Ingestion

1. Real-time ingestion comes in two flavors: Pull and Push

  • Pull: requires starting a Realtime Node, which ingests different kinds of sources through different Firehoses.
  • Push: requires starting Tranquility or the Kafka indexing service; data is pushed in via HTTP calls.

2. Real-time ingestion in detail

2.1 Pull

Because the Realtime Node offers neither high availability nor scalability, Tranquility Server or Tranquility Kafka (backed by the indexing service) is recommended for important scenarios.

2.2 Push

Data ingestion through Tranquility comes in two forms:
Tranquility Server: the sender pushes data to Druid through the HTTP interface exposed by Tranquility Server.
Tranquility Kafka: the sender first writes data to Kafka; Tranquility Kafka then pulls it from Kafka according to its configuration and writes it into Druid.

2.2.1 Configuring Tranquility Server

To enable Tranquility Server, edit conf/supervise/quickstart.conf on the data node and uncomment the Tranquility Server line:

[root@strom imply-2.7.12]# cd conf/supervise/
[root@strom supervise]# ls
data.conf  master-no-zk.conf  master-with-zk.conf  query.conf  quickstart.conf
[root@strom supervise]# vi quickstart.conf 

:verify bin/verify-java
:verify bin/verify-default-ports
:verify bin/verify-version-check
:kill-timeout 10

!p10 zk bin/run-zk conf-quickstart
coordinator bin/run-druid coordinator conf-quickstart
broker bin/run-druid broker conf-quickstart
historical bin/run-druid historical conf-quickstart
!p80 overlord bin/run-druid overlord conf-quickstart
!p90 middleManager bin/run-druid middleManager conf-quickstart
imply-ui bin/run-imply-ui-quickstart conf-quickstart

# Uncomment to use Tranquility Server  (this is the line we uncommented)
!p95 tranquility-server bin/tranquility server -configFile conf-quickstart/tranquility/server.json

# Uncomment to use Tranquility Kafka
#!p95 tranquility-kafka bin/tranquility kafka -configFile conf-quickstart/tranquility/kafka.json

# Uncomment to use Tranquility Clarity metrics server
#!p95 tranquility-metrics-server java -Xms2g -Xmx2g -cp "dist/tranquility/lib/*:dist/tranquility/conf" com.metamx.tranquility.distribution.DistributionMain server -configFile conf-quickstart/tranquility/server-for-metrics.yaml
:wq!

2.2.2 Reviewing conf-quickstart/tranquility/server.json

{
  "dataSources" : [
    {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "tutorial-tranquility-server",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "timestampSpec" : {
                "column" : "timestamp",
                "format" : "auto"
              },
              "dimensionsSpec" : {
                "dimensions" : [],
                "dimensionExclusions" : [
                  "timestamp",
                  "value"
                ]
              },
              "format" : "json"
            }
          },
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "hour",
            "queryGranularity" : "none"
          },
          "metricsSpec" : [
            {
              "type" : "count",
              "name" : "count"
            },
            {
              "name" : "value_sum",
              "type" : "doubleSum",
              "fieldName" : "value"
            },
            {
              "fieldName" : "value",
              "name" : "value_min",
              "type" : "doubleMin"
            },
            {
              "type" : "doubleMax",
              "name" : "value_max",
              "fieldName" : "value"
            }
          ]
        },
        "ioConfig" : {
          "type" : "realtime"
        },
        "tuningConfig" : {
          "type" : "realtime",
          "maxRowsInMemory" : "50000",
          "intermediatePersistPeriod" : "PT10M",
          "windowPeriod" : "PT10M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1"
      }
    }
  ],
  "properties" : {
    "zookeeper.connect" : "localhost",
    "druid.discovery.curator.path" : "/druid/discovery",
    "druid.selectors.indexing.serviceName" : "druid/overlord",
    "http.port" : "8200",
    "http.threads" : "40",
    "serialization.format" : "smile",
    "druidBeam.taskLocator": "overlord"
  }
}
  • "dataSource" : "tutorial-tranquility-server" can be changed to whatever dataSource name you need.
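Given this schema, an event pushed to this dataSource only needs the timestamp column plus the value field that feeds the sum/min/max metrics; a minimal sketch (values made up):

{
  "timestamp": "2018-11-22T08:00:00.000Z",
  "value": 12.5
}

Because dimensions is left empty with timestamp and value excluded, any extra fields in the event are picked up as dimensions automatically.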

2.2.3 Restarting the services (shut down the previously started processes first)

[root@strom imply-2.7.12]# bin/service --down
[root@strom imply-2.7.12]# nohup bin/supervise -c conf/supervise/quickstart.conf > quickstart.log &

If you see output like the following, the start-up succeeded:

[root@strom imply-2.7.12]# tail -f quickstart.log 
[Thu Nov 22 16:05:20 2018] Running command[zk], logging to[/usr/local/druid/imply-2.7.12/var/sv/zk.log]: bin/run-zk conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[coordinator], logging to[/usr/local/druid/imply-2.7.12/var/sv/coordinator.log]: bin/run-druid coordinator conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[broker], logging to[/usr/local/druid/imply-2.7.12/var/sv/broker.log]: bin/run-druid broker conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[historical], logging to[/usr/local/druid/imply-2.7.12/var/sv/historical.log]: bin/run-druid historical conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[overlord], logging to[/usr/local/druid/imply-2.7.12/var/sv/overlord.log]: bin/run-druid overlord conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[middleManager], logging to[/usr/local/druid/imply-2.7.12/var/sv/middleManager.log]: bin/run-druid middleManager conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[imply-ui], logging to[/usr/local/druid/imply-2.7.12/var/sv/imply-ui.log]: bin/run-imply-ui-quickstart conf-quickstart
[Thu Nov 22 16:05:20 2018] Running command[tranquility-server], logging to[/usr/local/druid/imply-2.7.12/var/sv/tranquility-server.log]: bin/tranquility server -configFile conf-quickstart/tranquility/server.json
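Before writing any code, the push endpoint can be smoke-tested with curl. The port comes from http.port (8200) in server.json and the path ends with the dataSource name; replace the timestamp with the current time, since events falling outside windowPeriod are dropped:

curl -XPOST -H 'Content-Type: application/json' --data '{"timestamp":"2018-11-22T08:00:00.000Z","value":12.5}' http://localhost:8200/v1/post/tutorial-tranquility-server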

2.2.4 Writing a Test Client

// HttpUtil.java: an HTTP utility class used by the test client
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.security.GeneralSecurityException;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLSession;
import javax.net.ssl.SSLSocket;

import org.apache.commons.io.IOUtils;
import org.apache.commons.lang.StringUtils;
import org.apache.http.Consts;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.HttpClient;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.config.RequestConfig.Builder;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.conn.ConnectTimeoutException;
import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
import org.apache.http.conn.ssl.SSLContextBuilder;
import org.apache.http.conn.ssl.TrustStrategy;
import org.apache.http.conn.ssl.X509HostnameVerifier;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.apache.http.message.BasicNameValuePair;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * HTTP utility.
 * @Description:
 * @author: DARUI LI
 * @version: 1.0.0
 * @Date: 2018-11-22 16:52:31
 */
public class HttpUtil {

    public static final int connTimeout = 5000;
    public static final int readTimeout = 5000;
    public static final String charset = "UTF-8";
    private static HttpClient client = null;

    private static Logger logger = LoggerFactory.getLogger(HttpUtil.class);

    static {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(128);
        cm.setDefaultMaxPerRoute(128);
        client = HttpClients.custom().setConnectionManager(cm).build();
    }

    public static String postJson(String url, String json) throws Exception {
        return post(url, json, "application/json", charset, connTimeout, readTimeout);
    }

    public static String postParameters(String url, String parameterStr) throws Exception {
        return post(url, parameterStr, "application/x-www-form-urlencoded", charset, connTimeout, readTimeout);
    }

    public static String postParameters(String url, String parameterStr, String charset, Integer connTimeout, Integer readTimeout) throws Exception {
        return post(url, parameterStr, "application/x-www-form-urlencoded", charset, connTimeout, readTimeout);
    }

    public static String postParameters(String url, Map<String, String> params) throws
            Exception {
        return postForm(url, params, null, connTimeout, readTimeout);
    }

    public static String postParameters(String url, Map<String, String> params, Integer connTimeout, Integer readTimeout) throws
            Exception {
        return postForm(url, params, null, connTimeout, readTimeout);
    }

    public static String get(String url) throws Exception {
        return get(url, charset, null, null);
    }

    public static String get(String url, String charset) throws Exception {
        return get(url, charset, connTimeout, readTimeout);
    }

    /**
     * Send a POST request using the specified charset.
     *
     * @param url
     * @param body        the request body
     * @param mimeType    e.g. application/xml, or "application/x-www-form-urlencoded" with a=1&b=2&c=3
     * @param charset     the encoding
     * @param connTimeout connect timeout, in milliseconds
     * @param readTimeout response timeout, in milliseconds
     * @return the response body, decoded with the specified charset
     * @throws ConnectTimeoutException if the connection times out
     * @throws SocketTimeoutException  if the response times out
     * @throws Exception
     */
    public static String post(String url, String body, String mimeType, String charset, Integer connTimeout, Integer readTimeout)
            throws Exception {
        long startTime = System.currentTimeMillis();
        HttpClient client = null;
        HttpPost post = new HttpPost(url);
        String result = "";
        try {
            if (StringUtils.isNotBlank(body)) {
                HttpEntity entity = new StringEntity(body, ContentType.create(mimeType));
//                HttpEntity entity = new StringEntity(body, ContentType.create(mimeType, charset));
                post.setEntity(entity);
            }
            // Configure timeouts
            Builder customReqConf = RequestConfig.custom();
            if (connTimeout != null) {
                customReqConf.setConnectTimeout(connTimeout);
            }
            if (readTimeout != null) {
                customReqConf.setSocketTimeout(readTimeout);
            }
            post.setConfig(customReqConf.build());

            HttpResponse res;
            if (url.startsWith("https")) {
                client = createSSLInsecureClient();
                res = client.execute(post);
            } else {
                client = HttpUtil.client;
                res = client.execute(post);
            }
            result = IOUtils.toString(res.getEntity().getContent(), charset);

            long endTime = System.currentTimeMillis();
            logger.info("HttpClient Method:[post] ,URL[" + url + "] ,Time:[" + (endTime - startTime) + "ms] ,result:" + result);
        } finally {
            post.releaseConnection();
            if (url.startsWith("https") && client != null && client instanceof CloseableHttpClient) {
                ((CloseableHttpClient) client).close();
            }
        }
        return result;
    }


    /**
     * Submit a form POST.
     *
     * @param url
     * @param params
     * @param headers
     * @param connTimeout connect timeout, in milliseconds
     * @param readTimeout response timeout, in milliseconds
     * @return
     * @throws ConnectTimeoutException
     * @throws SocketTimeoutException
     * @throws Exception
     */
    public static String postForm(String url, Map<String, String> params, Map<String, String> headers, Integer connTimeout, Integer readTimeout) throws
            Exception {
        long startTime = System.currentTimeMillis();
        HttpClient client = null;
        HttpPost post = new HttpPost(url);
        String result = "";
        try {
            if (params != null && !params.isEmpty()) {
                List<NameValuePair> formParams = new ArrayList<NameValuePair>();
                Set<Entry<String, String>> entrySet = params.entrySet();
                for (Entry<String, String> entry : entrySet) {
                    formParams.add(new BasicNameValuePair(entry.getKey(), entry.getValue()));
                }
                UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formParams, Consts.UTF_8);
                post.setEntity(entity);
            }

            if (headers != null && !headers.isEmpty()) {
                for (Entry<String, String> entry : headers.entrySet()) {
                    post.addHeader(entry.getKey(), entry.getValue());
                }
            }
            // Configure timeouts
            Builder customReqConf = RequestConfig.custom();
            if (connTimeout != null) {
                customReqConf.setConnectTimeout(connTimeout);
            }
            if (readTimeout != null) {
                customReqConf.setSocketTimeout(readTimeout);
            }
            post.setConfig(customReqConf.build());
            HttpResponse res = null;
            if (url.startsWith("https")) {
                // Execute the HTTPS request.
                client = createSSLInsecureClient();
                res = client.execute(post);
            } else {
                // Execute the plain HTTP request.
                client = HttpUtil.client;
                res = client.execute(post);
            }
            result = IOUtils.toString(res.getEntity().getContent(), charset);

            long endTime = System.currentTimeMillis();
            logger.info("HttpClient Method:[postForm] ,URL[" + url + "] ,Time:[" + (endTime - startTime) + "ms] ,result:" + result);
        } finally {
            post.releaseConnection();
            if (url.startsWith("https") && client != null && client instanceof CloseableHttpClient) {
                ((CloseableHttpClient) client).close();
            }
        }
        return result;
    }


    /**
     * Send a GET request.
     *
     * @param url
     * @param charset
     * @param connTimeout connect timeout, in milliseconds
     * @param readTimeout response timeout, in milliseconds
     * @return
     * @throws ConnectTimeoutException if the connection times out
     * @throws SocketTimeoutException  if the response times out
     * @throws Exception
     */
    public static String get(String url, String charset, Integer connTimeout, Integer readTimeout)
            throws Exception {
        long startTime = System.currentTimeMillis();
        HttpClient client = null;
        HttpGet get = new HttpGet(url);
        String result = "";
        try {
            // Configure timeouts
            Builder customReqConf = RequestConfig.custom();
            if (connTimeout != null) {
                customReqConf.setConnectTimeout(connTimeout);
            }
            if (readTimeout != null) {
                customReqConf.setSocketTimeout(readTimeout);
            }
            get.setConfig(customReqConf.build());

            HttpResponse res = null;

            if (url.startsWith("https")) {
                // Execute the HTTPS request.
                client = createSSLInsecureClient();
                res = client.execute(get);
            } else {
                // Execute the plain HTTP request.
                client = HttpUtil.client;
                res = client.execute(get);
            }
            result = IOUtils.toString(res.getEntity().getContent(), charset);

            long endTime = System.currentTimeMillis();
            logger.info("HttpClient Method:[postForm] ,URL[" + url + "] ,Time:[ " + (endTime - startTime) + "ms ] ,result:" + result);
        } finally {
            get.releaseConnection();
            if (url.startsWith("https") && client != null && client instanceof CloseableHttpClient) {
                ((CloseableHttpClient) client).close();
            }
        }
        return result;
    }


    /**
     * Extract the charset from the response's Content-Type header.
     *
     * @param response
     * @return the charset, or null if none is declared
     */
    @SuppressWarnings("unused")
    private static String getCharsetFromResponse(HttpResponse response) {
        if (response.getEntity() != null && response.getEntity().getContentType() != null && response.getEntity().getContentType().getValue() != null) {
            String contentType = response.getEntity().getContentType().getValue();
            if (contentType.contains("charset=")) {
                return contentType.substring(contentType.indexOf("charset=") + 8);
            }
        }
        return null;
    }


    /**
     * Create an HTTPS client that trusts all certificates and host names
     * (insecure; only suitable for testing).
     *
     * @return
     * @throws GeneralSecurityException
     */
    private static CloseableHttpClient createSSLInsecureClient() throws GeneralSecurityException {
        try {
            SSLContext sslContext = new SSLContextBuilder().loadTrustMaterial(null, new TrustStrategy() {
                public boolean isTrusted(X509Certificate[] chain, String authType) throws CertificateException {
                    return true;
                }
            }).build();

            SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(sslContext, new X509HostnameVerifier() {

                public boolean verify(String arg0, SSLSession arg1) {
                    return true;
                }

                public void verify(String host, SSLSocket ssl) throws IOException {
                }

                public void verify(String host, X509Certificate cert) throws SSLException {
                }

                public void verify(String host, String[] cns, String[] subjectAlts) throws SSLException {
                }

            });
            return HttpClients.custom().setSSLSocketFactory(sslsf).build();
        } catch (GeneralSecurityException e) {
            throw e;
        }
    }
}
/**
 * Multi-threaded demo that POSTs events to Druid's Tranquility Server.
 * @Description:
 * @author: DARUI LI
 * @version: 1.0.0
 * @Date: 2018-11-22 16:55:33
 */
public class DruidThreadTest {
	private static final int THREADNUM = 10; // number of threads

	public static void main(String[] args) {
		// spawn THREADNUM worker threads
		int threadmax = THREADNUM;
		for (int i = 0; i < threadmax; i++) {
			ThreadMode thread = new ThreadMode();
			thread.getThread().start();
		}
	}
}
import java.util.Map;

import org.joda.time.DateTime;

import com.alibaba.fastjson.JSON;
import com.bitup.strom.uitl.HttpUtil;
import com.google.common.collect.ImmutableMap;

/**
 * Worker thread: each instance POSTs a batch of events.
 * @Description:
 * @author: DARUI LI
 * @version: 1.0.0
 * @Date: 2018-11-22 16:57:49
 */
public class ThreadMode {
	public Thread getThread() {
		Thread thread = new Thread(new Runnable() {
			@Override
			public void run() {
				long start = System.currentTimeMillis();
				for (int i = 0; i < 10; i++) {
					System.out.print("\nout:" + i);
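					// NOTE: "test5" is absorbed as a dimension by the schemaless dimensionsSpec;
					// add a "value" field here if you want the value_* metrics to be non-zero.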
					final Map<String, Object> obj = ImmutableMap.<String, Object>of("timestamp", new DateTime().toString(),"test5",i);
					try {
						String postJson = HttpUtil.postJson("http://192.168.162.136:8200/v1/post/tutorial-tranquility-server", JSON.toJSONString(obj));
						 System.err.println(postJson);
					} catch (Exception e) {
						e.printStackTrace();
					}
				}
				long end = System.currentTimeMillis();  
				System.out.println("start time:" + start+ "; end time:" + end+ "; Run Time:" + (end - start) + "(ms)");
			}
		});
		return thread;
	}
}
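If the push succeeds, Tranquility Server answers each POST with a small JSON acknowledgement; as far as I recall it has the following shape, where received counts the events accepted and sent the events actually handed to Druid (events outside windowPeriod are received but not sent):

{"result":{"received":10,"sent":10}}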

2.2.5 Configuring Tranquility Kafka

To enable Tranquility Kafka, edit conf/supervise/quickstart.conf on the data node and uncomment the Tranquility Kafka line:

[root@strom imply-2.7.12]# cd conf/supervise/
[root@strom supervise]# ls
data.conf  master-no-zk.conf  master-with-zk.conf  query.conf  quickstart.conf
[root@strom supervise]# vi quickstart.conf 

:verify bin/verify-java
:verify bin/verify-default-ports
:verify bin/verify-version-check
:kill-timeout 10

!p10 zk bin/run-zk conf-quickstart
coordinator bin/run-druid coordinator conf-quickstart
broker bin/run-druid broker conf-quickstart
historical bin/run-druid historical conf-quickstart
!p80 overlord bin/run-druid overlord conf-quickstart
!p90 middleManager bin/run-druid middleManager conf-quickstart
imply-ui bin/run-imply-ui-quickstart conf-quickstart

# Uncomment to use Tranquility Server  
#!p95 tranquility-server bin/tranquility server -configFile conf-quickstart/tranquility/server.json

# Uncomment to use Tranquility Kafka  (this is the line we uncommented)
!p95 tranquility-kafka bin/tranquility kafka -configFile conf-quickstart/tranquility/kafka.json

# Uncomment to use Tranquility Clarity metrics server
#!p95 tranquility-metrics-server java -Xms2g -Xmx2g -cp "dist/tranquility/lib/*:dist/tranquility/conf" com.metamx.tranquility.distribution.DistributionMain server -configFile conf-quickstart/tranquility/server-for-metrics.yaml
:wq!

2.2.6 Detailed configuration references:

http://druid.io/docs/0.10.1/tutorials/tutorial-kafka.html

Configuration references:

General configuration: https://github.com/druid-io/tranquility/blob/master/docs/configuration.md
General ingestion configuration: http://druid.io/docs/latest/ingestion/index.html
Tranquility Kafka: https://github.com/druid-io/tranquility/blob/master/docs/kafka.md

Querying Data in Druid

1. Basic queries over the REST API

Using the Druid query interface
Druid's query interface is HTTP REST style: you query a (Broker, Historical, or Realtime) node by POSTing JSON query parameters to it, and every node type exposes the same REST query interface.

curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -d @<query_json_file>
queryable_host: the Broker node IP; port: the Broker node port, 8082 by default
curl -L -H'Content-Type: application/json' -XPOST --data-binary @quickstart/aa.json http://10.20.23.41:8082/druid/v2/?pretty

The query types are:

1. Timeseries
2. TopN
3. GroupBy
4. Time Boundary
5. Segment Metadata
6. Datasource Metadata
7. Search
8. Select

Among these, Timeseries, TopN, and GroupBy are aggregation queries; Time Boundary, Segment Metadata, and Datasource Metadata are metadata queries; Search is a search query.
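The metadata queries are not covered further below; for reference, the smallest of them, Time Boundary, returns the earliest and latest data timestamps of a datasource and needs nothing beyond the query type and the datasource name (reused here from the examples below):

{
  "queryType": "timeBoundary",
  "dataSource": "app_auto_prem_qd_pp3"
}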
1. Timeseries

For summary statistics over a whole time span, or per-bucket summaries at a given time granularity, Druid provides the Timeseries query. A timeseries query contains the following fields:

  • queryType: the query type; must be "timeseries"
  • dataSource: the datasource to query
  • descending: whether to return buckets in descending time order
  • intervals: the time range to query, ISO-8601 format
  • granularity: the time granularity the results are aggregated to
  • filter: filter conditions
  • aggregations: aggregators
  • postAggregations: post-aggregators
  • context: extra query parameters

A timeseries outputs, per time bucket, the statistics for the rows matching the given conditions: filtering is specified via filter, aggregation via aggregations and postAggregations. Timeseries cannot output dimension values. granularity supports all, none, second, minute, hour, day, week, month, year, and so on:

all: a single summary row; none: not recommended

anything else: one summary per bucket of that granularity

Example query JSON:

{
  "aggregations": [
    {
      "type": "count", 
      "name": "count"
    }
  ], 
  "intervals": "1917-08-25T08:35:20+00:00/2017-08-25T08:35:20+00:00", 
  "dataSource": "app_auto_prem_qd_pp3", 
  "granularity": "all", 
  "postAggregations": [], 
  "queryType": "timeseries"
}

This is equivalent to the SQL: select count(1) from app_auto_prem_qd_pp3
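None of the samples in this post exercise filter together with a non-all granularity on a timeseries; here is a sketch of both, reusing the is_new_car dimension from the later examples (the value "是" is assumed present in the data):

{
  "queryType": "timeseries",
  "dataSource": "app_auto_prem_qd_pp3",
  "granularity": "day",
  "filter": {
    "type": "selector",
    "dimension": "is_new_car",
    "value": "是"
  },
  "aggregations": [
    {
      "type": "count",
      "name": "count"
    }
  ],
  "intervals": "2017-08-01T00:00:00+00:00/2017-08-25T00:00:00+00:00"
}

Roughly: a per-day count of the rows where is_new_car = '是'.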
2. TopN

TopN returns an ordered top-n sequence of values of one dimension, ranked by a sort field: it returns the top N records, using the specified metric as the ranking key.

{
  "metric": "sum__total_standard_premium", 
  "aggregations": [
    {
      "type": "doubleSum", 
      "fieldName": "total_standard_premium", 
      "name": "sum__total_standard_premium"
    }
  ], 
  "dimension": "is_new_car", 
  "intervals": "1917-08-29T20:05:10+00:00/2017-08-29T20:05:10+00:00", 
  "dataSource": "app_auto_prem_qd_pp3", 
  "granularity": "all", 
  "threshold": 50000, 
  "postAggregations": [], 
  "queryType": "topN"
}
The TopN query fields are:

  • queryType: must be "topN"
  • dataSource: the datasource to query
  • intervals: the time range to query, ISO-8601 format
  • filter: filter conditions
  • aggregations: aggregators
  • postAggregations: post-aggregators
  • dimension: the dimension to run TopN over; a TopN query takes exactly one dimension
  • threshold: the N in TopN
  • metric: the metric to aggregate and rank by
  • context: extra query parameters

metric is specific to TopN. It can be written in several ways:

"metric": "<metric_name>" is shorthand for the numeric form below, i.e. descending by the metric value

"metric" : {
    "type" : "numeric", // rank by a numeric metric, descending
    "metric" : "<metric_name>"
}

"metric" : {
    "type" : "inverted", // invert the ordering, i.e. ascending
    "metric" : "<metric_name>"
}
"metric" : {
    "type" : "lexicographic", // order lexicographically
    "metric" : "<metric_name>"
}
"metric" : {
    "type" : "alphaNumeric", // order alpha-numerically
    "metric" : "<metric_name>"
}

Note that TopN is an approximate algorithm: each segment returns its top 1000 entries, which are merged into the final result. If the dimension's cardinality is at most 1000 the result is exact; beyond 1000 it is an approximation.

3. groupBy

groupBy is analogous to SQL's GROUP BY: it can group on multiple specified dimensions, supports ordering on specified dimensions with a limit on output rows, and supports having operations.

{
  "dimensions": [
    "is_new_car", 
    "status"
  ], 
  "aggregations": [
    {
      "type": "doubleSum", 
      "fieldName": "total_standard_premium", 
      "name": "sum__total_standard_premium"
    }
  ], 
  "having": {
    "type": "greaterThan", 
    "aggregation": "sum__total_standard_premium", 
    "value": "484000"
  }, 
  "intervals": "1917-08-29T20:26:52+00:00/2017-08-29T20:26:52+00:00", 
  "limitSpec": {
    "limit": 2, 
    "type": "default", 
    "columns": [
      {
        "direction": "descending", 
        "dimension": "sum__total_standard_premium"
      }
    ]
  }, 
  "granularity": "all", 
  "postAggregations": [], 
  "queryType": "groupBy", 
  "dataSource": "app_auto_prem_qd_pp3"
}
This is equivalent to the SQL: select is_new_car, status, sum(total_standard_premium) from app_auto_prem_qd_pp3 group by is_new_car, status having sum(total_standard_premium) > 484000 order by sum(total_standard_premium) desc limit 2, and returns:
{
  "version" : "v1",
  "timestamp" : "1917-08-30T04:26:52.000+08:00",
  "event" : {
    "sum__total_standard_premium" : 8.726074368E9,
    "is_new_car" : "是",
    "status" : null
  }
}, {
  "version" : "v1",
  "timestamp" : "1917-08-30T04:26:52.000+08:00",
  "event" : {
    "sum__total_standard_premium" : 615152.0,
    "is_new_car" : "否",
    "status" : null
  }
  }
The GroupBy query fields are:

  • queryType: must be "groupBy"
  • dataSource: the datasource to query
  • dimensions: the set of dimensions to group by
  • limitSpec: ordering and limiting of the results
  • having: filtering of the aggregated results
  • granularity: the time granularity the results are aggregated to
  • postAggregations: post-aggregators
  • intervals: the time range to query, ISO-8601 format
  • context: extra query parameters

The fields specific to GroupBy are limitSpec and having.

limitSpec specifies the ordering rules and the row limit:

{
    "type" : "default",
    "limit":<integer_value>,
    "columns":[list of OrderByColumnSpec]
}

columns is an array and may list multiple sort fields; a sort field can be a dimension or a metric. Each sort rule is written as:

{
    "dimension" :"<Any dimension or metric name>",  
    "direction" : <"ascending"|"descending">
}

 "limitSpec": {
    "limit": 2, 
    "type": "default", 
    "columns": [
      {
        "direction": "descending", 
        "dimension": "sum__total_standard_premium"
      },
     {
        "direction": "ascending", 
        "dimension": "is_new_car"
      } 
    ]
  }
having is analogous to the HAVING clause in SQL.
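Besides the single greaterThan spec used in the example above, having specs compose with boolean operators; a sketch (the aggregation name is reused from the example, the bounds are made up):

{
  "type": "and",
  "havingSpecs": [
    {
      "type": "greaterThan",
      "aggregation": "sum__total_standard_premium",
      "value": 484000
    },
    {
      "type": "lessThan",
      "aggregation": "sum__total_standard_premium",
      "value": 10000000
    }
  ]
}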

4. select

select is analogous to SQL's SELECT: it reads the data stored in Druid, returning the specified dimensions and metrics for a given filter and time range. Sort order can be set via the descending field and results can be pulled page by page, but aggregations and postAggregations are not supported. Example JSON:

{
  "dimensions": [
      "status",
      "is_new_car"
  ], 
  "pagingSpec":{
  "pagingIdentifiers":{},
  "threshold":3
  },
  "intervals": "1917-08-25T08:35:20+00:00/2017-08-25T08:35:20+00:00", 
  "dataSource": "app_auto_prem_qd_pp3", 
  "granularity": "all", 
  "context" : {
   "skipEmptyBuckets" : "true"
  },
  "queryType": "select"
}

This is equivalent to the SQL: select status, is_new_car from app_auto_prem_qd_pp3 limit 3, and returns:

[ {
  "timestamp" : "2017-08-22T14:00:00.000Z",
  "result" : {
    "pagingIdentifiers" : {
      "app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00" : 2
    },
    "dimensions" : [ "is_new_car", "status" ],
    "metrics" : [ "total_actual_premium", "count", "total_standard_premium" ],
    "events" : [ {
      "segmentId" : "app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00",
      "offset" : 0,
      "event" : {
        "timestamp" : "2017-08-22T22:00:00.000+08:00",
        "status" : null,
        "is_new_car" : "是",
        "total_actual_premium" : 1012.5399780273438,
        "count" : 1,
        "total_standard_premium" : 1250.050048828125
      }
    }, {
      "segmentId" : "app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00",
      "offset" : 1,
      "event" : {
        "timestamp" : "2017-08-22T22:00:00.000+08:00",
        "status" : null,
        "is_new_car" : "是",
        "total_actual_premium" : 708.780029296875,
        "count" : 1,
        "total_standard_premium" : 1250.050048828125
      }
    }, {
      "segmentId" : "app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00",
      "offset" : 2,
      "event" : {
        "timestamp" : "2017-08-22T22:00:00.000+08:00",
        "status" : null,
        "is_new_car" : "是",
        "total_actual_premium" : 1165.489990234375,
        "count" : 1,
        "total_standard_premium" : 1692.800048828125
      }
    } ]
  }
} ]

pagingSpec specifies the offset and the number of entries to pull per page; the result includes the offsets (pagingIdentifiers) to use for the next pull. Setting fromNext lets Druid advance them automatically:

 "pagingSpec":{
  "pagingIdentifiers":{},
  "threshold":3,
  "fromNext" :true
  }
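A follow-up page is then the same query with the returned identifiers fed back in; a sketch using the identifier from the result above (with fromNext set, the offsets advance automatically):

 "pagingSpec": {
  "pagingIdentifiers": {
    "app_auto_prem_qd_pp3_2017-08-22T08:00:00.000+08:00_2017-08-23T08:00:00.000+08:00_2017-08-22T18:11:01.983+08:00": 2
  },
  "threshold": 3,
  "fromNext": true
 }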

5. Search

A search query returns the dimension values that match the query term. It is conceptually like a SQL LIKE scan over dimension values, but supports more match operators. Example JSON:

{
  "queryType": "search",
  "dataSource": "app_auto_prem_qd_pp3",
  "granularity": "all",
  "limit": 2,
  "searchDimensions": [
    "data_source",
    "department_code"
  ],
  "query": {
    "type": "insensitive_contains",
    "value": "1"
  },
  "sort" : {
    "type": "lexicographic"
  },
  "intervals": [
    "1917-08-25T08:35:20+00:00/2017-08-25T08:35:20+00:00"
  ]
} 

searchDimensions lists the dimensions to search over. The search query fields are:

  • queryType: must be "search"
  • dataSource: the datasource to query
  • searchDimensions: the dimensions to run the search over
  • limit: cap on the number of results (optional; default 1000)
  • granularity: the time granularity the results are aggregated to
  • intervals: the time range to query, ISO-8601 format
  • sort: ordering of the search results
  • query: the match operation
  • context: extra query parameters
  • filter: filters

Note that search only returns the matching dimension values and supports no other aggregation. To use a search-style condition inside a topN, groupBy, or timeseries query, express it in the filter field, which also supports regex matching.
The result looks like:

[ {
  "timestamp" : "2017-08-22T08:00:00.000+08:00",
  "result" : [ {
    "dimension" : "data_source",
    "value" : "226931204023",
    "count" : 2
  }, {
    "dimension" : "data_source",
    "value" : "226931204055",
    "count" : 7
  } ]
} ]

Choosing a query type

1. Where possible, prefer Timeseries and TopN over GroupBy: GroupBy is the most flexible query but also the worst performing. For aggregations that need no dimension grouping, Timeseries is faster than GroupBy; for grouping and sorting on a single dimension, TopN is much better optimized than GroupBy.

groupBy: group by over multiple dimensions

{
  "queryType": "groupBy",
  "dataSource": "bitup",
  "dimensions": ["sample_time"],
  "granularity": "all",
  "filter": {
    "type": "and",
    "fields": [
      {
        "type": "selector",
        "dimension": "symbol",
        "value": "xbtusd"
      }
    ]
  },
  "intervals": [
    "2018-11-28T03:40:00/2018-11-28T03:54:30"
  ],
   "limitSpec": {
      "columns": [
          {   
              "dimension": "sample_time",
              "direction": "descending",
              "dimensionOrder": "numeric"
          }
      ],
      "limit": 36000,
      "type": "default"
  }
}

topN: group by over a single dimension

{
  "queryType": "topN",
  "dataSource": "bitup",
  "dimension": "sample_time",
  "threshold": 36000,
  "metric": "count",
  "granularity": "all",   
  "aggregations": [
    {
      "type": "count",
      "name": "count"
    }
  ],
  "intervals": [
    "2018-11-28T03:40:00/2018-11-28T03:54:30"
  ]
}

scan: similar to select, but without pagination support. If you have no paging requirement, scan is recommended over select for better performance:

{
   "queryType": "scan",
   "dataSource": "bitup",
   "resultFormat": "list",
   "columns":["symbol","kline_type","sample_time","open","close","high","low","vol","coin_vol","vwap"],
   "intervals": [
     "2018-11-28T03:40:00/2018-11-28T03:54:30"
   ],
   "batchSize":20480,
   "limit":36000
 }
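Any of these query bodies is posted to the Broker exactly like the earlier examples; for instance, assuming the scan query above is saved as scan.json and the Broker runs locally on the default port:

curl -L -H 'Content-Type: application/json' -XPOST --data-binary @scan.json http://localhost:8082/druid/v2/?pretty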
