Installing ES 6.7 and Kibana

Introduction to ES:

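Elasticsearch (usually abbreviated ES) is a highly scalable open-source full-text search and analytics engine. It lets you store, search, and analyze large volumes of data quickly and in near real time. It is typically used as the underlying engine behind applications with complex search features and requirements. It scales well: a cluster can grow to hundreds of servers and handle petabytes of data.
How Lucene and ES relate:
- Lucene is only a library. To use it you have to develop in Java and integrate it directly into your application. Worse, Lucene is very complex, and you need a solid background in information retrieval to understand how it works.
- Elasticsearch is also written in Java and uses Lucene as its core for all indexing and search functionality, but its goal is to hide Lucene's complexity behind a simple RESTful API so that full-text search becomes easy.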
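
To make the "simple RESTful API" point concrete, here is a minimal sketch of indexing one document and searching for it with curl. The endpoint http://vhost1:9200 matches the host configured later in this post; the index name "blog", the type name "_doc", and the field "title" are illustrative assumptions, not something the original setup defines.

```shell
#!/usr/bin/env bash
# Assumed endpoint: a node reachable at vhost1:9200 (configured later in this post).
ES_URL="${ES_URL:-http://vhost1:9200}"

# Index one JSON document under <index>/_doc/<id>.
# The index and type names passed in are illustrative.
es_index_doc() {
  local index="$1" id="$2" json="$3"
  curl -s -X PUT "$ES_URL/$index/_doc/$id" \
       -H 'Content-Type: application/json' -d "$json"
}

# Run a full-text match query against a single field.
es_match() {
  local index="$1" field="$2" text="$3"
  curl -s -X GET "$ES_URL/$index/_search" \
       -H 'Content-Type: application/json' \
       -d "{\"query\":{\"match\":{\"$field\":\"$text\"}}}"
}

# Usage (needs a running cluster):
#   es_index_doc blog 1 '{"title":"Installing ES 6.7"}'
#   es_match blog title installing
```

Both calls are plain HTTP with a JSON body, which is all an application needs in order to use the Lucene index underneath.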

How ES works:

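When an Elasticsearch node starts, it uses unicast Zen discovery to contact the hosts listed in its configuration, finds the other nodes in the cluster, and connects to them. (Multicast discovery, which early releases offered, is not available in 6.x.) The process is shown in the figure below: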

Basic concepts in ES:

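Cluster
- ES can run as a single standalone search server. To handle large data sets and to provide fault tolerance and high availability, however, it can run on many cooperating servers. That set of servers is called a cluster.
Node
- Each server that is part of the cluster is called a node.
Shard
- With a large number of documents, a single node may not be enough because of memory limits, insufficient disk throughput, or the inability to answer client requests fast enough. In that case the data can be split into smaller shards, with each shard placed on a different server.
  When the index you query is spread over several shards, ES sends the query to every relevant shard and merges the results. The application does not know the shards exist; the process is transparent to the user.
Replica
- Shard replicas can be used to raise query throughput or to achieve high availability.
- A replica is an exact copy of a shard, and each shard can have zero or more replicas. ES can hold several copies of the same shard; exactly one of them is chosen to handle operations that change the index, and that special copy is called the primary shard.
- When a primary shard is lost, for example because the node holding it becomes unavailable, the cluster promotes a replica to be the new primary shard.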
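
The shard and replica counts are set per index. The sketch below builds the settings body for a new index and computes how many shard copies the cluster then has to place; the index name "blog" and the numbers 3 and 1 are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Build the JSON settings body for a new index.
index_settings() {
  local shards="$1" replicas="$2"
  printf '{"settings":{"number_of_shards":%d,"number_of_replicas":%d}}' \
         "$shards" "$replicas"
}

# Total shard copies to place: each primary plus its replicas.
total_shard_copies() {
  echo $(( $1 * (1 + $2) ))
}

index_settings 3 1; echo    # body for 3 primaries with 1 replica each
total_shard_copies 3 1      # prints 6

# Usage (needs a running cluster; "blog" is an illustrative index name):
#   curl -X PUT "http://vhost1:9200/blog" \
#        -H 'Content-Type: application/json' -d "$(index_settings 3 1)"
```

With three nodes, three primaries and one replica each, every node can hold two shard copies, and losing one node still leaves a full copy of the data.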
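Full-text search
- Full-text search means indexing the content of a document so that it can be found by keyword. The effect resembles a LIKE query in MySQL, but lookups go through an index of terms rather than a scan of the text.
  Full-text indexing splits the content into meaningful words and indexes each of them. For example, "你們的激情是因爲什麼事情來的" might be split into tokens such as "你們", "激情", "什麼事情" and "來", so a search for either "你們" or "激情" returns this sentence.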
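
The idea of "tokenize, then index each token" can be sketched with a toy inverted index. This is not how Lucene is implemented: it assumes English text split on non-alphanumeric characters, whereas Chinese text like the example above needs a dedicated analyzer. The function names are made up for this sketch.

```shell
#!/usr/bin/env bash
# Build a toy inverted index from lines of the form "<doc_id><TAB><text>".
# Output: one line per token, "<token><TAB><comma-separated doc ids>".
build_inverted_index() {
  awk -F'\t' '{
    n = split(tolower($2), words, /[^a-z0-9]+/)
    for (i = 1; i <= n; i++) {
      w = words[i]
      if (w != "" && !seen[w, $1]++)
        postings[w] = (postings[w] == "" ? $1 : postings[w] "," $1)
    }
  }
  END { for (w in postings) print w "\t" postings[w] }' | LC_ALL=C sort
}

# Look up one exact token in an index read from stdin.
lookup() {
  awk -F'\t' -v t="$1" '$1 == t { print $2 }'
}

printf '1\tYour passion comes from somewhere\n2\tPassion for search\n' \
  | build_inverted_index | lookup passion    # prints 1,2
```

A lookup is an exact match on a token, so searching for "pass" finds nothing here, which is one way the behavior differs from LIKE '%pass%'.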

Comparison with the relational database MySQL

Installing ES 6.7

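Environment:
- CentOS 7
- JDK 8
- ES 6.7
User for managing ES: it is recommended to use a dedicated user to manage the ES cluster. For security reasons, ES refuses to start as root, so the cluster has to be started by a non-root user.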
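
A minimal sketch of preparing that dedicated user follows. The user name es and the directories /soft/es and /data/es-data are the ones used in this post; the DRY_RUN switch and the helper names are assumptions added for illustration. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Create the user, then create each directory and hand it to that user.
setup_es_user() {
  local user="$1" dir
  shift
  run useradd "$user"
  for dir in "$@"; do
    run mkdir -p "$dir"
    run chown -R "$user:$user" "$dir"
  done
}

# Run as root with DRY_RUN=0 to actually apply the changes.
setup_es_user es /soft/es /data/es-data
```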
Download ES: wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.7.0.tar.gz
Extract: my install directory is /soft/es. Unpack the archive there: tar -zxvf elasticsearch-6.7.0.tar.gz
Edit the configuration file elasticsearch.yml in the config directory:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
# Cluster name
cluster.name: es-cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
# Node name; can be set to the hostname of the machine the node runs on
node.name: vhost1
#
# Add custom attributes to the node:
# Custom node attributes can be defined here
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
# Data directory; make sure the user that manages the ES cluster can read and write it
path.data: /data/es-data
#
# Path to log files:
# Log directory; make sure the user that manages the ES cluster can read and write it
path.logs: /soft/es/elasticsearch-6.7.0/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
# IP address of this node; a hostname also works
network.host: vhost1
#
# Set a custom port for HTTP:
# HTTP port; the default 9200 is used, so this stays unchanged
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# List of hosts this node contacts during discovery
discovery.zen.ping.unicast.hosts: ["vhost1:9300", "vhost2:9300", "vhost3:9300"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
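# Minimum number of master-eligible nodes required to elect a master. Always set this to N/2+1, where N is the number of master-eligible nodes, not the total number of nodes in the cluster (2 for the three nodes configured here)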
#discovery.zen.minimum_master_nodes: 
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
#
# Other settings
# 
# Whether this node stores data
node.data: true

# Whether this node is eligible to be elected master
node.master: true
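
The N/2+1 rule for discovery.zen.minimum_master_nodes in the configuration above uses integer division. A small sketch of the arithmetic (the function name is made up for illustration):

```shell
#!/usr/bin/env bash
# minimum_master_nodes = floor(N / 2) + 1, where N counts master-eligible nodes only.
min_master_nodes() {
  echo $(( $1 / 2 + 1 ))
}

min_master_nodes 3   # the three nodes vhost1..vhost3 in this post: prints 2
```

With a quorum of 2 out of 3, a single node that loses contact with the other two cannot elect itself master, which is what prevents split brain.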

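Once the configuration is done, distribute it to the other nodes
- Go to the parent directory of the installation, /soft in my case
- Run the copy (first make sure the same user exists on the other nodes)
  - scp -r es vhost2:$PWD copies to vhost2; $PWD makes the destination path the same as the current directory
  - scp -r es vhost3:$PWD copies to vhost3
After copying, edit elasticsearch.yml on each node that received a copy
- vhost2: node.name: vhost2 network.host: vhost2
- vhost3: node.name: vhost3 network.host: vhost3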
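
The per-node edit can also be scripted instead of done by hand. A minimal sketch, assuming the two settings appear uncommented at the start of a line as in the configuration above; the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Rewrite node.name and network.host in a copied elasticsearch.yml.
patch_node_config() {
  local file="$1" host="$2" tmp
  tmp="$(mktemp)" || return 1
  # Write to a temp file first, then copy back so ownership and mode are kept.
  sed -e "s/^node\.name:.*/node.name: $host/" \
      -e "s/^network\.host:.*/network.host: $host/" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage, run on vhost2 after the copy:
#   patch_node_config /soft/es/elasticsearch-6.7.0/config/elasticsearch.yml vhost2
```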

Starting the cluster

Go to the ES installation directory and run ./bin/elasticsearch on each node. You can also write your own script to start and stop the cluster.
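
One possible shape for such a start/stop script is sketched below. It assumes passwordless SSH from the current machine to every node as the ES user, and the PID file location es.pid is an arbitrary choice; -d (daemonize) and -p (write a PID file) are standard elasticsearch options. By default it only prints what it would run.

```shell
#!/usr/bin/env bash
ES_HOME="${ES_HOME:-/soft/es/elasticsearch-6.7.0}"
NODES="${NODES:-vhost1 vhost2 vhost3}"

# Command that starts a node in the background and records its PID.
start_cmd() { echo "$ES_HOME/bin/elasticsearch -d -p $ES_HOME/es.pid"; }
# Command that stops the node recorded in the PID file.
stop_cmd()  { echo "kill \$(cat $ES_HOME/es.pid)"; }

# Run start or stop on every node; with DRY_RUN=1 (default) only print.
cluster() {
  local action="$1" host cmd
  case "$action" in
    start) cmd="$(start_cmd)" ;;
    stop)  cmd="$(stop_cmd)" ;;
    *) echo "usage: cluster start|stop" >&2; return 2 ;;
  esac
  for host in $NODES; do
    echo "[$host] $cmd"
    [ "${DRY_RUN:-1}" = "1" ] || ssh "$host" "$cmd"
  done
}

cluster start
```

After a real start (DRY_RUN=0), curl "http://vhost1:9200/_cluster/health?pretty" reports the cluster status and the number of nodes that joined.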

Starting the cluster may fail with the following errors.

org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root
- Solution: for security reasons Elasticsearch refuses to run as root, so create a new user and start it as that user
ERROR: [3] bootstrap checks failed [1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
- Solution:
  - vi /etc/security/limits.conf
  - Append to the end of the file:
    - es soft nofile 819200 # es is the user that starts ES
    - es hard nofile 819200 # es is the user that starts ES
ERROR: [1] bootstrap checks failed [1]: max number of threads [3802] for user [es] is too low, increase to at least [4096]
- Solution:
  - vim /etc/security/limits.d/20-nproc.conf
  - Append to the end of the file:
    - * soft nproc 4096
    - * hard nproc 4096
    - root soft nproc unlimited
ERROR: [1] bootstrap checks failed [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
- Solution:
  - Switch to root to edit sysctl.conf
  - vi /etc/sysctl.conf
  - Add this setting: vm.max_map_count=655360
  - Then run: sysctl -p
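
The three limits can be checked before starting ES, which saves a failed startup. The sketch below compares the current values against the minimums quoted in the error messages above; the function names are made up, and it should be run as the user that will start ES (after logging in again, so the new limits apply).

```shell
#!/usr/bin/env bash
# Succeed when actual >= required; "unlimited" always passes.
meets_min() {
  local actual="$1" required="$2"
  [ "$actual" = "unlimited" ] || [ "$actual" -ge "$required" ]
}

# Report one limit and return non-zero if it is too low.
check() {
  local name="$1" actual="$2" required="$3"
  if meets_min "$actual" "$required"; then
    echo "OK   $name = $actual (need >= $required)"
  else
    echo "FAIL $name = $actual (need >= $required)"
    return 1
  fi
}

# Check the limits behind the three bootstrap errors.
preflight() {
  local rc=0
  check "max file descriptors" "$(ulimit -n)" 65536 || rc=1
  check "max user processes" "$(ulimit -u)" 4096 || rc=1
  check "vm.max_map_count" "$(cat /proc/sys/vm/max_map_count)" 262144 || rc=1
  return $rc
}

# Usage: preflight && ./bin/elasticsearch
```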

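Note: editing system files requires switching to root. After making the changes, remember to switch back to the user that manages ES before starting it.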

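Installing Kibana

About Kibana

Kibana is an open-source analytics and visualization platform that works together with Elasticsearch. With Kibana you can search, view, and interact with data stored in Elasticsearch indices. It supports advanced data analysis and presents data as charts, tables, and maps.
Kibana makes large volumes of data easy to understand. Its simple browser-based interface lets you quickly create and share dashboards that show Elasticsearch query results as they change in real time.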

Installation

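Download Kibana. Note that the Kibana version must match the ES version; otherwise Kibana may report a version mismatch when it connects to ES.
wget https://artifacts.elastic.co/downloads/kibana/kibana-6.7.0-linux-x86_64.tar.gz
Extract: tar -zxvf kibana-6.7.0-linux-x86_64.tar.gz
Go to the kibana directory and edit the configuration file kibana.yml in its config directory:
server.host: "vhost1" # host the Kibana server runs on; I extracted it on vhost1, so this is vhost1
elasticsearch.url: "http://vhost1:9200" # the ES cluster Kibana connects to
Start Kibana: ./bin/kibana. Once it has started, it shows the address where it can be reached: http://vhost1:5601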
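
Kibana can take a little while to become reachable after the process starts. A minimal sketch of waiting for it, assuming the address http://vhost1:5601 from above and Kibana's /api/status endpoint; the function name, retry count, and delay are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
KIBANA_URL="${KIBANA_URL:-http://vhost1:5601}"

# Poll the status endpoint until it answers HTTP 200 or the attempts run out.
wait_for_kibana() {
  local tries="${1:-30}" delay="${2:-2}" code i
  for i in $(seq 1 "$tries"); do
    code="$(curl -s -o /dev/null -w '%{http_code}' "$KIBANA_URL/api/status")"
    if [ "$code" = "200" ]; then
      echo "Kibana is up"
      return 0
    fi
    sleep "$delay"
  done
  echo "Kibana did not respond after $tries attempts" >&2
  return 1
}

# Usage: wait_for_kibana 30 2
```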

TODO

That wraps up this short guide to installing ES and Kibana. Later posts will cover various ES operations.
