Installing Elasticsearch with Docker

Pull the Elasticsearch and Kibana images

docker pull elasticsearch:7.4.2
docker pull kibana:7.4.2

Configuration

mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/

Start Elasticsearch

docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e  "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v  /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.4.2

Restart Elasticsearch automatically on boot

docker update elasticsearch --restart=always

Start Kibana

docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.31.2:9200 -p 5601:5601 -d kibana:7.4.2

Restart Kibana automatically on boot

docker update kibana  --restart=always

Check the Elasticsearch version information: http://192.168.31.2:9200/

{
  "name" : "56414c08186c",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "VoAPfguwSwez2S5rIvJYgA",
  "version" : {
    "number" : "7.4.2",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
    "build_date" : "2019-10-28T20:40:44.881551Z",
    "build_snapshot" : false,
    "lucene_version" : "8.2.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
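
The version check above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names fetch_root_info and version_matches are illustrative, and SAMPLE is a trimmed copy of the response shown above so that the check can run without a live node.

```python
import json
import urllib.request


def fetch_root_info(base_url: str, timeout: float = 5.0) -> dict:
    """GET the Elasticsearch root endpoint and decode its JSON body."""
    with urllib.request.urlopen(base_url, timeout=timeout) as resp:
        return json.load(resp)


def version_matches(info: dict, expected: str) -> bool:
    """Return True when the reported version number equals `expected`."""
    return info.get("version", {}).get("number") == expected


# Trimmed copy of the response shown above, so the check runs offline.
SAMPLE = json.loads("""
{
  "name": "56414c08186c",
  "cluster_name": "elasticsearch",
  "version": {"number": "7.4.2", "build_type": "docker", "lucene_version": "8.2.0"},
  "tagline": "You Know, for Search"
}
""")

print(version_matches(SAMPLE, "7.4.2"))
```

Against a running node, version_matches(fetch_root_info("http://192.168.31.2:9200/"), "7.4.2") performs the same check over HTTP.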

Open Kibana: http://192.168.31.2:5601/app/kibana

Basic retrieval

_cat

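GET /_cat/nodes: list all nodes
GET /_cat/health: show the cluster health
GET /_cat/master: show the master node
GET /_cat/indices: list all indices, comparable to SHOW DATABASES; in MySQL

Indexing a document

To save a document, specify the index and the type it is stored under and the unique identifier to use.
PUT customer/external/1 saves document 1 under the external type of the customer index: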

PUT customer/external/1
{
 "name":"John Doe"
}

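Both PUT and POST can be used.

POST creates a document. Without an id, Elasticsearch generates one automatically. With an id, it creates the document if that id does not exist yet; otherwise it overwrites the existing document and increments the version number.
PUT can create or modify a document, but it always requires an id; a PUT without an id returns an error. Because the id is mandatory, PUT is usually used for modifications.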

{
    "_index": "customer",
    "_type": "external",
    "_id": "1",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 0,
    "_primary_term": 1
}

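"_index": "customer" is the index the document was saved in (comparable to a database);
"_type": "external" is the type the document was saved under;
"_id": "1" is the id of the saved document;
"_version": 1 is the version of the saved document;
"result": "created" means a new document was created. If the same document is PUT again, the result becomes updated and the version number changes.

The following uses POST.
Adding data without an id generates an id automatically, and the result is created:

Sending the same POST again still creates a new document:

Adding data with an id uses that id, and the result is created:

Sending the same POST again gives the result updated: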

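Retrieving a document

GET /customer/external/1

{
    "_index": "customer",// the index the document is in
    "_type": "external",// the type the document is in
    "_id": "1",// document id
    "_version": 3,// version number
    "_seq_no": 6,// concurrency-control field, incremented on every update and used for optimistic locking
    "_primary_term": 1,// also used for concurrency control; changes when the primary shard is reassigned, for example after a restart
    "found": true,
    "_source": {
        "name": "John Doe"
    }
}

Appending ?if_seq_no=1&if_primary_term=1 to a write request applies the change only when the document's current sequence number and primary term match; otherwise nothing is modified.

Example: update the document with id=1 to name=1, then update it again to name=2, starting from _seq_no=6 and _primary_term=1.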
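
The compare-and-set behaviour of this example can be modelled in a few lines. This is an in-memory sketch, not Elasticsearch client code: TinyDoc, VersionConflict, and conditional_put_path are hypothetical names, and incrementing the sequence number by exactly one per write is a simplification (real sequence numbers are assigned per shard).

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 conflict returned on a mismatch."""


class TinyDoc:
    """In-memory model of one document's concurrency metadata."""

    def __init__(self, source: dict, seq_no: int, primary_term: int):
        self.source = source
        self.seq_no = seq_no
        self.primary_term = primary_term

    def put(self, source: dict, if_seq_no: int, if_primary_term: int) -> int:
        """Apply the write only if the caller saw the latest state."""
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict(
                f"expected seq_no={if_seq_no}, current seq_no={self.seq_no}"
            )
        self.source = source
        self.seq_no += 1  # simplification: real sequence numbers are per shard
        return self.seq_no


def conditional_put_path(index: str, doc_type: str, doc_id: str,
                         seq_no: int, primary_term: int) -> str:
    """Build the request path for a conditional write."""
    return (f"/{index}/{doc_type}/{doc_id}"
            f"?if_seq_no={seq_no}&if_primary_term={primary_term}")


doc = TinyDoc({"name": "John Doe"}, seq_no=6, primary_term=1)
print(conditional_put_path("customer", "external", "1", 6, 1))
doc.put({"name": "1"}, if_seq_no=6, if_primary_term=1)      # accepted
try:
    doc.put({"name": "2"}, if_seq_no=6, if_primary_term=1)  # stale, rejected
except VersionConflict as err:
    print("conflict:", err)
```

The second write is rejected because it still carries the sequence number read before the first write; a client would re-read the document and retry with the new values.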

Updating a document

POST customer/external/1/_update
{
	"doc":{
		"name":"John"
	}
}

POST customer/external/1
{
	"name":"John"
}

PUT customer/external/1
{
	"name":"John"
}

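The first form compares the submitted doc with the stored source; if nothing would change, no update is performed (the result is noop) and the version number stays the same. The other two forms always rewrite the document and increase the version number, even when the content is identical.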
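
The difference between the two behaviours can be sketched with a small in-memory model. The function names partial_update and full_index are illustrative, and the shallow dictionary merge is a simplification of how partial documents are merged.

```python
def partial_update(stored: dict, doc: dict) -> str:
    """Model of POST .../_update: merge `doc`, skip the write if nothing changes."""
    merged = {**stored["_source"], **doc}  # shallow merge; a simplification
    if merged == stored["_source"]:
        return "noop"
    stored["_source"] = merged
    stored["_version"] += 1
    return "updated"


def full_index(stored: dict, source: dict) -> str:
    """Model of POST/PUT without _update: always replace and bump the version."""
    stored["_source"] = dict(source)
    stored["_version"] += 1
    return "updated"


stored = {"_source": {"name": "John"}, "_version": 1}
print(partial_update(stored, {"name": "John"}), stored["_version"])
print(full_index(stored, {"name": "John"}), stored["_version"])
```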

Deleting a document or an index

DELETE customer/external/1
DELETE customer

Bulk operations (_bulk)

{action:{metadata}}\n
{request body  }\n

{action:{metadata}}\n
{request body  }\n

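In a bulk request, when one action fails the remaining actions still run; the actions are independent of each other.

The bulk API executes all the actions in order. If a single action fails for any reason, processing continues with the actions after it. When the bulk API returns, it provides a status for each action, in the same order the actions were sent, so you can check whether a specific action failed.
Example 1: indexing several documents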

POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"John Doe"}

#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
{
  "took" : 491,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "customer",
        "_type" : "external",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "customer",
        "_type" : "external",
        "_id" : "2",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

Example 2: bulk operations that name the index in each action

POST /_bulk
{"delete":{"_index":"website","_type":"blog","_id":"123"}}
{"create":{"_index":"website","_type":"blog","_id":"123"}}
{"title":"my first blog post"}
{"index":{"_index":"website","_type":"blog"}}
{"title":"my second blog post"}
{"update":{"_index":"website","_type":"blog","_id":"123"}}
{"doc":{"title":"my updated blog post"}}
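
Because every action and every body must sit on its own line and the payload must end with a newline, it is convenient to generate bulk payloads rather than write them by hand. The sketch below builds the payload of Example 2 and inspects a bulk response; bulk_body and failed_items are hypothetical helper names, and the _type entries are left out of the metadata because types are deprecated in 7.x, as the deprecation warning above shows.

```python
import json


def bulk_body(operations) -> str:
    """Serialize (action, metadata, body) triples as newline-delimited JSON."""
    lines = []
    for action, metadata, body in operations:
        lines.append(json.dumps({action: metadata}))
        if body is not None:  # delete actions carry no request body
            lines.append(json.dumps(body))
    return "\n".join(lines) + "\n"  # the final newline is required


def failed_items(response: dict) -> list:
    """Return the per-action results whose HTTP status signals a failure."""
    failures = []
    for item in response.get("items", []):
        for action, result in item.items():
            if result.get("status", 0) >= 300:
                failures.append((action, result))
    return failures


ops = [
    ("delete", {"_index": "website", "_id": "123"}, None),
    ("create", {"_index": "website", "_id": "123"}, {"title": "my first blog post"}),
    ("index", {"_index": "website"}, {"title": "my second blog post"}),
    ("update", {"_index": "website", "_id": "123"}, {"doc": {"title": "my updated blog post"}}),
]
print(bulk_body(ops), end="")
```

Checking failed_items on the decoded response matters because the request as a whole can succeed while individual actions fail.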

Search

https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-search.html

Query DSL

Basic syntax

QUERY_NAME:{
   ARGUMENT:VALUE,
   ARGUMENT:VALUE,...
}

{
  QUERY_NAME:{
     FIELD_NAME:{
       ARGUMENT:VALUE,
       ARGUMENT:VALUE,...
      }   
   }
}

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 5,
  "sort": [
    {
      "account_number": {
        "order": "desc"
      }
    }
  ]
}

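match_all is a query type that matches every document; many query types can be combined inside query to build complex searches.
Besides the query parameter, other parameters such as sort and size can be passed to change the result.
from and size together implement pagination.
sort orders the results; with several sort fields, a later field only orders documents whose earlier fields are equal, otherwise the earlier field decides.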
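
The from value for a given page follows directly from the page number and the page size. A minimal sketch, assuming 1-based page numbers; page_body is a hypothetical helper that only builds the request body and sends nothing.

```python
def page_body(page: int, page_size: int, sort_fields) -> dict:
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,
        "sort": [{field: {"order": order}} for field, order in sort_fields],
    }


body = page_body(3, 5, [("account_number", "desc")])
print(body["from"], body["size"])
```

The order of the pairs in sort_fields is the sort priority, matching the rule above that later fields only break ties.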

Returning only some fields

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 5,
  "sort": [
    {
      "account_number": {
        "order": "desc"
      }
    }
  ],
  "_source": ["balance","firstname"]
}

match query

Primitive (non-string) fields: exact match

GET bank/_search
{
  "query": {
    "match": {
      "account_number": "20"
    }
  }
}

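String fields: full-text search. The search text is analyzed into terms, each term is matched, and the results are sorted by relevance score.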

GET bank/_search
{
  "query": {
    "match": {
      "address": "kings"
    }
  }
}

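match_phrase (phrase matching)

The value is searched for as one whole phrase rather than as separate terms: every term must appear in the field, adjacent and in the same order.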

GET bank/_search
{
  "query": {
    "match_phrase": {
      "address": "mill road"
    }
  }
}

Matching on the keyword sub-field: the condition must be the entire value of the field, that is, an exact match

GET bank/_search
{
  "query": {
    "match": {
      "address.keyword": "990 Mill Road"
    }
  }
}
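
The three behaviours above (match, match_phrase, and the keyword sub-field) can be contrasted with a toy model. This is a sketch, not how Elasticsearch is implemented: analyze is a rough stand-in for the standard analyzer (lowercased word tokens), and all function names are hypothetical.

```python
import re


def analyze(text: str) -> list:
    """Rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def match(field_value: str, query: str) -> bool:
    """match: any analyzed query term occurs in the analyzed field."""
    indexed = set(analyze(field_value))
    return any(term in indexed for term in analyze(query))


def match_phrase(field_value: str, query: str) -> bool:
    """match_phrase: the query terms occur adjacent and in the same order."""
    doc, phrase = analyze(field_value), analyze(query)
    if not phrase:
        return False
    return any(doc[i:i + len(phrase)] == phrase
               for i in range(len(doc) - len(phrase) + 1))


def keyword_match(field_value: str, query: str) -> bool:
    """keyword sub-field: the whole, unanalyzed value must be equal."""
    return field_value == query


address = "990 Mill Road"
print(match(address, "road mill"))
print(match_phrase(address, "road mill"))
print(match_phrase(address, "mill road"))
print(keyword_match(address, "990 Mill"))
```

Only the plain match accepts the reversed words, and only the complete original value satisfies the keyword comparison.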

multi_match: matching across several fields

GET bank/_search
{
  "query": {
    "multi_match": {
      "query": "mill",
      "fields": [
        "state",
        "address"
      ]
    }
  }
}

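bool compound query

Compound clauses can combine any other query clauses, including other compound clauses. This means compound clauses can be nested inside one another to express very complex logic.

must: every condition listed under must has to match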

GET bank/_search
{
   "query":{
        "bool":{
             "must":[
              {"match":{"address":"mill"}},
              {"match":{"gender":"M"}}
             ]
         }
    }
}

must_not: none of the conditions listed under must_not may match.

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "age": "38"
          }
        }
      ]
    }
  }
}

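should: the conditions listed under should ought to match. A match raises the relevance score of the document but does not change which documents are returned. If the query contains only should clauses, however, the should conditions act as the matching conditions and do change the result.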

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "age": "18"
          }
        }
      ],
      "should": [
        {
          "match": {
            "lastname": "Wallace"
          }
        }
      ]
    }
  }
}
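
How must, must_not, and should interact can be illustrated with a toy evaluator over in-memory records. This is a sketch under simplifying assumptions: matches is a crude whole-word test rather than a real match query, the score is just the number of matching should clauses, and the four records are made up for illustration (they are not taken from the bank data set).

```python
def matches(doc: dict, field: str, text: str) -> bool:
    """Toy stand-in for a match clause: case-insensitive whole-word test."""
    return text.lower() in str(doc.get(field, "")).lower().split()


def bool_query(docs, must=(), must_not=(), should=()):
    """Filter with must/must_not, then rank by the number of should matches."""
    hits = []
    for doc in docs:
        if not all(matches(doc, f, t) for f, t in must):
            continue
        if any(matches(doc, f, t) for f, t in must_not):
            continue
        score = sum(matches(doc, f, t) for f, t in should)
        if not must and should and score == 0:
            continue  # with only should clauses, at least one has to match
        hits.append((score, doc))
    hits.sort(key=lambda hit: hit[0], reverse=True)  # best score first
    return [doc for _, doc in hits]


# Hypothetical records, invented for this example.
docs = [
    {"firstname": "A", "gender": "M", "address": "990 Mill Road", "age": 38, "lastname": "Wallace"},
    {"firstname": "B", "gender": "M", "address": "198 Mill Lane", "age": 28, "lastname": "Holland"},
    {"firstname": "C", "gender": "M", "address": "715 Mill Avenue", "age": 18, "lastname": "Wallace"},
    {"firstname": "D", "gender": "F", "address": "789 Madison Street", "age": 30, "lastname": "Bates"},
]
result = bool_query(
    docs,
    must=[("gender", "M"), ("address", "mill")],
    must_not=[("age", "18")],
    should=[("lastname", "Wallace")],
)
print([d["firstname"] for d in result])
```

Record B is still returned although it does not satisfy the should clause; it only ranks below record A.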

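filter: filtering results

Not every query needs to produce a relevance score, in particular clauses that are only used to filter documents. Elasticsearch detects these cases automatically and optimizes execution by not computing scores for them.

Find all documents matching address=mill, then filter the results by 10000<=balance<=20000: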

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "filter": {
        "range": {
          "balance": {
            "gte": "10000",
            "lte": "20000"
          }
        }
      }
    }
  }
}

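term

Like match, term matches on the value of a field, but the search text is not analyzed. Use match for full-text (text) fields and term for non-text fields.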

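Aggregations

Aggregations provide the ability to group data and extract statistics from it. The simplest aggregations correspond roughly to SQL GROUP BY and the SQL aggregate functions. In Elasticsearch a search can return the hits and, in the same response, the aggregation results, kept separate from the hits. This is powerful and efficient: you can run a query and several aggregations and get all of their results back in one round trip through a single concise API.

"aggs":{
    "aggs_name (name of this aggregation, shown in the result set)":{
        "AGG_TYPE (aggregation type: avg, terms, ...)":{}
     }
}

Find the age distribution and the average age of everyone whose address contains mill, without returning the individual documents: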

GET bank/_search
{
  "query": {
    "match": {
      "address": "Mill"
    }
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
      "avg": {
        "field": "age"
      }
    },
    "balanceAvg": {
      "avg": {
        "field": "balance"
      }
    }
  },
  "size": 0
}

Aggregate by age and compute the average balance of the people in each age group:

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        "ageBalanceAvg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}
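
What a terms aggregation with a nested avg computes can be reproduced on a small list of records. A minimal sketch with made-up data; terms_with_avg and the output key avg are hypothetical names. Buckets are ordered by document count, highest first, which is also the default order of a terms aggregation.

```python
from collections import defaultdict


def terms_with_avg(docs, group_field: str, avg_field: str, size: int = 10) -> list:
    """Group docs by `group_field` and average `avg_field` inside each bucket."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[avg_field])
    buckets = [
        {"key": key, "doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in groups.items()
    ]
    buckets.sort(key=lambda b: b["doc_count"], reverse=True)  # largest bucket first
    return buckets[:size]  # like "size": keep only the top buckets


# Hypothetical records, invented for this example.
people = [
    {"age": 30, "balance": 100},
    {"age": 30, "balance": 300},
    {"age": 40, "balance": 500},
]
for bucket in terms_with_avg(people, "age", "balance"):
    print(bucket)
```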

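Find the age distribution and, within each age group, the average balance of M, the average balance of F, and the overall average balance of the group: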

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        "genderAgg": {
          "terms": {
            "field": "gender.keyword"
          },
          "aggs": {
            "balanceAvg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        "ageBalanceAvg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}

Mapping

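Field types

Mappings

A mapping defines how a document and the fields it contains are stored and indexed. For example, a mapping can define:

which string fields should be treated as full-text fields;
which fields contain numbers, dates, or geolocations;
whether all fields in the document can be indexed (the _all setting);
the format of date values;
custom rules that control the mapping of dynamically added fields.

Viewing the mapping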

GET bank/_mapping

{
  "bank" : {
    "mappings" : {
      "properties" : {
        "account_number" : {
          "type" : "long"
        },
        "address" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "age" : {
          "type" : "long"
        },
        "balance" : {
          "type" : "long"
        },
        "city" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "email" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "employer" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "firstname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "gender" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "lastname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "state" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

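Changes in newer versions

Elasticsearch 7: the type concept is removed

In a relational database two tables are independent: even if they contain columns with the same name, this does not affect their use. Elasticsearch is different. It is a search engine built on Lucene, and fields with the same name under different types of one index are handled the same way in Lucene.
- Two user_name fields under two different types of the same index are really treated as one field, so the same field mapping must be defined in both types. Otherwise the same field name in different types causes conflicts during processing and makes Lucene less efficient.
- Types were removed to make Elasticsearch process data more efficiently.
In Elasticsearch 7.x the type parameter in the URL is optional. For example, indexing a document no longer requires a document type.
In Elasticsearch 8.x the type parameter in the URL is no longer supported.
Solution:
Migrate indices from multiple types to a single type, giving each document type its own index.
Migrate all the data under the types of an existing index to the designated index. See Data migration.

Creating an index with a mapping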

PUT /my_index
{
  "mappings": {
    "properties": {
      "age": {
        "type": "integer"
      },
      "email": {
        "type": "keyword"
      },
      "name": {
        "type": "text"
      }
    }
  }
}

Viewing the mapping

GET /my_index
{
  "my_index" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "age" : {
          "type" : "integer"
        },
        "email" : {
          "type" : "keyword"
        },
        "employee-id" : {
          "type" : "keyword",
          "index" : false
        },
        "name" : {
          "type" : "text"
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1588410780774",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "ua0lXhtkQCOmn7Kh3iUu0w",
        "version" : {
          "created" : "7060299"
        },
        "provided_name" : "my_index"
      }
    }
  }
}

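Adding a new field mapping

"index": false below means the new field is not indexed and therefore cannot be searched; it is only stored as a redundant field.

PUT /my_index/_mapping
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false
    }
  }
}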

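Updating a mapping

The mapping of a field that already exists cannot be updated. To change it, create a new index and migrate the data.

Data migration

First create new_twitters with the correct mapping, then migrate the data as follows. The endpoint name _reindex is fixed.

POST _reindex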
{
  "source":{
      "index":"twitter"
   },
  "dest":{
      "index":"new_twitters"
   }
}

Migrating the data under a type of an old index

POST _reindex
{
  "source":{
      "index":"twitter",
      "type":"twitter"
   },
  "dest":{
      "index":"new_twitters"
   }
}

https://www.elastic.co/guide/en/elasticsearch/reference/7.6/docs-reindex.html

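Tokenization

A tokenizer receives a stream of characters, breaks it up into individual tokens (usually individual words), and outputs a stream of tokens.

For example, the whitespace tokenizer splits text whenever it sees whitespace. It converts the text "Quick brown fox!" into [Quick, brown, fox!].

The tokenizer is also responsible for recording the order or position of each term (used for phrase and word proximity queries) and the start and end character offsets of the original word the term represents (used for highlighting search results).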
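
The whitespace example, including the positions and character offsets a tokenizer records, can be reproduced with a regular expression. A minimal sketch; whitespace_tokenize is a hypothetical helper whose output keys mirror the field names of the _analyze response.

```python
import re


def whitespace_tokenize(text: str) -> list:
    """Split on whitespace and record position and character offsets."""
    return [
        {
            "token": m.group(),
            "start_offset": m.start(),
            "end_offset": m.end(),  # exclusive, as in _analyze output
            "position": position,
        }
        for position, m in enumerate(re.finditer(r"\S+", text))
    ]


for token in whitespace_tokenize("Quick brown fox!"):
    print(token)
```

The offsets are what makes highlighting possible: text[start_offset:end_offset] recovers the original word for each token.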

Elasticsearch provides many built-in tokenizers, which can be used to build custom analyzers.

POST _analyze
{
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}

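Installing the IK analyzer

Text in every language is analyzed with the "Standard Analyzer" by default, but it does not handle Chinese well. A Chinese analyzer therefore needs to be installed.

Note: the default elasticsearch-plugin install xxx.zip cannot be used for automatic installation.
Download the release that matches your Elasticsearch version from https://github.com/medcl/elasticsearch-analysis-ik/releases/download

When Elasticsearch was installed above, the container directory "/usr/share/elasticsearch/plugins" was mapped to the host directory "/mydata/elasticsearch/plugins", so the convenient approach is to download the file "/elasticsearch-analysis-ik-7.4.2.zip" and unzip it into that directory. After installation, restart the elasticsearch container.

Testing the IK analyzer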

GET my_index/_analyze
{
   "analyzer": "ik_smart", 
   "text":"我是中國人"
}

{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "中國人",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    }
  ]
}

GET my_index/_analyze
{
   "analyzer": "ik_max_word", 
   "text":"我是中國人"
}

{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "中國人",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "中國",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "國人",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 4
    }
  ]
}

Custom dictionary

Edit IKAnalyzer.cfg.xml in /usr/share/elasticsearch/plugins/ik/config:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<comment>IK Analyzer extension configuration</comment>
	<!-- configure your own extension dictionary here -->
	<entry key="ext_dict"></entry>
	<!-- configure your own extension stopword dictionary here -->
	<entry key="ext_stopwords"></entry>
	<!-- configure a remote extension dictionary here -->
	<entry key="remote_ext_dict">http://192.168.31.2/es/fenci.txt</entry>
	<!-- configure a remote extension stopword dictionary here -->
	<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

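After the change, restart the elasticsearch container; otherwise the change does not take effect.

After the dictionary is updated, Elasticsearch applies it only to newly indexed data; existing data is not re-analyzed. To re-analyze existing data, run:

POST my_index/_update_by_query?conflicts=proceed

http://192.168.31.2/es/fenci.txt is the URL under which nginx serves the dictionary file.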

Integrating Elasticsearch with Spring Boot

Adding the dependency

The version must match the installed Elasticsearch version.

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.4.2</version>
</dependency>

spring-boot-dependencies manages Elasticsearch at version 6.8.3, so it has to be overridden to 7.4.2 in the project:

<properties>
    ...
    <elasticsearch.version>7.4.2</elasticsearch.version>
</properties>

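Writing the configuration class

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchConfig {

    /**
     * Common request options, shared as a single instance
     */
    public static final RequestOptions COMMON_OPTIONS;
    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
//        builder.addHeader("Authorization", "Bearer " + TOKEN);
//        builder.setHttpAsyncResponseConsumerFactory(
//                new HttpAsyncResponseConsumerFactory
//                        .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024));
        COMMON_OPTIONS = builder.build();
    }

    @Bean
    public RestHighLevelClient esRestClient() {
        // Single-node client pointing at the Elasticsearch container started above
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("192.168.31.2", 9200, "http")));
        return client;
    }
}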

Writing the test class

Testing document indexing

@Test
public void indexData() throws IOException {
    // client: the RestHighLevelClient bean, assumed to be injected into the test class
    IndexRequest indexRequest = new IndexRequest("users");

    User user = new User();
    user.setUserName("張三");
    user.setAge(20);
    user.setGender("男");
    String jsonString = JSON.toJSONString(user);
    // set the content to save
    indexRequest.source(jsonString, XContentType.JSON);
    // create the index if needed and save the document
    IndexResponse index = client.index(indexRequest, ElasticSearchConfig.COMMON_OPTIONS);

    System.out.println(index);

}

Testing search

/**
 * Complex search: in bank, find everyone whose address contains mill and compute their age distribution, average age, and average balance
 * @throws IOException
 */
@Test
public void searchData() throws IOException {
    //1. Create the search request
    SearchRequest searchRequest = new SearchRequest();

    //1.1) Specify the index
    searchRequest.indices("bank");
    //1.2) Build the search conditions
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchQuery("address","Mill"));

    //1.2.1) Aggregate by age distribution
    TermsAggregationBuilder ageAgg=AggregationBuilders.terms("ageAgg").field("age").size(10);
    sourceBuilder.aggregation(ageAgg);

    //1.2.2) Compute the average age
    AvgAggregationBuilder ageAvg = AggregationBuilders.avg("ageAvg").field("age");
    sourceBuilder.aggregation(ageAvg);
    //1.2.3) Compute the average balance
    AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");
    sourceBuilder.aggregation(balanceAvg);

    System.out.println("Search conditions: "+sourceBuilder);
    searchRequest.source(sourceBuilder);
    //2. Execute the search
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.println("Search result: "+searchResponse);

    //3. Map the hits to beans
    SearchHits hits = searchResponse.getHits();
    SearchHit[] searchHits = hits.getHits();
    for (SearchHit searchHit : searchHits) {
        String sourceAsString = searchHit.getSourceAsString();
        Account account = JSON.parseObject(sourceAsString, Account.class);
        System.out.println(account);

    }

    //4. Read the aggregation results
    Aggregations aggregations = searchResponse.getAggregations();
    Terms ageAgg1 = aggregations.get("ageAgg");

    for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
        String keyAsString = bucket.getKeyAsString();
        System.out.println("Age: "+keyAsString+" ==> "+bucket.getDocCount());
    }
    Avg ageAvg1 = aggregations.get("ageAvg");
    System.out.println("Average age: "+ageAvg1.getValue());

    Avg balanceAvg1 = aggregations.get("balanceAvg");
    System.out.println("Average balance: "+balanceAvg1.getValue());
}

@Data
@ToString
static class Account {

	private int account_number;
	private int balance;
	private String firstname;
	private String lastname;
	private int age;
	private String gender;
	private String address;
	private String employer;
	private String email;
	private String city;
	private String state;
}
