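Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine behind a RESTful web interface. Elasticsearch is written in Java and released as open source under the Apache license, and it is a popular enterprise search engine. It is widely used in cloud environments, where it offers near-real-time search and is stable, reliable, fast, and easy to install and use.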
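1. How Elasticsearch represents documents
Elasticsearch is document oriented: it stores entire objects, or documents. It does more than store them, though. It also indexes the contents of each document so that it can be searched quickly. In Elasticsearch you index, search, sort, filter, aggregate, and analyze documents rather than rows and columns of data.
Elasticsearch uses JSON as the serialization format for documents. JSON is supported by most programming languages and has become the standard format in the NoSQL world.
A document contains not only its data but also metadata, that is, information about the document. The three core metadata elements are:
_index: the index, comparable to a "database" in a relational database. It is where related data is stored and indexed.
_type: the type, comparable to a table in a relational database. A type name can be uppercase or lowercase, but it cannot begin with an underscore or contain commas.
_id: combined with _index and _type, it uniquely identifies a document in Elasticsearch (much like a primary key). When you create a document you can supply your own _id or let Elasticsearch generate one.
The metadata also includes the following fields:
_uid: the unique identifier of the document (_type#_id)
_source: the original document data
_all: a single string that concatenates the values of all fields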
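As a small illustration of how these metadata fields fit together, the Python sketch below builds the REST path and the _uid of a document from its _index, _type, and _id. The helper names doc_path and doc_uid are made up for this example and are not part of any Elasticsearch client.

```python
def doc_path(index: str, doc_type: str, doc_id: str) -> str:
    """REST path that addresses one document: /<_index>/<_type>/<_id>."""
    return f"/{index}/{doc_type}/{doc_id}"


def doc_uid(doc_type: str, doc_id: str) -> str:
    """_uid as described above: the type and the id joined with '#'."""
    return f"{doc_type}#{doc_id}"


if __name__ == "__main__":
    print(doc_path("user_index", "user_type", "1"))
    print(doc_uid("user_type", "1"))
```

The index and type names here are the ones used in the examples later in this article.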
2. Service URLs in Elasticsearch
The table below lists commonly used Elasticsearch service URLs:
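| Category | URL | Method | Description |
|---|---|---|---|
| Cluster | /_cat/health?v | GET | View cluster health |
| | /_cat/nodes?v | GET | View node health |
| | /_cat/indices?v | GET | List all indices in the cluster |
| | /_cluster/nodes | GET | Get all nodes in the cluster and their information |
| | /_cluster/health | GET | View cluster health |
| | /_cluster/state | GET | Get all information about the cluster (cluster, node, mapping information, and so on) |
| Node | /_nodes/process | GET | View information related to file descriptors |
| | /_nodes/process/stats | GET | Resource statistics for each node (memory, CPU, and so on) |
| | /_nodes/jvm | GET | Get JVM statistics and configuration for each node |
| | /_nodes/jvm/stats | GET | More detailed JVM information |
| | /_nodes/http | GET | Get HTTP information for each node (such as the IP address) |
| | /_nodes/http/stats | GET | Get statistics on the HTTP requests handled by each node |
| | /_nodes/thread_pool | GET | Get the thread pools of each type |
| | /_nodes/thread_pool/stats | GET | Get statistics for the thread pools of each type |
| Index | /index/_search | GET, POST | Search an index |
| | /index | PUT, DELETE | Create or delete an index |
| | /_aliases | GET, POST | Get or change index aliases |
| | /index/_settings | PUT | Create or change settings (number_of_shards cannot be changed) |
| | /index/_mapping | PUT | Create or change a mapping |
| | /index/_open | POST | Open a closed index |
| | /index/_close | POST | Close an index |
| | /index/_refresh | POST | Refresh the index (make newly added content visible to search) |
| | /index/_flush | POST | Flush the index: commit changes to the Lucene index files and clear the Elasticsearch transaction log |
| | /index/_optimize | POST | Optimize segments, mainly by merging the segments of the index |
| | /index/_status | GET | Get status information for the index |
| | /index/_segments | GET | Get status information for the segments of the index |
| | /index/type/id | GET, PUT, POST, DELETE | Operate on a specific document (create, read, update, delete) |
| | /index/type/id/_create | PUT | Create a document; fails if the document already exists |
| | /index/type/id/_update | POST | Update a document; fails if the document does not exist |
| | /index/type/_bulk | POST, PUT | Submit a batch of data changes |
| | /index/type/_mget | GET | Get several documents by _id in one request |
| | /index/_explain | GET | Return explanation information without running the actual search |
| | /index/_analyze | GET | Analyze text according to the given parameters without running a search |

Note: some of these paths come from older Elasticsearch releases. On the 6.x version used in the examples below, node information is served from /_nodes rather than /_cluster/nodes, node statistics are requested as /_nodes/stats/<metric> (for example /_nodes/stats/jvm), /index/_optimize has been replaced by /index/_forcemerge, and /index/_status has been removed (see /index/_stats and /index/_recovery).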
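Each entry in the table is an ordinary HTTP call. The sketch below shows one way to turn a console-style line such as GET _cat/health?v into an HTTP request using only the Python standard library. The base URL http://localhost:9200 and the helper names build_request and send are assumptions made for this example; adjust the address to your own cluster.

```python
import json
import urllib.request

# Assumption: a local node listening on the default HTTP port.
BASE_URL = "http://localhost:9200"


def build_request(method, path, body=None):
    """Build an HTTP request for a console-style call such as `GET _cat/health?v`."""
    url = f"{BASE_URL}/{path.lstrip('/')}"
    data = None
    headers = {}
    if body is not None:
        # JSON bodies are sent as UTF-8 with an explicit content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send the request and return the response text (needs a reachable cluster)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    req = build_request("GET", "_cat/health?v")
    print(req.get_method(), req.full_url)
    # print(send(req))  # uncomment when a cluster is reachable
```

Building the request separately from sending it keeps the example usable without a running cluster.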
3. Working with the Elasticsearch URLs
3.1 Viewing cluster information
3.1.1 View cluster health
GET _cat/health?v
Response:
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1565253576 08:39:36 my-es.cluster green 1 1 2 2 0 0 0 0 - 100.0%
3.1.2 View node health
GET _cat/nodes?v
Response:
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.1.199 36 63 1 0.00 0.14 0.12 mdi * node-1
3.1.3 List all indices in the cluster
GET _cat/indices?v
Response:
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .kibana_task_manager aP_xUt7lQD2RdQDuT5ynbw 1 0 2 0 12.5kb 12.5kb
green open .kibana_1 -axbsiTwRPmlIVniX-0hOA 1 0 4 1 19.8kb 19.8kb
3.1.4 View cluster health as JSON
GET _cluster/health
Response:
{
"cluster_name" : "my-es.cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 2,
"active_shards" : 2,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
3.2 Viewing node information
3.2.1 View information related to file descriptors
GET _nodes/process
Response:
{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "my-es.cluster",
"nodes" : {
"SQYgJvIZR7yqA3TzkURejA" : {
"name" : "node-1",
"transport_address" : "192.168.1.199:9300",
"host" : "192.168.1.199",
"ip" : "192.168.1.199",
"version" : "6.8.2",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "b506955",
"roles" : [
"master",
"data",
"ingest"
],
"attributes" : {
"ml.machine_memory" : "3954188288",
"xpack.installed" : "true",
"ml.max_open_jobs" : "20",
"ml.enabled" : "true"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 9496,
"mlockall" : false
}
}
}
}
3.2.2 Get HTTP information for each node
GET _nodes/http
Response:
{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "my-es.cluster",
"nodes" : {
"SQYgJvIZR7yqA3TzkURejA" : {
"name" : "node-1",
"transport_address" : "192.168.1.199:9300",
"host" : "192.168.1.199",
"ip" : "192.168.1.199",
"version" : "6.8.2",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "b506955",
"roles" : [
"master",
"data",
"ingest"
],
"attributes" : {
"ml.machine_memory" : "3954188288",
"xpack.installed" : "true",
"ml.max_open_jobs" : "20",
"ml.enabled" : "true"
},
"http" : {
"bound_address" : [
"192.168.1.199:9200"
],
"publish_address" : "192.168.1.199:9200",
"max_content_length_in_bytes" : 104857600
}
}
}
}
3.3 Index operations
3.3.1 Create an index and set the number of shards and replicas
PUT user_index
{
"settings": {
"number_of_replicas": 1,
"number_of_shards": 1
}
}
Response:
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "user_index"
}
3.3.2 Change the number of replicas of an index (the number of shards cannot be changed)
PUT user_index/_settings
{
"settings": {
"number_of_replicas": 2
}
}
Response:
{
"acknowledged" : true
}
3.3.3 Delete an index
DELETE user_index
Response:
{
"acknowledged" : true
}
3.3.4 Add an alias to an index
POST _aliases
{
"actions":[{
"add":{"index":"user_index","alias":"user_alias"}
}]
}
The same can be done with the single-index alias endpoint:
PUT user_index/_alias/user_alias
3.3.5 Remove an alias from an index
POST _aliases
{
"actions":[{
"remove":{"index":"user_index","alias":"user_alias"}
}]
}
The same can be done with the single-index alias endpoint:
DELETE user_index/_alias/user_alias
3.3.6 View index alias information
GET _aliases
Response:
{
".kibana_1" : {
"aliases" : {
".kibana" : { }
}
},
".kibana_task_manager" : {
"aliases" : { }
},
"user_index" : {
"aliases" : {
"user_alias" : { }
}
}
}
3.3.7 Create an index mapping
PUT user_index/_mapping/user_type
{
"dynamic":false,
"properties": {
"name":{
"type": "text",
"analyzer": "standard"
},
"age": {
"type": "integer"
},
"join_date":{
"type": "date"
},
"phone":{
"type": "keyword"
},
"country":{
"type": "keyword"
},
"province":{
"type": "keyword"
},
"city":{
"type": "keyword"
},
"remark":{
"type": "text",
"analyzer": "whitespace"
}
}
}
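3.3.8 Add a document with an explicit _id.
If no _id is specified, Elasticsearch generates one automatically (send the document with POST user_index/user_type in that case).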
PUT user_index/user_type/1
{
"name":"chen zhuangyuan",
"age":27,
"join_date":"2018-01-01",
"phone":"18823450001",
"country":"CN",
"province":"guangdong",
"city":"guangzhou",
"remark":"I'm zhuangyuan,I like elasticsearch"
}
Response:
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 20,
"_primary_term" : 3
}
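3.3.9 Update a document by replacing it completely.
This is the same request as adding a document: if the document exists it is replaced in full, otherwise it is created.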
PUT user_index/user_type/1
{
"name":"chen zhuangyuan",
"age":28
}
After the update, the document with _id=1 looks like this. The fields that were not included in the request are gone.
GET user_index/user_type/1
Response:
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_version" : 8,
"_seq_no" : 23,
"_primary_term" : 3,
"found" : true,
"_source" : {
"name" : "chen zhuangyuan",
"age" : 28
}
}
3.3.10 Create a document only if it does not already exist; if it exists, an error is returned.
PUT user_index/user_type/3/_create
{
"name":"zhang fulai",
"age":28,
"join_date":"2018-03-01",
"phone":"18823450003",
"country":"CN",
"province":"guangdong",
"city":"shenzhen",
"remark":"I'm liaiguo,I like hadoop"
}
Response:
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "3",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 25,
"_primary_term" : 3
}
Running the request a second time fails with an error:
{
"error": {
"root_cause": [
{
"type": "version_conflict_engine_exception",
"reason": "[user_type][3]: version conflict, document already exists (current version [1])",
"index_uuid": "FT3HUBPESD6Yih2o_EddLw",
"shard": "2",
"index": "user_index"
}
],
"type": "version_conflict_engine_exception",
"reason": "[user_type][3]: version conflict, document already exists (current version [1])",
"index_uuid": "FT3HUBPESD6Yih2o_EddLw",
"shard": "2",
"index": "user_index"
},
"status": 409
}
3.3.11 Update specific fields of a document.
For example, change the age of the user with _id=3 to 29.
POST user_index/user_type/3/_update
{
"doc": {
"age":29
}
}
After the update, the document with _id=3 looks like this:
GET user_index/user_type/3
Response:
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "3",
"_version" : 2,
"_seq_no" : 26,
"_primary_term" : 3,
"found" : true,
"_source" : {
"name" : "zhang fulai",
"age" : 29,
"join_date" : "2018-03-01",
"phone" : "18823450003",
"country" : "CN",
"province" : "guangdong",
"city" : "shenzhen",
"remark" : "I'm liaiguo,I like hadoop"
}
}
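The difference between a full replacement (PUT index/type/id) and a partial update (POST index/type/id/_update with a doc body) can be modeled on plain dictionaries. This is only an in-memory sketch of the semantics, not Elasticsearch code, and the function names replace_document and partial_update are invented for the illustration.

```python
import copy


def replace_document(stored, new_source):
    """PUT index/type/id: the new body replaces the stored _source entirely."""
    return copy.deepcopy(new_source)


def partial_update(stored, partial_doc):
    """_update with {"doc": ...}: merge the given fields into the stored _source."""
    merged = copy.deepcopy(stored)
    for key, value in partial_doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged recursively.
            merged[key] = partial_update(merged[key], value)
        else:
            # Scalars and arrays simply overwrite the old value.
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    stored = {"name": "zhang fulai", "age": 28, "city": "shenzhen"}
    print(replace_document(stored, {"name": "zhang fulai", "age": 29}))
    print(partial_update(stored, {"age": 29}))
```

With the replacement, city disappears because it was not sent again; with the partial update, only age changes.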
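3.3.12 Batch requests with _bulk.
A single _bulk request can carry index, delete, and update operations for many documents. This cuts the number of network round trips to the server and improves throughput.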
PUT user_index/user_type/_bulk
{"index":{"_id":"4"}}
{"name":"guo daming","age":26,"phone":"18823450004","country":"CN","province":"beijing","city":"beijingshi","remark":"I.m from beijing,I like java"}
{"index":{"_id":"5"}}
{"name":"zhao mingming","age":26,"phone":"18823450005","country":"CN","province":"shanghai","city":"shanghaishi","remark":"I.m from shanghai,I like spark"}
{"delete":{"_id":"1"}}
{"update":{"_id":"2"}}
{"doc":{"age":"25"}}
Response:
{
"took" : 19,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "user_index",
"_type" : "user_type",
"_id" : "4",
"_version" : 7,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 33,
"_primary_term" : 3,
"status" : 200
}
},
{
"index" : {
"_index" : "user_index",
"_type" : "user_type",
"_id" : "5",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 34,
"_primary_term" : 3,
"status" : 201
}
},
{
"delete" : {
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_version" : 10,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 27,
"_primary_term" : 3,
"status" : 200
}
},
{
"update" : {
"_index" : "user_index",
"_type" : "user_type",
"_id" : "2",
"_version" : 8,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 35,
"_primary_term" : 3,
"status" : 200
}
}
]
}
If the operations in one _bulk request target different indices, specify the index and type in the metadata line of each operation.
PUT _bulk
{"create":{"_index":"user_index","_type":"user_type","_id":"4"}}
{"name":"guo daming","age":26,"phone":"18823450004","country":"CN","province":"beijing","city":"beijingshi","remark":"I.m from beijing,I like java"}
{"create":{"_index":"user_index","_type":"user_type","_id":"5"}}
{"name":"zhao mingming","age":26,"phone":"18823450005","country":"CN","province":"shanghai","city":"shanghaishi","remark":"I.m from shanghai,I like spark"}
{"delete":{"_index":"user_index","_type":"user_type","_id":"1"}}
{"update":{"_index":"user_index","_type":"user_type","_id":"2"}}
{"doc":{"age":"25"}}
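The _bulk body is newline-delimited JSON: one metadata line per operation, followed by a source line for every action except delete, and a newline at the very end. The sketch below builds such a body in Python. The function name bulk_body and the tuple layout (action, metadata, source) are choices made for this example only.

```python
import json


def bulk_body(operations):
    """Serialize (action, metadata, source) tuples into the _bulk line format."""
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}))
        if source is not None:  # delete operations have no source line
            lines.append(json.dumps(source))
    # The body must be terminated by a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    ops = [
        ("index", {"_id": "4"}, {"name": "guo daming", "age": 26}),
        ("delete", {"_id": "1"}, None),
        ("update", {"_id": "2"}, {"doc": {"age": 25}}),
    ]
    print(bulk_body(ops), end="")
```

The resulting string can be sent as the request body to user_index/user_type/_bulk, or to _bulk if each metadata entry also carries _index and _type.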
3.3.13 Querying documents in Elasticsearch
3.3.13.1 Get a document by _id.
URL format: index/type/_id
GET user_index/user_type/1
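3.3.13.2 Batch retrieval with _mget.
URL format: _mget, index/_mget, or index/type/_mget. The request body contains a docs array in which each element gives the _index, _type, and _id metadata of one document. To retrieve only certain fields, add a _source list to the element.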
GET _mget
{
"docs":[{
"_index":"user_index",
"_type":"user_type",
"_id":"1",
"_source":["name","phone"]
},
{
"_index":"user_index",
"_type":"user_type",
"_id":"1",
"_source":["name","phone"]
}]
}
Response:
{
"docs" : [
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_version" : 1,
"_seq_no" : 29,
"_primary_term" : 3,
"found" : true,
"_source" : {
"phone" : "18823450001",
"name" : "chen zhuangyuan"
}
},
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_version" : 1,
"_seq_no" : 29,
"_primary_term" : 3,
"found" : true,
"_source" : {
"phone" : "18823450001",
"name" : "chen zhuangyuan"
}
}
]
}
Alternatively, when the index and type are given in the URL, you can simply pass an array of document _id values.
GET user_index/user_type/_mget
{
"ids":["1","2","3","4"]
}
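3.3.13.3 Empty search (match everything).
If no query is specified, the search matches all documents in the targeted index or indices (10 hits are returned by default). The following requests are all empty searches; the last one spells out the equivalent match_all query.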
GET user_index/user_type/_search
GET user_index/_search
GET _search
GET user_index/user_type/_search
{
"query": {
"match_all": {}
}
}
3.3.13.4 Query-string search.
For example, search for all documents in the index that contain elasticsearch.
GET user_index/user_type/_search?q=elasticsearch
Because remark is the only field in the user data that contains elasticsearch, this query is equivalent to:
GET user_index/user_type/_search?q=remark:elasticsearch
Response:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.6931472,
"hits" : [
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_score" : 0.6931472,
"_source" : {
"name" : "chen zhuangyuan",
"age" : 27,
"join_date" : "2018-01-01",
"phone" : "18823450001",
"country" : "CN",
"province" : "guangdong",
"city" : "guangzhou",
"remark" : "I'm zhuangyuan,I like elasticsearch"
}
}
]
}
}
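3.3.13.5 Request-body search.
For example, search for all documents in the index that contain elasticsearch. The term query matches one exact term and the terms query matches any of several terms; neither analyzes the search text.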
GET user_index/user_type/_search
{
"query": {
"term": {
"remark": "elasticsearch"
}
}
}
GET user_index/user_type/_search
{
"query": {
"terms": {
"remark": [
"hadoop",
"spark"
]
}
}
}
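Because term and terms look up exact tokens, the result depends on how the field was tokenized at index time. The remark field in the mapping above uses the whitespace analyzer, which can be approximated in Python by splitting on whitespace without lowercasing or stripping punctuation. This is a rough model for illustration, not the analyzer's actual implementation, and both function names are hypothetical.

```python
def whitespace_tokens(text):
    """Approximate the whitespace analyzer: split on whitespace, keep case and punctuation."""
    return text.split()


def term_matches(field_text, term):
    """A term query succeeds only if the exact term is one of the indexed tokens."""
    return term in whitespace_tokens(field_text)


if __name__ == "__main__":
    remark = "I'm zhuangyuan,I like elasticsearch"
    print(whitespace_tokens(remark))
    print(term_matches(remark, "elasticsearch"))
    print(term_matches(remark, "zhuangyuan"))
```

Under this model, elasticsearch is found, while zhuangyuan is not, because it was indexed as part of the token zhuangyuan,I.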
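3.3.13.6 Pagination with from/size.
The from and size parameters implement pagination: from is the offset of the first hit to return and size is the maximum number of hits to return. from defaults to 0 and size defaults to 10.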
GET user_index/user_type/_search
{
"query": {
"match": {
"remark":"spark"
}
},
"from": 0,
"size": 1
}
Response:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.49917623,
"hits" : [
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "2",
"_score" : 0.49917623,
"_source" : {
"name" : "li aiguo",
"age" : "25",
"join_date" : "2018-02-01",
"phone" : "18823450002",
"country" : "CN",
"province" : "guangdong",
"city" : "shenzhen",
"remark" : "I'm liaiguo,I like spark"
}
}
]
}
}
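Applications usually think in page numbers rather than offsets. A minimal sketch of the conversion, assuming 1-based page numbers; the names page_to_from_size and paged_search_body are made up for this example.

```python
def page_to_from_size(page, page_size=10):
    """Convert a 1-based page number into the from/size pair used by _search."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"from": (page - 1) * page_size, "size": page_size}


def paged_search_body(query, page, page_size=10):
    """Attach from/size to a query clause to form a search request body."""
    body = {"query": query}
    body.update(page_to_from_size(page, page_size))
    return body


if __name__ == "__main__":
    # Second page of the remark:spark search, one hit per page.
    print(paged_search_body({"match": {"remark": "spark"}}, page=2, page_size=1))
```

With one hit per page, page 1 corresponds to from 0 (the request shown above) and page 2 to from 1.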
3.3.13.7 Sorting with sort.
Results can be sorted by one or more fields. By default, search results are sorted by _score in descending order.
GET user_index/user_type/_search
{
"query": {
"match": {
"remark":"spark"
}
},
"sort": [
{
"age": {
"order": "asc"
}
},
{
"province": {
"order": "asc"
}
}
]
}
Response:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : null,
"hits" : [
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "2",
"_score" : null,
"_source" : {
"name" : "li aiguo",
"age" : "25",
"join_date" : "2018-02-01",
"phone" : "18823450002",
"country" : "CN",
"province" : "guangdong",
"city" : "shenzhen",
"remark" : "I'm liaiguo,I like spark"
},
"sort" : [
25,
"guangdong"
]
},
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "5",
"_score" : null,
"_source" : {
"name" : "zhao mingming",
"age" : 26,
"phone" : "18823450005",
"country" : "CN",
"province" : "shanghai",
"city" : "shanghaishi",
"remark" : "I.m from shanghai,I like spark"
},
"sort" : [
26,
"shanghai"
]
}
]
}
}
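3.3.13.8 Range queries.
For example, find all users in the user index whose age is between 27 and 30 inclusive, with the results sorted by age in ascending order.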
GET user_index/user_type/_search
{
"query": {
"range": {
"age": {
"gte": 27,
"lte": 30
}
}
},
"sort": [
{
"age": {
"order": "asc"
}
}
]
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : null,
"hits" : [
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_score" : null,
"_source" : {
"name" : "chen zhuangyuan",
"age" : 27,
"join_date" : "2018-01-01",
"phone" : "18823450001",
"country" : "CN",
"province" : "guangdong",
"city" : "guangzhou",
"remark" : "I'm zhuangyuan,I like elasticsearch"
},
"sort" : [
27
]
},
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "3",
"_score" : null,
"_source" : {
"name" : "zhang fulai",
"age" : 29,
"join_date" : "2018-03-01",
"phone" : "18823450003",
"country" : "CN",
"province" : "guangdong",
"city" : "shenzhen",
"remark" : "I'm liaiguo,I like hadoop"
},
"sort" : [
29
]
}
]
}
}
3.3.13.9 Count all documents in the index.
GET user_index/user_type/_count
Response:
{
"count" : 5,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
}
}
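3.3.13.10 Combining multiple conditions
In real projects, most searches combine several conditions. Elasticsearch provides the bool query for this. Its main parameters are:
must: a document must match these clauses to be included.
must_not: a document must not match these clauses to be included.
should: each matching clause raises the document's _score; a non-matching clause has no effect. These clauses mainly adjust the relevance score of each document.
filter: a document must match these clauses, but they run in non-scoring filter mode. They contribute nothing to the score and only include or exclude documents.
For example, find users whose remark contains elasticsearch and does not contain spark, ranking users aged 27 higher: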
GET user_index/user_type/_search
{
"query": {
"bool": {
"must": {
"match": {
"remark": "elasticsearch"
}
},
"must_not": {
"match":{
"remark":"spark"
}
},
"should": {
"match":{
"age":27
}
}
}
}
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.6931472,
"hits" : [
{
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_score" : 1.6931472,
"_source" : {
"name" : "chen zhuangyuan",
"age" : 27,
"join_date" : "2018-01-01",
"phone" : "18823450001",
"country" : "CN",
"province" : "guangdong",
"city" : "guangzhou",
"remark" : "I'm zhuangyuan,I like elasticsearch"
}
}
]
}
}
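When the conditions come from user input, the bool body is often assembled in code. The sketch below builds the same structure as the request above and leaves out clause types that were not supplied. The function name bool_query and its parameters are hypothetical, chosen for this illustration.

```python
import json


def bool_query(must=None, must_not=None, should=None, filter_=None):
    """Assemble a bool query body, omitting clause types that were not supplied."""
    clauses = {
        "must": must,
        "must_not": must_not,
        "should": should,
        "filter": filter_,
    }
    return {"query": {"bool": {name: value for name, value in clauses.items() if value}}}


if __name__ == "__main__":
    body = bool_query(
        must=[{"match": {"remark": "elasticsearch"}}],
        must_not=[{"match": {"remark": "spark"}}],
        should=[{"match": {"age": 27}}],
    )
    print(json.dumps(body, indent=2))
```

Each clause type accepts a list here, so further conditions can be appended without changing the shape of the request.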
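3.3.13.11 Score analysis with explain
Search results come back as a list sorted by _score from highest to lowest. Setting explain to true in the search request shows how the _score of each hit was calculated.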
GET user_index/user_type/_search
{
"query": {
"match": {
"remark": "elasticsearch"
}
},
"explain": true
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0594962,
"hits" : [
{
"_shard" : "[user_index][0]",
"_node" : "SQYgJvIZR7yqA3TzkURejA",
"_index" : "user_index",
"_type" : "user_type",
"_id" : "6",
"_score" : 1.0594962,
"_source" : {
"name" : "liu haoqiang",
"age" : 27,
"join_date" : "2018-06-01",
"phone" : "18823450006",
"country" : "CN",
"province" : "guangdong",
"city" : "guangzhou",
"remark" : "I'm from guangzhou,I like spark and elasticsearch"
},
"_explanation" : {
"value" : 1.059496,
"description" : "weight(remark:elasticsearch in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 1.059496,
"description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
"details" : [
{
"value" : 1.2039728,
"description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
"details" : [
{
"value" : 1.0,
"description" : "docFreq",
"details" : [ ]
},
{
"value" : 4.0,
"description" : "docCount",
"details" : [ ]
}
]
},
{
"value" : 0.88,
"description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
"details" : [
{
"value" : 1.0,
"description" : "termFreq=1.0",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "parameter k1",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "parameter b",
"details" : [ ]
},
{
"value" : 5.25,
"description" : "avgFieldLength",
"details" : [ ]
},
{
"value" : 7.0,
"description" : "fieldLength",
"details" : [ ]
}
]
}
]
}
]
}
},
{
"_shard" : "[user_index][2]",
"_node" : "SQYgJvIZR7yqA3TzkURejA",
"_index" : "user_index",
"_type" : "user_type",
"_id" : "1",
"_score" : 0.6931472,
"_source" : {
"name" : "chen zhuangyuan",
"age" : 27,
"join_date" : "2018-01-01",
"phone" : "18823450001",
"country" : "CN",
"province" : "guangdong",
"city" : "guangzhou",
"remark" : "I'm zhuangyuan,I like elasticsearch"
},
"_explanation" : {
"value" : 0.6931472,
"description" : "weight(remark:elasticsearch in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.6931472,
"description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
"details" : [
{
"value" : 0.6931472,
"description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
"details" : [
{
"value" : 1.0,
"description" : "docFreq",
"details" : [ ]
},
{
"value" : 2.0,
"description" : "docCount",
"details" : [ ]
}
]
},
{
"value" : 1.0,
"description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
"details" : [
{
"value" : 1.0,
"description" : "termFreq=1.0",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "parameter k1",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "parameter b",
"details" : [ ]
},
{
"value" : 4.0,
"description" : "avgFieldLength",
"details" : [ ]
},
{
"value" : 4.0,
"description" : "fieldLength",
"details" : [ ]
}
]
}
]
}
]
}
}
]
}
}
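The two formulas printed in the _explanation output (idf and tfNorm) can be checked by hand. The Python sketch below plugs in the values from the response above; the function names are chosen for this example, and k1 = 1.2 and b = 0.75 are the parameter values reported in the output. Note that the two hits live on different shards, and each shard reports its own docCount and avgFieldLength, which is why their scores differ.

```python
import math


def idf(doc_freq, doc_count):
    """idf = log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))"""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def tf_norm(freq, field_length, avg_field_length, k1=1.2, b=0.75):
    """tfNorm = (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength))"""
    return (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * field_length / avg_field_length))


def score(freq, doc_freq, doc_count, field_length, avg_field_length):
    """The score of a single term is the product of idf and tfNorm."""
    return idf(doc_freq, doc_count) * tf_norm(freq, field_length, avg_field_length)


if __name__ == "__main__":
    # Document _id=6 on shard 0: docFreq=1, docCount=4, fieldLength=7, avgFieldLength=5.25
    print(round(score(1.0, 1.0, 4.0, 7.0, 5.25), 6))
    # Document _id=1 on shard 2: docFreq=1, docCount=2, fieldLength=4, avgFieldLength=4
    print(round(score(1.0, 1.0, 2.0, 4.0, 4.0), 6))
```

The results agree with the _score values 1.0594962 and 0.6931472 in the response, up to rounding.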
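3.3.14 Text analysis with _analyze
_analyze is a very useful Elasticsearch API. It shows how a field, or a given analyzer or tokenizer, analyzes and indexes a piece of text. The fields in the result mean the following:
token: the term that is actually stored in the index
position: the ordinal position of the term in the original text
start_offset, end_offset: the character offsets of the term in the original text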
GET user_index/_analyze
{
"analyzer": "standard",
"text": "I'm from shenzhen,I like elasticsearch,spark and hbase"
}
Response:
{
"tokens" : [
{
"token" : "i'm",
"start_offset" : 0,
"end_offset" : 3,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "from",
"start_offset" : 4,
"end_offset" : 8,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "shenzhen",
"start_offset" : 9,
"end_offset" : 17,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "i",
"start_offset" : 18,
"end_offset" : 19,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "like",
"start_offset" : 20,
"end_offset" : 24,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "elasticsearch",
"start_offset" : 25,
"end_offset" : 38,
"type" : "<ALPHANUM>",
"position" : 5
},
{
"token" : "spark",
"start_offset" : 39,
"end_offset" : 44,
"type" : "<ALPHANUM>",
"position" : 6
},
{
"token" : "and",
"start_offset" : 45,
"end_offset" : 48,
"type" : "<ALPHANUM>",
"position" : 7
},
{
"token" : "hbase",
"start_offset" : 49,
"end_offset" : 54,
"type" : "<ALPHANUM>",
"position" : 8
}
]
}
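To make the meaning of token, position, and the offsets concrete, the sketch below reproduces the output above for this sample sentence with a simple regular expression. It is a deliberately simplified stand-in: the real standard analyzer follows Unicode text segmentation rules, whereas the pattern here only handles ASCII words with internal apostrophes. The names TOKEN_PATTERN and simple_analyze are invented for this example.

```python
import re

# Assumption: ASCII letters and digits, with apostrophes allowed inside a word.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")


def simple_analyze(text):
    """Return lowercased tokens with their character offsets and positions."""
    tokens = []
    for position, match in enumerate(TOKEN_PATTERN.finditer(text)):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),  # index of the first character
            "end_offset": match.end(),      # index just past the last character
            "position": position,           # ordinal of the token in the text
        })
    return tokens


if __name__ == "__main__":
    sample = "I'm from shenzhen,I like elasticsearch,spark and hbase"
    for token in simple_analyze(sample):
        print(token)
```

The end_offset is exclusive, so the length of a token is end_offset minus start_offset.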