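I. Introduction to Query Suggestions
1. What are query suggestions?
Query suggestions give users a better search experience. They mainly cover two features: spell checking and automatic query-term suggestion (autocomplete).
Spell checking, as shown in the figure:
Automatic query-term suggestion (autocomplete):
2. The query suggestion API in ES
Query suggestions also use the _search endpoint. In the DSL, the suggest node defines the suggestions you want.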
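Example 1: defining a single suggestion
POST twitter/_search
{
  "query": {
    "match": { "message": "tring out Elasticsearch" }
  },
  "suggest": {
    "my-suggestion": {
      "text": "tring out Elasticsearch",
      "term": { "field": "message" }
    }
  }
}
Here suggest defines the suggestion requests, my-suggestion is the name of one suggestion, text is the text to get suggestions for, term selects the term suggester, and field names the field the suggested terms are drawn from.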
Example 2: defining multiple suggestions
POST _search
{
  "suggest": {
    "my-suggest-1": {
      "text": "tring out Elasticsearch",
      "term": { "field": "message" }
    },
    "my-suggest-2": {
      "text": "kmichy",
      "term": { "field": "user" }
    }
  }
}
Example 3: multiple suggestions can share one global text
POST _search
{
  "suggest": {
    "text": "tring out Elasticsearch",
    "my-suggest-1": {
      "term": { "field": "message" }
    },
    "my-suggest-2": {
      "term": { "field": "user" }
    }
  }
}
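The three request shapes above differ only in where the text lives. As a rough sketch of how a client might assemble them, the hypothetical helper build_suggest_body below (its name and signature are assumptions made for illustration, not part of any Elasticsearch client) produces the suggest node for term suggesters, with either per-suggestion text or one global text.

```python
import json


def build_suggest_body(suggesters, global_text=None):
    """Build a _search request body containing only a "suggest" node.

    suggesters maps a suggestion name to a (field, text) pair; text may be
    None when global_text is supplied, in which case the global text applies.
    """
    suggest = {}
    if global_text is not None:
        suggest["text"] = global_text
    for name, (field, text) in suggesters.items():
        entry = {"term": {"field": field}}
        if text is not None:
            entry["text"] = text
        elif global_text is None:
            raise ValueError(f"suggestion {name!r} has no text and no global text is set")
        suggest[name] = entry
    return {"suggest": suggest}


# Same shape as the global-text request: two named suggestions, one shared text.
body = build_suggest_body(
    {"my-suggest-1": ("message", None), "my-suggest-2": ("user", None)},
    global_text="tring out Elasticsearch",
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of a POST to _search with whatever HTTP client is in use.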
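II. Suggesters
1. Term suggester
The term suggester analyzes the input text into tokens and runs a fuzzy lookup for each token to propose suggested terms. By default, a token that already exists in the index gets no suggestions; for a token that does not exist, the fuzzy-lookup results are sorted and a limited number of them are returned as suggestions.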
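The behaviour described above can be modelled in a few lines. The following is a simplified sketch, not Elasticsearch's implementation: it assumes whitespace tokenization with lowercasing in place of a real analyzer, plain Levenshtein distance as the fuzzy match, and ordering by distance and then by frequency. The function names, the max_edits and size parameters, and the sample vocabulary are all illustrative assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def suggest_terms(text, term_freqs, max_edits=2, size=5):
    """Return {token: [suggested terms]} for a whitespace-tokenized text.

    term_freqs maps each indexed term to its document frequency.
    """
    result = {}
    for token in text.lower().split():
        if token in term_freqs:
            # Tokens that already exist in the index get no suggestions.
            result[token] = []
            continue
        candidates = []
        for term, freq in term_freqs.items():
            distance = edit_distance(token, term)
            if distance <= max_edits:
                candidates.append((distance, -freq, term))
        # Closest terms first; ties broken by higher frequency, then by term.
        candidates.sort()
        result[token] = [term for _, _, term in candidates[:size]]
    return result


# Hypothetical vocabulary of the "message" field with document frequencies.
index_terms = {"trying": 3, "out": 5, "elasticsearch": 4, "string": 1, "ring": 1}
suggestions = suggest_terms("tring out Elasticsearch", index_terms)
print(suggestions)
# {'tring': ['trying', 'ring', 'string'], 'out': [], 'elasticsearch': []}
```

Only "tring" is missing from the vocabulary, so it is the only token that receives suggestions.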
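Commonly used suggestion options:
Example 1:
POST twitter/_search
{
  "query": {
    "match": { "message": "tring out Elasticsearch" }
  },
  "suggest": {
    "my-suggestion": {
      "text": "tring out Elasticsearch",
      "term": { "field": "message" }
    }
  }
}
Here suggest defines the suggestion requests, my-suggestion is the name of one suggestion, text is the text to get suggestions for, term selects the term suggester, and field names the field the suggested terms are drawn from.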
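2. Phrase suggester
The phrase suggester builds on the term suggester and additionally takes into account the relationships between terms, such as whether they occur together in the indexed text, how close they are to each other, and how frequent they are.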
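To make the idea of "relationships between terms" concrete, here is a deliberately naive sketch. It assumes that per-token corrections are already available (passed in by hand as the corrections argument) and ranks whole candidate phrases by how many of their adjacent word pairs occur in the indexed titles. This raw bigram count is only a stand-in for the language-model scoring a real phrase suggester performs; the function name and sample titles are illustrative.

```python
from collections import Counter
from itertools import product

# Sample indexed titles used to learn which words appear next to each other.
titles = ["java spring boot", "lucene solr and elasticsearch"]

bigram_counts = Counter()
for title in titles:
    words = title.split()
    bigram_counts.update(zip(words, words[1:]))


def rank_phrases(text, corrections):
    """Score every combination of original and corrected tokens by bigram count."""
    tokens = text.split()
    # Each position may keep its original token or take one of its corrections.
    options = [[tok] + corrections.get(tok, []) for tok in tokens]
    scored = []
    for combo in product(*options):
        score = sum(bigram_counts[pair] for pair in zip(combo, combo[1:]))
        scored.append((score, " ".join(combo)))
    scored.sort(key=lambda item: -item[0])
    return scored


ranked = rank_phrases("java sprin boot", {"sprin": ["spring"]})
best = ranked[0][1]
print(best)  # java spring boot
```

The corrected phrase wins because both of its word pairs, (java, spring) and (spring, boot), appear in an indexed title, while the uncorrected phrase matches none.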
Example 1:
POST /ftq/_search
{
  "query": { "match_all": {} },
  "suggest": {
    "myss": {
      "text": "java sprin boot",
      "phrase": { "field": "title" }
    }
  }
}
Result 1:
{
  "took": 177,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "ftq",
        "_type": "_doc",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "java spring boot",
          "content": "lucene is writerd by java"
        }
      },
      {
        "_index": "ftq",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "title": "lucene solr and elasticsearch",
          "content": "lucene solr and elasticsearch for search"
        }
      }
    ]
  },
  "suggest": {
    "myss": [
      {
        "text": "java sprin boot",
        "offset": 0,
        "length": 15,
        "options": [
          { "text": "java spring boot", "score": 0.20745796 }
        ]
      }
    ]
  }
}
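On the client side, the suggestions sit under the suggest key of the response, grouped by suggestion name, with one entry per analyzed piece of text and a list of options inside each entry. A minimal sketch of pulling out the top option follows; the helper name best_suggestions is an assumption, and the response literal is a trimmed copy of the result shown above.

```python
def best_suggestions(response):
    """Map each suggestion name to the top option text of each of its entries.

    An entry without options yields None, meaning nothing was suggested.
    """
    result = {}
    for name, entries in response.get("suggest", {}).items():
        result[name] = [
            entry["options"][0]["text"] if entry["options"] else None
            for entry in entries
        ]
    return result


# Trimmed copy of the phrase suggester response (hits omitted).
response = {
    "suggest": {
        "myss": [
            {
                "text": "java sprin boot",
                "offset": 0,
                "length": 15,
                "options": [{"text": "java spring boot", "score": 0.20745796}],
            }
        ]
    }
}
print(best_suggestions(response))  # {'myss': ['java spring boot']}
```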
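3. Completion suggester (autocomplete)
This suggester is designed for the autocomplete scenario. Here a query is sent to the backend on every keystroke to look up matches, so when the user types quickly the backend has to respond very fast. For that reason it uses a different data structure from the two suggesters above: instead of relying on the inverted index, the analyzed data is encoded into an FST (finite state transducer) that is stored alongside the index. For an index in the open state, ES loads the whole FST into memory, which makes prefix lookups extremely fast. An FST can only serve prefix lookups, however, and that is the main limitation of the completion suggester.
Official documentation: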
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
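The prefix-only nature of the lookup can be illustrated with a toy in-memory index. The sketch below uses a plain trie as a stand-in for the FST (a real FST is far more compact and shares suffixes as well as prefixes), and lowercasing stands in for analysis. The class name PrefixIndex and its methods are hypothetical.

```python
class PrefixIndex:
    """Toy prefix index: a trie of nested dicts, with the key None marking an input's end."""

    def __init__(self):
        self._root = {}

    def add(self, text, weight):
        node = self._root
        for ch in text.lower():
            node = node.setdefault(ch, {})
        # Store the original text and keep the highest weight seen for it.
        end = node.setdefault(None, {"text": text, "weight": weight})
        if weight > end["weight"]:
            end["weight"] = weight

    def complete(self, prefix, size=5):
        # Walk down the trie along the prefix; a missing edge means no match.
        node = self._root
        for ch in prefix.lower():
            if ch not in node:
                return []
            node = node[ch]
        # Collect every input stored below the prefix node.
        found = []
        stack = [node]
        while stack:
            cur = stack.pop()
            for key, child in cur.items():
                if key is None:
                    found.append((child["weight"], child["text"]))
                else:
                    stack.append(child)
        # Highest weight first, ties broken alphabetically.
        found.sort(key=lambda item: (-item[0], item[1]))
        return [text for _, text in found[:size]]


index = PrefixIndex()
index.add("Nevermind", 10)
index.add("Nirvana", 3)
print(index.complete("nir"))   # ['Nirvana']
print(index.complete("n"))     # ['Nevermind', 'Nirvana']
print(index.complete("vana"))  # [] because "vana" is not a prefix of any input
```

The last call shows the limitation: text that appears only in the middle or at the end of an input is never found.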
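To use autocomplete, the field in the index that supplies the completion suggestions needs a special mapping: its type must be completion. In the mapping below, suggest is the field used for autocomplete.
PUT music
{
  "mappings": {
    "_doc": {
      "properties": {
        "suggest": { "type": "completion" },
        "title": { "type": "keyword" }
      }
    }
  }
}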
input specifies the input terms; weight specifies the ranking value (optional).
PUT music/_doc/1?refresh
{
  "suggest": {
    "input": [ "Nevermind", "Nirvana" ],
    "weight": 34
  }
}
Specifying a different weight for each input:
PUT music/_doc/1?refresh
{
  "suggest": [
    { "input": "Nevermind", "weight": 10 },
    { "input": "Nirvana", "weight": 3 }
  ]
}
Index a second document with duplicate inputs:
PUT music/_doc/2?refresh
{
  "suggest": {
    "input": [ "Nevermind", "Nirvana" ],
    "weight": 20
  }
}
Example 1: getting suggestions by prefix:
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": { "field": "suggest" }
    }
  }
}
Result 1:
{
  "took": 25,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": { "total": 0, "max_score": 0, "hits": [] },
  "suggest": {
    "song-suggest": [
      {
        "text": "nir",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "Nirvana",
            "_index": "music",
            "_type": "_doc",
            "_id": "2",
            "_score": 20,
            "_source": {
              "suggest": {
                "input": [ "Nevermind", "Nirvana" ],
                "weight": 20
              }
            }
          },
          {
            "text": "Nirvana",
            "_index": "music",
            "_type": "_doc",
            "_id": "1",
            "_score": 1,
            "_source": {
              "suggest": [ "Nevermind", "Nirvana" ]
            }
          }
        ]
      }
    ]
  }
}
Example 2: removing duplicates from the suggestions
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": {
        "field": "suggest",
        "skip_duplicates": true
      }
    }
  }
}
Result 2:
{
  "took": 4,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": { "total": 0, "max_score": 0, "hits": [] },
  "suggest": {
    "song-suggest": [
      {
        "text": "nir",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "Nirvana",
            "_index": "music",
            "_type": "_doc",
            "_id": "2",
            "_score": 20,
            "_source": {
              "suggest": {
                "input": [ "Nevermind", "Nirvana" ],
                "weight": 20
              }
            }
          }
        ]
      }
    ]
  }
}
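The effect of skip_duplicates on the two results above can be reproduced on the client as a way of understanding it. This sketch assumes that when several documents offer the same suggestion text, only the highest-scoring one is kept; the function is illustrative only, since Elasticsearch performs the filtering on the server.

```python
def skip_duplicates(options):
    """Keep one option per suggestion text, preferring the highest _score."""
    best = {}
    for opt in options:
        kept = best.get(opt["text"])
        if kept is None or opt["_score"] > kept["_score"]:
            best[opt["text"]] = opt
    # Return the survivors in descending score order, as in the response.
    return sorted(best.values(), key=lambda o: -o["_score"])


# Options from the unfiltered result, reduced to the fields that matter here.
options = [
    {"text": "Nirvana", "_id": "2", "_score": 20},
    {"text": "Nirvana", "_id": "1", "_score": 1},
]
deduplicated = skip_duplicates(options)
print(deduplicated)  # [{'text': 'Nirvana', '_id': '2', '_score': 20}]
```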
Example 3: suggestion documents that store phrases
PUT music/_doc/3?refresh
{
  "suggest": {
    "input": [ "lucene solr", "lucene so cool", "lucene elasticsearch" ],
    "weight": 20
  }
}

PUT music/_doc/4?refresh
{
  "suggest": {
    "input": [ "lucene solr cool", "lucene elasticsearch" ],
    "weight": 10
  }
}
Query 3:
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "lucene s",
      "completion": {
        "field": "suggest",
        "skip_duplicates": true
      }
    }
  }
}
Result 3:
{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": { "total": 0, "max_score": 0, "hits": [] },
  "suggest": {
    "song-suggest": [
      {
        "text": "lucene s",
        "offset": 0,
        "length": 8,
        "options": [
          {
            "text": "lucene so cool",
            "_index": "music",
            "_type": "_doc",
            "_id": "3",
            "_score": 20,
            "_source": {
              "suggest": {
                "input": [ "lucene solr", "lucene so cool", "lucene elasticsearch" ],
                "weight": 20
              }
            }
          },
          {
            "text": "lucene solr cool",
            "_index": "music",
            "_type": "_doc",
            "_id": "4",
            "_score": 10,
            "_source": {
              "suggest": {
                "input": [ "lucene solr cool", "lucene elasticsearch" ],
                "weight": 10
              }
            }
          }
        ]
      }
    ]
  }
}