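Overview
What is highlighting?
Highlighting lets one or more fields be highlighted in the search results, for example by rendering the matched text in bold or in a color different from the surrounding text.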
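Highlighting needs the actual content of the field. If the field is stored (store is set to true in the mapping), the stored value is used; otherwise Elasticsearch loads the _source field and extracts the relevant field from it.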
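Three highlighter types
ES provides three highlighter types: the Lucene-based plain highlighter, the fast vector highlighter (fvh), and the postings highlighter.
Plain Highlighter
The plain highlighter is the default choice and is implemented on top of the Lucene highlighter. It tries hard to reflect the query's matching logic.
If you want to highlight many fields across many documents with complex queries, this highlighter will not be fast. To reflect the query logic accurately, it builds a tiny in-memory index and re-runs the original query through Lucene's query execution planner to obtain low-level match information for the current document. This is repeated for every field and every document that needs highlighting, so it can become a performance problem. If it does, consider switching to another highlighter type.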
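The highlighter can also be chosen explicitly per field through the type option of the highlight request instead of relying on the default. The sketch below only builds such a request body in Python and does not contact a cluster. The helper name highlight_with_type is hypothetical, and the accepted values are assumed to be the three types discussed in this article (other Elasticsearch versions accept a different set).

```python
import json

# Highlighter types described in this article. Assumption: the set accepted
# by a given Elasticsearch version may differ.
KNOWN_TYPES = {"plain", "fvh", "postings"}


def highlight_with_type(field, text, highlighter_type):
    """Build a search body that forces one highlighter type for one field."""
    if highlighter_type not in KNOWN_TYPES:
        raise ValueError("unknown highlighter type: %s" % highlighter_type)
    return {
        "query": {"match": {field: text}},
        # The per-field "type" option overrides the default highlighter.
        "highlight": {"fields": {field: {"type": highlighter_type}}},
    }


if __name__ == "__main__":
    print(json.dumps(highlight_with_type("address", "mill", "plain"), indent=2))
```

The printed JSON could be sent as the body of GET /bank/_search.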
Fast Vector Highlighter
If the term_vector parameter of a field is set to with_positions_offsets in the mapping, the fast vector highlighter is used for that field instead of the plain highlighter.
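Its main characteristics:
- It uses more disk, not less: storing term vectors with positions and offsets increases the size of the index
- Its fragment boundaries can be customized with options such as boundary_chars and boundary_max_scan
- It is faster than the plain highlighter, especially for large fields (larger than 1 MB), because the text to be highlighted does not have to be re-analyzed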
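Enabling the fast vector highlighter comes down to a mapping setting. The following is a minimal sketch, assuming an Elasticsearch 5.x-style mapping (field type text, and a mapping type named account as in the results shown later in this article). The helper name fvh_mapping is hypothetical.

```python
import json


def fvh_mapping(doc_type, field):
    """Return an index-creation body that stores term vectors for `field`."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {
                        # Assumption: 5.x field type; 2.x would use "string".
                        "type": "text",
                        # Positions and offsets are what the fvh relies on.
                        "term_vector": "with_positions_offsets",
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(fvh_mapping("account", "address"), indent=2))
```

The body would be sent with PUT /bank when the index is created. Term vector settings of an existing field typically cannot be changed in place, so an existing index would need to be rebuilt.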
Postings Highlighter
If index_options is set to offsets in the mapping, the postings highlighter is used instead of the plain highlighter.
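It does not need to re-analyze the text being highlighted, so the larger the document, the greater the performance gain. It also needs less disk space than the term vectors required by the fast vector highlighter, and it breaks the text into sentences and highlights those, which works well for natural-language text.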
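The selection rules for the three types can be summarized as a small decision function. This is only an illustrative sketch of the rules described above, not Elasticsearch's own code. The function name default_highlighter is hypothetical, and checking term_vector before index_options when both are set is an assumption.

```python
def default_highlighter(field_mapping):
    """Return the highlighter implied by a field's mapping options."""
    # Term vectors with positions and offsets select the fast vector highlighter.
    if field_mapping.get("term_vector") == "with_positions_offsets":
        return "fvh"
    # Offsets stored in the postings list select the postings highlighter.
    if field_mapping.get("index_options") == "offsets":
        return "postings"
    # Otherwise the plain highlighter is the default.
    return "plain"


if __name__ == "__main__":
    for mapping in (
        {"type": "text"},
        {"type": "text", "index_options": "offsets"},
        {"type": "text", "term_vector": "with_positions_offsets"},
    ):
        print(mapping, "->", default_highlighter(mapping))
```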
Example
Find the records whose address contains mill or Court, and highlight the matches.
The query is:
GET /bank/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "Court" } }
      ]
    }
  },
  "highlight": {
    "fields": {
      "address": {}
    }
  }
}
The result is:
{
  "_index" : "bank",
  "_type" : "account",
  "_id" : "472",
  "_score" : 5.4032025,
  "_source" : {
    "account_number" : 472,
    "balance" : 25571,
    "firstname" : "Lee",
    "lastname" : "Long",
    "age" : 32,
    "gender" : "F",
    "address" : "288 Mill Street",
    "employer" : "Comverges",
    "email" : "[email protected]",
    "city" : "Movico",
    "state" : "MT"
  },
  "highlight" : {
    "address" : [
      "288 <em>Mill</em> Street"
    ]
  }
},
{
  "_index" : "bank",
  "_type" : "account",
  "_id" : "18",
  "_score" : 2.1248586,
  "_source" : {
    "account_number" : 18,
    "balance" : 4180,
    "firstname" : "Dale",
    "lastname" : "Adams",
    "age" : 33,
    "gender" : "M",
    "address" : "467 Hutchinson Court",
    "employer" : "Boink",
    "email" : "[email protected]",
    "city" : "Orick",
    "state" : "MD"
  },
  "highlight" : {
    "address" : [
      "467 Hutchinson <em>Court</em>"
    ]
  }
}
Notice that the matched terms are automatically wrapped in <em></em> tags.
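On the client side, the highlighted fragments sit next to _source in each hit. As a rough sketch of how they might be collected, the Python below works on trimmed copies of the two hits shown above. The helper name collect_highlights is hypothetical, and in a full response the hits would be found under hits.hits.

```python
def collect_highlights(hits, field):
    """Map each hit's _id to its highlighted fragments for `field`."""
    return {
        # A hit without a "highlight" entry for the field yields an empty list.
        hit["_id"]: hit.get("highlight", {}).get(field, [])
        for hit in hits
    }


# Trimmed copies of the two hits from the result above.
hits = [
    {"_id": "472", "highlight": {"address": ["288 <em>Mill</em> Street"]}},
    {"_id": "18", "highlight": {"address": ["467 Hutchinson <em>Court</em>"]}},
]

if __name__ == "__main__":
    for doc_id, fragments in collect_highlights(hits, "address").items():
        print(doc_id, fragments)
```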
Custom highlight tags
The syntax is:
"pre_tags": ["<tag1>"],
"post_tags": ["</tag1>"],
The query is:
GET /bank/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "Court" } }
      ]
    }
  },
  "highlight": {
    "pre_tags": ["<a>"],
    "post_tags": ["</a>"],
    "fields": {
      "address": {}
    }
  }
}
The result is:
{
  "_index" : "bank",
  "_type" : "account",
  "_id" : "472",
  "_score" : 5.4032025,
  "_source" : {
    "account_number" : 472,
    "balance" : 25571,
    "firstname" : "Lee",
    "lastname" : "Long",
    "age" : 32,
    "gender" : "F",
    "address" : "288 Mill Street",
    "employer" : "Comverges",
    "email" : "[email protected]",
    "city" : "Movico",
    "state" : "MT"
  },
  "highlight" : {
    "address" : [
      "288 <a>Mill</a> Street"
    ]
  }
},
{
  "_index" : "bank",
  "_type" : "account",
  "_id" : "18",
  "_score" : 2.1248586,
  "_source" : {
    "account_number" : 18,
    "balance" : 4180,
    "firstname" : "Dale",
    "lastname" : "Adams",
    "age" : 33,
    "gender" : "M",
    "address" : "467 Hutchinson Court",
    "employer" : "Boink",
    "email" : "[email protected]",
    "city" : "Orick",
    "state" : "MD"
  },
  "highlight" : {
    "address" : [
      "467 Hutchinson <a>Court</a>"
    ]
  }
}
The highlight tags have been replaced with the custom ones.
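When the request is assembled in application code, the tag pair can be passed in as parameters. The sketch below rebuilds the request body from this section in Python without contacting a cluster. The helper name highlight_section is hypothetical, and its defaults mirror the <em> tags seen in the earlier result.

```python
import json


def highlight_section(fields, pre_tag="<em>", post_tag="</em>"):
    """Build the "highlight" part of a search body with one tag pair."""
    return {
        "pre_tags": [pre_tag],
        "post_tags": [post_tag],
        # An empty object means "use the default options for this field".
        "fields": {name: {} for name in fields},
    }


body = {
    "query": {
        "bool": {
            "should": [
                {"match": {"address": "mill"}},
                {"match": {"address": "Court"}},
            ]
        }
    },
    "highlight": highlight_section(["address"], "<a>", "</a>"),
}

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```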