Every field in a document is indexed and can be queried.
Searches fall into the following types:
- A structured query on concrete fields like gender or age, sorted by a field like join_date, similar to the type of query that you could construct in SQL (a filter; filter results can be cached)
- A full-text query, which finds all documents matching the search keywords, and returns them sorted by relevance (a query; full-text search that returns the documents relevant to the search terms)
- A combination of the two
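The three kinds of search listed above can be illustrated with a toy in-memory sketch in Python. This is not how Elasticsearch executes a search; the sample documents, the helper names (structured_filter, full_text_query, combined) and the keyword-count relevance score are all assumptions made up for illustration.

```python
from datetime import date

# Hypothetical sample documents; field names follow the notes above.
DOCS = [
    {"id": 1, "gender": "female", "age": 31, "join_date": date(2014, 5, 1),
     "about": "I love rock climbing and rock music"},
    {"id": 2, "gender": "male", "age": 25, "join_date": date(2013, 2, 9),
     "about": "I like to collect rock albums"},
    {"id": 3, "gender": "female", "age": 42, "join_date": date(2012, 7, 20),
     "about": "I enjoy gardening"},
]

def structured_filter(docs, gender, min_age):
    """Structured search: a document either matches or it does not."""
    hits = [d for d in docs if d["gender"] == gender and d["age"] >= min_age]
    return sorted(hits, key=lambda d: d["join_date"])

def full_text_query(docs, keyword):
    """Full-text search: rank by how often the keyword occurs (toy relevance)."""
    scored = []
    for d in docs:
        score = d["about"].lower().split().count(keyword.lower())
        if score > 0:
            scored.append((score, d))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]

def combined(docs, gender, min_age, keyword):
    """Combination: filter first, then rank the survivors by relevance."""
    return full_text_query(structured_filter(docs, gender, min_age), keyword)

filtered_ids = [d["id"] for d in structured_filter(DOCS, "female", 30)]
ranked_ids = [d["id"] for d in full_text_query(DOCS, "rock")]
combined_ids = [d["id"] for d in combined(DOCS, "female", 30, "rock")]
```

The filter gives a yes/no answer per document and is ordered by a field value, while the full-text query orders documents by a score. That difference is why the two are treated as separate kinds of search.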
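1. Mapping: similar to a table schema in a relational database. Although ES is schema-less storage, the mapping settings affect search results.
2. Analysis: how documents are tokenized and indexed.
3. Query DSL: the language used for queries.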
GET /_search
Empty search. A search in ES can be expressed in the URL or with the Query DSL; the Query DSL is recommended.
multi-search
Pagination
from and size:
{
"query":{
"match_all":{
}
},
"from":6,
"size":9
}
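Distributed pagination: when there is a very large amount of data, from and size are not recommended for fetching the last few pages. Take 100 pages of 10 results each. To return page 91, all the results that precede page 91 have to be fetched and sorted first, and only then can page 91 be extracted; in a distributed index every shard has to do this before the results are merged. Sorting massive amounts of data seriously hurts pagination performance. We will look at pagination over distributed storage in more depth later.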
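The cost of deep paging can be made concrete with a small simulation. This is only a sketch under stated assumptions: the five shards, the integer scores and the helper names (shard_top, paged_search) are hypothetical, and the real coordination logic in Elasticsearch is more involved. It counts how many results have to be collected from the shards to serve one page.

```python
import heapq

def shard_top(shard_scores, n):
    """Each shard returns its own best n scores, highest first."""
    return heapq.nlargest(n, shard_scores)

def paged_search(shards, from_, size):
    """Coordinating step: merge the per-shard results and keep one page."""
    per_shard = [shard_top(s, from_ + size) for s in shards]
    fetched = sum(len(r) for r in per_shard)  # results collected and sorted
    merged = heapq.nlargest(from_ + size, (x for r in per_shard for x in r))
    return merged[from_:from_ + size], fetched

# Five hypothetical shards holding 1,000 scored documents each.
shards = [[i * 5 + s for i in range(1000)] for s in range(5)]

page_1, cost_1 = paged_search(shards, from_=0, size=10)
page_91, cost_91 = paged_search(shards, from_=900, size=10)
```

In this toy setup page 1 needs 50 results from the shards, while page 91 needs 4,550, even though both pages return only 10 documents.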
mapping and analysis
GET /index/_mapping/type
exact values versus full text
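In ES, data falls into two kinds: exact values and full text. Exact values are easy to query. The decision is binary; a value either matches the query, or it doesn’t.
Full text is harder to query. The semantics of natural language are complex and hard for a program to judge, so the question becomes how relevant each result is to the query.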
inverted index
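An inverted index is the counterpart of a regular (forward) index. A forward index exists to locate data quickly. An inverted index tokenizes the text and maps each token to the documents that contain it.
analysis and analyzers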
Analysis is a process that consists of the following:
- First, tokenizing a block of text into individual terms suitable for use in an inverted index,
- Then normalizing these terms into a standard form to improve their “searchability,” or recall
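The two analysis steps, and the inverted index they feed, can be sketched in a few lines of Python. This is a minimal illustration, not the behavior of any built-in Elasticsearch analyzer: the regex tokenizer, the lowercase-only normalization, the sample documents and the function names (analyze, build_inverted_index, search) are all assumptions.

```python
import re
from collections import defaultdict

def analyze(text):
    """Tokenize into word tokens, then normalize by lowercasing."""
    tokens = re.findall(r"\w+", text)
    return [t.lower() for t in tokens]

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, text):
    """Analyze the query the same way, then look up each term."""
    result = set()
    for term in analyze(text):
        result.update(index.get(term, []))
    return sorted(result)

# Hypothetical sample documents.
docs = {
    1: "The quick brown fox",
    2: "Quick brown foxes leap",
    3: "The lazy dog",
}
index = build_inverted_index(docs)
quick_hits = search(index, "QUICK")
```

Because the query text goes through the same analyze step as the documents, "QUICK" finds documents 1 and 2. This sketch does no stemming, so "fox" and "foxes" remain separate terms; a fuller normalization step would be needed to improve recall there.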
How to test an analyzer (standard is the analyzer name; the request body is the text to analyze):
GET /_analyze?analyzer=standard
text to analyze
mapping
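In order to treat date fields as dates, numeric fields as numbers, and string fields as full-text or exact-value strings, Elasticsearch needs to know what type of data each field contains. This information is contained in the mapping.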
{
"mappings":{
"index":{
"properties":{
"abc":{
"type":"string",
"index":"analyzed",
"analyzer":"ik"
}
}
}
}
}
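Why the mapping affects search results can be shown with a toy model of the index setting on a string field. The values "analyzed", "not_analyzed" and "no" are the settings used with the string type in the mapping syntax shown above; everything else here is an assumption for illustration: index_terms is a hypothetical helper, and the whitespace-and-lowercase step is only a stand-in for a real analyzer such as ik.

```python
def index_terms(field_mapping, value):
    """Return the terms a string value would contribute to the index (toy model)."""
    mode = field_mapping.get("index", "analyzed")
    if mode == "analyzed":
        # Stand-in analyzer: split on whitespace and lowercase.
        return [t.lower() for t in value.split()]
    if mode == "not_analyzed":
        # The whole value is kept as one exact term.
        return [value]
    if mode == "no":
        # The field is not indexed, so it cannot be searched.
        return []
    raise ValueError(f"unknown index setting: {mode!r}")

full_text = {"type": "string", "index": "analyzed"}
exact = {"type": "string", "index": "not_analyzed"}

value = "New York"
analyzed_terms = index_terms(full_text, value)
exact_terms = index_terms(exact, value)
```

With the analyzed mapping a search for the term "york" can match this value; with not_analyzed only the exact string "New York" can.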
complex core field types
null values, arrays, and objects
type of the new field.