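Elasticsearch's aggregation support is remarkably powerful.
Aggregations fall into three categories: Metric Aggregations, Bucket Aggregations, and Pipeline Aggregations. Each is summarized below.
Everything here comes from the official documentation: if you prefer the original wording, follow the link below; if you would rather not read the English reference, use this summary.
Official documentation (authoritative): https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-metrics-avg-aggregation.html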
#########################################
1. Metric Aggregations
1> Avg Aggregation # computes the average value of a numeric field
{ "aggs" : { "avg_grade" : { "avg" : { "field" : "grade" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"avg_grade": {
"avg": {
"field": "grade"
}
}
}
}
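Parameter: search_type=count returns only the aggregation results, with no search hits. (In the 2.x line this parameter is deprecated; sending "size": 0 in the request body has the same effect.)
2> Cardinality Aggregation # computes an approximate count of the distinct values in a field, similar to COUNT(DISTINCT ...) in SQL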
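Every example on this page has the same request shape, so it can be sketched as a small helper. The following is a minimal Python sketch, not part of any official client: the function name build_agg_request is made up for illustration, and it puts "size": 0 in the body as an assumed stand-in for search_type=count.

```python
import json


def build_agg_request(name, agg_type, field, **options):
    """Build a single-aggregation search body like the ones on this page.

    Hypothetical helper for illustration; it only assembles a dict and
    does not talk to Elasticsearch.
    """
    agg_body = {"field": field}
    agg_body.update(options)  # e.g. values=[15, 30] for percentile_ranks
    return {
        "size": 0,  # assumption: suppress hits instead of search_type=count
        "query": {"match_all": {}},
        "aggs": {name: {agg_type: agg_body}},
    }


body = build_agg_request("avg_grade", "avg", "grade")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a GET index/type/_search request.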
{ "aggs" : { "author_count" : { "cardinality" : { "field" : "author" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"author_count": {
"cardinality": {
"field": "author"
}
}
}
}
3> Extended Stats Aggregation # extra statistics for a field: count, min, max, avg, and sum, plus sum of squares, variance, and standard deviation.
{ "aggs" : { "grades_stats" : { "extended_stats" : { "field" : "grade" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"grades_stats": {
"extended_stats": {
"field": "grade"
}
}
}
}
Response:
{ ... "aggregations": { "grades_stats": { "count": 9, "min": 72, "max": 99, "avg": 86, "sum": 774, "sum_of_squares": 67028, "variance": 51.55555555555556, "std_deviation": 7.180219742846005, "std_deviation_bounds": { "upper": 100.36043948569201, "lower": 71.63956051430799 } } } }
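The derived fields in this response can be recomputed from count, sum, and sum_of_squares. The sketch below is a plain-Python check using the numbers from the sample response; it assumes the variance is the population variance and that the bounds sit 2 standard deviations from the mean, which is what the sample numbers are consistent with.

```python
import math

# Values copied from the sample response above.
count = 9
total = 774
sum_of_squares = 67028

avg = total / count
# Population variance: mean of the squares minus the square of the mean.
variance = sum_of_squares / count - avg ** 2
std_deviation = math.sqrt(variance)

sigma = 2  # assumed width of the bounds, in standard deviations
upper = avg + sigma * std_deviation
lower = avg - sigma * std_deviation

print(avg, variance, std_deviation, upper, lower)
```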
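4> Geo Bounds Aggregation # computes the bounding box that contains every geo_point value of a field. For example, if Chaoyang District has many restaurants, this draws one rectangle around all of them so you can see the area they cover.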
{ "query" : { "match" : { "business_type" : "shop" } }, "aggs" : { "viewport" : { "geo_bounds" : { "field" : "location", "wrap_longitude" : true } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"viewport": {
"geo_bounds": {
"field": "location",
"wrap_longitude": true
}
}
}
}
Response:
{ ... "aggregations": { "viewport": { "bounds": { "top_left": { "lat": 80.45, "lon": -160.22 }, "bottom_right": { "lat": 40.65, "lon": 42.57 } } } } }
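Note: the response gives the top-left and bottom-right corners of the rectangle. The location of every matching document falls inside this area.
5> Geo Centroid Aggregation # computes the approximate center point of the coordinates of all matching documents. For example, if an area has many burglaries, this shows roughly where (which street) they are centered.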
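What the bounding box means can be sketched in a few lines of Python. This is a simplified illustration, not how Elasticsearch implements it: the function name and the sample coordinates are made up, and it ignores boxes that cross the international date line, which is the case wrap_longitude is meant for.

```python
def geo_bounds(points):
    """Return the bounding box of a list of (lat, lon) pairs.

    Simplified: assumes the box does not cross the date line.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return {
        "top_left": {"lat": max(lats), "lon": min(lons)},
        "bottom_right": {"lat": min(lats), "lon": max(lons)},
    }


# Made-up restaurant coordinates (lat, lon), for illustration only.
restaurants = [(39.92, 116.44), (39.95, 116.48), (39.90, 116.52)]
box = geo_bounds(restaurants)
print(box)
```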
{ "query" : { "match" : { "crime" : "burglary" } }, "aggs" : { "centroid" : { "geo_centroid" : { "field" : "location" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"centroid": {
"geo_centroid": {
"field": "location"
}
}
}
}
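As a rough mental model, the centroid can be pictured as the average position of the points. The sketch below takes a plain arithmetic mean of latitudes and longitudes; that is an assumption made for illustration, and the actual aggregation may weight or compute the center differently. The function name and coordinates are hypothetical.

```python
def geo_centroid(points):
    """Average the latitudes and longitudes of (lat, lon) pairs.

    Simplified model only; not Elasticsearch's implementation.
    """
    n = len(points)
    lat = sum(p[0] for p in points) / n
    lon = sum(p[1] for p in points) / n
    return {"lat": lat, "lon": lon}


# Made-up burglary locations (lat, lon), for illustration only.
burglaries = [(10.0, 20.0), (20.0, 40.0), (30.0, 60.0)]
center = geo_centroid(burglaries)
print(center)
```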
6> Max Aggregation # returns the maximum value
{ "aggs" : { "max_price" : { "max" : { "field" : "price" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"max_price": {
"max": {
"field": "price"
}
}
}
}
7> Min Aggregation # returns the minimum value
{ "aggs" : { "min_price" : { "min" : { "field" : "price" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"min_price": {
"min": {
"field": "price"
}
}
}
}
8> Percentiles Aggregation # percentile statistics. For example, it shows how load times are distributed across all pages of your site.
{ "aggs" : { "load_time_outlier" : { "percentiles" : { "field" : "load_time" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"load_time_outlier": {
"percentiles": {
"field": "load_time"
}
}
}
}
Response: 75% of the site's pages finish loading within about 29 ms, and 5% of pages take longer than 60 ms.
{ ... "aggregations": { "load_time_outlier": { "values" : { "1.0": 15, "5.0": 20, "25.0": 23, "50.0": 25, "75.0": 29, "95.0": 60, "99.0": 150 } } } }
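To make the meaning of a percentile concrete, here is a minimal Python sketch using the exact nearest-rank method. It is a swapped-in technique for illustration: Elasticsearch computes percentiles approximately, so its numbers can differ from an exact calculation. The function name and the load times are made up.

```python
import math


def percentile(values, p):
    """Exact nearest-rank percentile: the smallest value such that at
    least p percent of the values are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up page load times in milliseconds, for illustration only.
load_times = [12, 18, 20, 22, 25, 27, 29, 35, 60, 150]
for p in (50, 75, 95):
    print(p, percentile(load_times, p))
```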
9> Percentile Ranks Aggregation # shows roughly what percentage of pages finish loading within 15 ms and within 30 ms.
{ "aggs" : { "load_time_outlier" : { "percentile_ranks" : { "field" : "load_time", "values" : [15, 30] } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"load_time_outlier": {
"percentile_ranks": {
"field": "load_time",
"values": [
15,
30
]
}
}
}
}
Response: about 92% of pages have loaded by 15 ms, and essentially all of them by 30 ms.
{ ... "aggregations": { "load_time_outlier": { "values" : { "15": 92, "30": 100 } } } }
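A percentile rank is the reverse question: given a threshold, what share of the values falls at or below it? The sketch below computes that exactly in Python; the real aggregation is approximate, and the function name and load times here are made up for illustration.

```python
def percentile_rank(values, threshold):
    """Percentage of values that are less than or equal to threshold."""
    within = sum(1 for v in values if v <= threshold)
    return 100.0 * within / len(values)


# Made-up page load times in milliseconds, for illustration only.
load_times = [12, 18, 20, 22, 25, 27, 29, 35, 60, 150]
rank_15 = percentile_rank(load_times, 15)
rank_30 = percentile_rank(load_times, 30)
print(rank_15, rank_30)
```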
10> Stats Aggregation # returns count, min, max, sum, and avg in a single call
{ "aggs" : { "grades_stats" : { "stats" : { "field" : "grade" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"grades_stats": {
"stats": {
"field": "grade"
}
}
}
}
11> Sum Aggregation # returns the sum of a field
{ "aggs" : { "intraday_return" : { "sum" : { "field" : "change" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"intraday_return": {
"sum": {
"field": "change"
}
}
}
}
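12> Top hits Aggregation # a commonly used aggregation. It returns the top n documents of each group, like taking the first n rows of each group after a GROUP BY in SQL.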
{ "aggs": { "top-tags": { "terms": { "field": "tags", "size": 3 }, "aggs": { "top_tag_hits": { "top_hits": { "sort": [ { "last_activity_date": { "order": "desc" } } ], "_source": { "include": [ "title" ] }, "size" : 1 } } } } } }
Example: fetch 100 groups and keep only the first document of each. For brevity this omits sort and _source; try them yourself.
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"all_interests": {
"terms": {
"field": "zxw_id",
"size": 100
},
"aggs": {
"top_tag_hits": {
"top_hits": {
"size": 1
}
}
}
}
}
}
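The "group, then keep the top n per group" idea can be sketched in plain Python. This is an illustration of the semantics only, under stated assumptions: the function name and the sample documents are made up, and the sketch does not limit or order the groups the way the terms aggregation's size does.

```python
from collections import defaultdict


def top_hits_per_group(docs, group_field, sort_field, size=1):
    """Group docs by group_field and keep the `size` docs with the
    highest sort_field value in each group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc)
    return {
        key: sorted(members, key=lambda d: d[sort_field], reverse=True)[:size]
        for key, members in groups.items()
    }


# Made-up documents, for illustration only.
docs = [
    {"zxw_id": "a", "title": "first", "last_activity_date": "2016-01-01"},
    {"zxw_id": "a", "title": "second", "last_activity_date": "2016-03-01"},
    {"zxw_id": "b", "title": "third", "last_activity_date": "2016-02-01"},
]
top = top_hits_per_group(docs, "zxw_id", "last_activity_date", size=1)
print(top)
```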
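13> Value Count Aggregation # counts how many values of the field exist across the matching documents. Duplicates are counted too; for the number of distinct values, use the Cardinality Aggregation.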
{ "aggs" : { "grades_count" : { "value_count" : { "field" : "grade" } } } }
Example:
GET index/type/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"grades_count": {
"value_count": {
"field": "grade"
}
}
}
}
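value_count and cardinality are easy to mix up, so here is a tiny Python comparison on made-up grades. It shows the exact counts; keep in mind that the real cardinality aggregation returns an approximate distinct count.

```python
# Made-up grade values, for illustration only.
grades = [90, 85, 90, 72, 85, 90]

value_count = len(grades)       # every value counts, duplicates included
cardinality = len(set(grades))  # only distinct values count

print(value_count, cardinality)
```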
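2. Bucket Aggregations: this is the second category, and the most widely used and most practical one. The rest would only be copied from the documentation, so please read it yourself. For questions or discussion, send an email to [email protected].
Website: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-bucket-children-aggregation.html
3. Pipeline Aggregations # the third category of aggregation.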