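Requirement
Group the detail records by waybillId, sort each group by eventTime in descending order, and take the latest record from each group.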
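To make the target semantics concrete before turning to Elasticsearch, here is a minimal in-memory sketch of "latest record per waybillId" in plain Java. The `Event` class and the `latestByWaybill` helper are hypothetical names used only for illustration, and eventTime is assumed to be an epoch-millisecond `long`.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

public class LatestPerWaybill {

    // Hypothetical stand-in for one detail record; only the two fields the requirement uses.
    public static final class Event {
        private final String waybillId;
        private final long eventTime; // assumed to be epoch milliseconds

        public Event(String waybillId, long eventTime) {
            this.waybillId = waybillId;
            this.eventTime = eventTime;
        }

        public String getWaybillId() { return waybillId; }

        public long getEventTime() { return eventTime; }
    }

    // Group by waybillId and keep the event with the largest eventTime in each group.
    public static Map<String, Event> latestByWaybill(List<Event> events) {
        return events.stream()
                .collect(Collectors.toMap(
                        Event::getWaybillId,
                        Function.identity(),
                        BinaryOperator.maxBy(Comparator.comparingLong(Event::getEventTime))));
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("W1", 100L),
                new Event("W1", 300L),
                new Event("W2", 200L));
        Map<String, Event> latest = latestByWaybill(events);
        System.out.println(latest.get("W1").getEventTime()); // prints 300
        System.out.println(latest.get("W2").getEventTime()); // prints 200
    }
}
```

This only works when all records fit in memory; the aggregation shown in the rest of this post pushes the same grouping and sorting down to Elasticsearch.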
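Bucket aggregation
The purpose of an Elasticsearch bucket aggregation is to group data: documents are first split into multiple groups (buckets) according to a given condition, and statistics are then computed for each group.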
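As a rough analogy, the "split into buckets, then compute a statistic per bucket" idea can be sketched with Java streams. This is not how Elasticsearch implements it; the `countByKey` helper is a hypothetical name, and the per-bucket count merely plays the role that doc_count plays in a terms aggregation.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class BucketCount {

    // Put each waybillId into its own bucket and count the members of every bucket.
    // A TreeMap is used only so that the buckets print in a stable, sorted order.
    public static Map<String, Long> countByKey(List<String> waybillIds) {
        return waybillIds.stream()
                .collect(Collectors.groupingBy(id -> id, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> buckets = countByKey(List.of("W1", "W2", "W1"));
        System.out.println(buckets); // prints {W1=2, W2=1}
    }
}
```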
1. ES query (Query DSL)
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ]
    }
  },
  "aggs": {
    "waybillIdAgg": {
      "terms": {
        "field": "waybillId",
        "size": 1000,
        "min_doc_count": 1
      },
      "aggs": {
        "top1": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "eventTime": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
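In the response, aggregations.waybillIdAgg.buckets holds one bucket per waybillId. Each bucket carries its key and doc_count, and its top1.hits.hits array contains the single document with the latest eventTime.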
2. Java Elasticsearch client code and result parsing
// Query conditions: match documents whose waybillId is any of the given values
BoolQueryBuilder queryBool = QueryBuilders.boolQuery();
BoolQueryBuilder inFilter = new BoolQueryBuilder();
waybillIds.forEach(
    waybillId -> inFilter.should(QueryBuilders.termQuery("waybillId", waybillId)));
queryBool.must(inFilter);
// Bucket aggregation: group by waybillId (returns at most 1000 buckets)
TermsAggregationBuilder termsAggregationBuilder =
    AggregationBuilders.terms("waybillIdAgg").field("waybillId").size(1000).minDocCount(1);
// Nested sub-aggregation: sort each bucket by eventTime descending and keep the first hit
TopHitsAggregationBuilder sort =
    AggregationBuilders.topHits("top1").size(1).sort("eventTime", SortOrder.DESC);
termsAggregationBuilder.subAggregation(sort);
SearchSourceBuilder searchSourceBuilder =
    SearchSourceBuilder.searchSource().query(queryBool).aggregation(termsAggregationBuilder);
// Build the search request
SearchRequest searchRequest = new SearchRequest(esIndexConfig.getIndexNameTrackingDetail());
searchRequest.source(searchSourceBuilder);
// Execute the search
SearchResponse searchResponse =
    restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
// Process the aggregation result
Aggregations aggregations = searchResponse.getAggregations();
Terms terms = aggregations.get("waybillIdAgg");
List<TrackingDetail> trackingDetails =
    terms.getBuckets().stream()
        .map(
            t -> {
              // Each bucket holds a "top1" top_hits result with at most one hit
              Aggregation top1 = t.getAggregations().get("top1");
              Optional<SearchHit> first =
                  Arrays.stream(((ParsedTopHits) top1).getHits().getHits()).findFirst();
              if (first.isPresent()) {
                Map<String, Object> trackingDetailMap = first.get().getSourceAsMap();
                return BeanUtil.fillBeanWithMap(trackingDetailMap, new TrackingDetail(), false);
              }
              return null;
            })
        // Drop buckets that produced no hit so the list contains no null entries
        .filter(Objects::nonNull)
        .collect(Collectors.toList());