Elasticsearch Aggregations and OR-Condition Queries

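When working with Elasticsearch you often need statistics and filtered queries that behave like SQL's GROUP BY calculations and WHERE conditions. ES supports both well, and they are convenient to use. The examples below are based on the Java API I have used in my own development work.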

Grouped aggregation query

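The goal is a filter-then-aggregate query equivalent to select avg(executionTime) from t WHERE executionDate between 'A' AND 'B' group by executionDate ORDER BY executionDate ASC. ES computes this by first bucketing the documents by executionDate and then running the aggregation inside each bucket; in the API this is expressed as one aggregation nested inside another.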
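The "bucket first, then aggregate inside each bucket" logic can be sketched in plain Java without any Elasticsearch dependency. This is only an analogue under stated assumptions: the LogEntry class and the averageByDate method are hypothetical names introduced for illustration, and the grouping runs in memory rather than across shards.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java analogue of a terms bucket with an avg sub-aggregation.
// No Elasticsearch classes are used; all names here are hypothetical.
final class GroupThenAverageSketch {

    // Stand-in for one indexed document.
    static final class LogEntry {
        final String executionDate;
        final long executionTime;

        LogEntry(String executionDate, long executionTime) {
            this.executionDate = executionDate;
            this.executionTime = executionTime;
        }
    }

    // Step 1: bucket by executionDate (the outer aggregation).
    // Step 2: average executionTime inside each bucket (the sub-aggregation).
    // A TreeMap keeps keys ascending, mirroring ORDER BY executionDate ASC.
    static Map<String, Double> averageByDate(List<LogEntry> entries) {
        return entries.stream().collect(Collectors.groupingBy(
                e -> e.executionDate,
                TreeMap::new,
                Collectors.averagingLong(e -> e.executionTime)));
    }

    public static void main(String[] args) {
        List<LogEntry> entries = Arrays.asList(
                new LogEntry("2019-01-25 14:14:30", 60),
                new LogEntry("2019-01-25 14:14:29", 70),
                new LogEntry("2019-01-25 14:14:29", 80),
                new LogEntry("2019-01-25 14:14:30", 40));
        // prints {2019-01-25 14:14:29=75.0, 2019-01-25 14:14:30=50.0}
        System.out.println(averageByDate(entries));
    }
}
```

Each map key plays the role of a bucket key, and each value plays the role of the sub-aggregation result attached to that bucket.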

  • HTTP request
    POST request body
{
  "from": 0,
  "size": 2,
  "query": {
    "range": {
      "executionDate": {
        "gte": "2019-01-20",
        "lte": "2019-01-25"
      }
    }
  },
  "aggregations": {
    "executionDateGroup": {
      "terms": {
        "field": "executionDate"
      },
      "aggregations": {
        "executionTimeAvg": {
          "avg": {
            "field": "executionTime"
          }
        }
      }
    }
  }
}

POST response

{
    "took": 1510,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 52,
        "max_score": 1,
        "hits": [
            //....
        ]
    },
    "aggregations": {
        "executionDateGroup": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 2,
            "buckets": [
                {
                    "key": 1548425669000,
                    "key_as_string": "2019-01-25 14:14:29",
                    "doc_count": 6,
                    "executionTimeAvg": {
                        "value": 72.33333333333333
                    }
                },
                {
                    "key": 1548425670000,
                    "key_as_string": "2019-01-25 14:14:30",
                    "doc_count": 6,
                    "executionTimeAvg": {
                        "value": 67.66666666666667
                    }
                },
                //.....
            ]
        }
    }
}
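The response lists only some of the buckets: a terms aggregation returns the top size buckets (10 by default), and sum_other_doc_count reports how many documents fell into the buckets that were left out. The bookkeeping can be sketched in plain Java as follows. This is a simplified model, assuming buckets are ranked by document count descending with ties broken by ascending key, and ignoring the per-shard approximation a real cluster applies; TopBucketsSketch and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Simplified in-memory model of "top N buckets plus sum_other_doc_count".
// Not Elasticsearch code; all names are hypothetical.
final class TopBucketsSketch {

    static final class Result {
        final List<String> topKeys;     // keys of the buckets that are returned
        final long sumOtherDocCount;    // documents in buckets that were cut off

        Result(List<String> topKeys, long sumOtherDocCount) {
            this.topKeys = topKeys;
            this.sumOtherDocCount = sumOtherDocCount;
        }
    }

    static Result topBuckets(List<String> fieldValues, int size) {
        // Count documents per distinct field value (one bucket per value).
        Map<String, Long> counts = fieldValues.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // Rank buckets: larger doc_count first, then key ascending (assumed tie-break).
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        // Keep the first `size` buckets; add up the doc counts of the rest.
        List<String> top = new ArrayList<>();
        long other = 0;
        for (int i = 0; i < sorted.size(); i++) {
            if (i < size) {
                top.add(sorted.get(i).getKey());
            } else {
                other += sorted.get(i).getValue();
            }
        }
        return new Result(top, other);
    }

    public static void main(String[] args) {
        Result r = topBuckets(Arrays.asList("a", "a", "a", "b", "b", "c", "d"), 2);
        // prints [a, b] other=2
        System.out.println(r.topKeys + " other=" + r.sumOtherDocCount);
    }
}
```

A non-zero sum_other_doc_count in a response is therefore a hint that the size of the terms aggregation was smaller than the number of distinct values.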

The Java API builds the request and reads the result in the same way as the HTTP call above, so the overall approach is unchanged.
Code example

    //1. Group by executionDate under the name executionDateGroup, i.e. group by executionDate. If size is not set, the aggregation returns only 10 buckets by default
    TermsAggregationBuilder groupTerms = AggregationBuilders.terms("executionDateGroup").field("executionDate").size(Integer.MAX_VALUE);
    // Sort order: true is ascending, false is descending; this implements ORDER BY executionDate ASC
    groupTerms.order(BucketOrder.key(true));
    
    //2. Aggregate avg(executionTime) under the name executionTimeAvg
    AvgAggregationBuilder timeAvg = AggregationBuilders.avg("executionTimeAvg").field("executionTime");
    
    //3. Nest the two aggregations as parent and child
    groupTerms.subAggregation(timeAvg);
    
    // Query conditions
    BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
    RangeQueryBuilder rangeDateQuery = QueryBuilders.rangeQuery("executionDate").gte(DateFormatUtils.format(fromDate, "yyyy-MM-dd HH:mm:ss"))
            .lte(DateFormatUtils.format(toDate, "yyyy-MM-dd HH:mm:ss"));
    queryBuilder.must(rangeDateQuery);
    // If a method name was supplied, add it as an extra condition
    if (StringUtils.isNotBlank(condition.getMethodName())) {
        queryBuilder.must(new TermQueryBuilder("methodName", condition.getMethodName()));
    }
    SearchResponse searchRes = esClient.prepareSearch(CommonConstants.EHR_EXCUTION_AOP_LOG_INDEX)
            .setTypes(CommonConstants.EHR_EXCUTION_AOP_LOG_TYPE)
            .setQuery(queryBuilder)
            .addAggregation(groupTerms)  // attach the aggregation
            .addSort("executionDate", SortOrder.ASC)
            .get();
    
    // Read the result in the same order the aggregations are nested: first the outer executionDateGroup, then the executionTimeAvg sub-aggregation of each bucket
    Aggregation executionDateGroup = searchRes.getAggregations().get("executionDateGroup");
    Terms timeAvgTerms = null;
    if (executionDateGroup instanceof Terms) {
        // outer aggregation
        timeAvgTerms = (Terms) executionDateGroup;
        List<? extends Terms.Bucket> buckets = timeAvgTerms.getBuckets();
        for (Terms.Bucket elem : buckets) {
            // sub-aggregation
            InternalAvg executionTimeAvg = (InternalAvg) elem.getAggregations().get("executionTimeAvg");
            ExecutionInfoDto executionInfoDto = new ExecutionInfoDto();
            executionInfoDto.setExecutionDate(elem.getKeyAsString());
            executionInfoDto.setExecutionTime((int)executionTimeAvg.getValue());
            executionInfoDtoList.add(executionInfoDto);
        }
    }

OR-condition query

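ES bool queries combine conditions with MUST / MUST_NOT / SHOULD. MUST and MUST_NOT are understood and used much like AND and NOT in SQL, but SHOULD is not a drop-in replacement for OR: when a bool query also contains must clauses, its should clauses become optional and only influence scoring. To require A OR B alongside other conditions, put the should clauses in their own bool query and wrap that in must, i.e. MUST(SHOULD A, SHOULD B).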
HTTP POST request body (the bool nested inside must carries the OR conditions)

{
    "query": {
        "bool": {
            "must": [
                {
                    "bool": {
                        "should": [
                            { "match": { "about": "music" }},
                            { "match": { "about": "climb" }}
                        ]
                    }
                },
                { "match": { "first_name": "John" }}
            ],
            "must_not": {
                "match": { "last_name": "Smith" }
            }
        }
    }
}
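Why the OR group has to live in its own bool can be illustrated with a small plain-Java model of the matching rules. This is a rough sketch, not Elasticsearch code: documents are plain maps, match is a crude substring check rather than analyzed full-text matching, filter clauses are not modelled, and BoolQuerySketch with its nested() and flat() helpers are hypothetical names. It assumes the default rule that at least one should clause must match only when the same bool has no must clauses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy model of bool query matching (scoring is ignored). All names are hypothetical.
final class BoolQuerySketch implements Predicate<Map<String, String>> {
    private final List<Predicate<Map<String, String>>> must = new ArrayList<>();
    private final List<Predicate<Map<String, String>>> should = new ArrayList<>();
    private final List<Predicate<Map<String, String>>> mustNot = new ArrayList<>();

    BoolQuerySketch must(Predicate<Map<String, String>> p) { must.add(p); return this; }
    BoolQuerySketch should(Predicate<Map<String, String>> p) { should.add(p); return this; }
    BoolQuerySketch mustNot(Predicate<Map<String, String>> p) { mustNot.add(p); return this; }

    @Override
    public boolean test(Map<String, String> doc) {
        // Assumed default: one should clause is required only if there is no must clause.
        int minShould = (!should.isEmpty() && must.isEmpty()) ? 1 : 0;
        long shouldHits = should.stream().filter(p -> p.test(doc)).count();
        return must.stream().allMatch(p -> p.test(doc))
                && mustNot.stream().noneMatch(p -> p.test(doc))
                && shouldHits >= minShould;
    }

    // Crude stand-in for a match query: substring check on one field.
    static Predicate<Map<String, String>> match(String field, String word) {
        return doc -> doc.getOrDefault(field, "").contains(word);
    }

    // (music OR climb) AND John AND NOT Smith: the OR group is its own bool inside must.
    static BoolQuerySketch nested() {
        return new BoolQuerySketch()
                .must(new BoolQuerySketch()
                        .should(match("about", "music"))
                        .should(match("about", "climb")))
                .must(match("first_name", "John"))
                .mustNot(match("last_name", "Smith"));
    }

    // should placed next to must in the same bool: the should clauses become optional.
    static BoolQuerySketch flat() {
        return new BoolQuerySketch()
                .should(match("about", "music"))
                .should(match("about", "climb"))
                .must(match("first_name", "John"))
                .mustNot(match("last_name", "Smith"));
    }

    static Map<String, String> doc(String first, String last, String about) {
        Map<String, String> d = new HashMap<>();
        d.put("first_name", first);
        d.put("last_name", last);
        d.put("about", about);
        return d;
    }

    public static void main(String[] args) {
        Map<String, String> chessFan = doc("John", "Doe", "I like chess");
        // prints nested=false flat=true
        System.out.println("nested=" + nested().test(chessFan) + " flat=" + flat().test(chessFan));
    }
}
```

In the flat form a John who likes neither music nor climbing still matches, which is exactly the behaviour the nested form avoids.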

Java code example

    BoolQueryBuilder pinyinQuery = QueryBuilders.boolQuery();
    if (!keyword.contains("-")) {
        pinyinQuery.should(QueryBuilders.matchQuery("userName.pinyin", pinyin));
    }
    pinyinQuery.should(QueryBuilders.termQuery("userId", keyword));
          
    BoolQueryBuilder queryEs = QueryBuilders.boolQuery()
            .must(pinyinQuery)
            .must(new TermQueryBuilder("officeStatus", "1"));
    // Build the highlighter
    HighlightBuilder hiBuilder = new HighlightBuilder();
    hiBuilder.preTags("<h2>").postTags("</h2>").field("userName.pinyin");
    SearchResponse scrollRes =
            client.prepareSearch(ES_INDEX_NAME)
                    .setTypes(ES_INDEX_USER_TYPE)
                    .setQuery(queryEs)
                    .highlighter(hiBuilder)
                    .setScroll("10s")
                    .setSize(1000)
                    .get();