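Continuing from the previous post:
[ES] Joins in ES, approach 1 (Nested type, Java implementation based on version 6.3): https://blog.csdn.net/lsr40/article/details/102398379
The previous post noted that ES offers two ways to implement a join and covered the Nested type. This post covers the second: a join field that links separate documents, with parent and child relations declared on it to model a parent-child relationship.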
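Wait, why do we need a parent-child relationship at all?
A business scenario makes this easier to see.
I have two tables:
Table A: article content
Table B: basic information about the users who publish articles
Requirement: find which users have mentioned "NBA" (the NBA has been very "hot" lately...), return those users' basic information, and rank them by how closely each one is tied to the NBA.
With a database this simply means: run like '%NBA%' over all articles, deduplicate the user IDs of the matching articles, then join the user basic information table to get the data we want (for scoring, we can simply count how many of a user's articles mention NBA).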
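A quick SQL sketch:
-- The larger cnt is, the higher the rank, because the user has more articles mentioning NBA
select b.* from
  (select
     user_id, count(1) as cnt
   from article_content
   where content like '%NBA%'
   group by user_id) a
inner join
  user_base_info b
on a.user_id = b.user_id
order by a.cnt desc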
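Note that what comes back is the users' basic information, i.e. table B.
I don't need to know which articles a user published or what they say (later steps might need that, but this requirement does not).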
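To make the counting logic of the SQL concrete, here is a minimal in-memory sketch in Java. The `Article` class, the `rankUsers` method, and the sample rows are all hypothetical stand-ins for table A; this only illustrates the "filter, count per user, sort descending" idea and is not how a database or ES executes it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NbaUserRanking {
    /** Hypothetical stand-in for one row of table A (article content). */
    static final class Article {
        final int userId;
        final String content;

        Article(int userId, String content) {
            this.userId = userId;
            this.content = content;
        }
    }

    /** Returns user IDs ordered by how many of their articles contain the keyword, most first. */
    static List<Integer> rankUsers(List<Article> articles, String keyword) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (Article a : articles) {
            if (a.content.contains(keyword)) {           // like '%keyword%'
                counts.merge(a.userId, 1, Integer::sum); // group by user_id, count(1)
            }
        }
        List<Integer> userIds = new ArrayList<>(counts.keySet());
        // order by cnt desc
        userIds.sort(Comparator.comparing((Integer id) -> counts.get(id)).reversed());
        return userIds;
    }

    public static void main(String[] args) {
        List<Article> articles = Arrays.asList(
            new Article(1, "NBA finals tonight"),
            new Article(2, "NBA trade rumors"),
            new Article(2, "more NBA news"),
            new Article(3, "a cooking recipe"));
        System.out.println(rankUsers(articles, "NBA")); // prints [2, 1]
    }
}
```

The returned IDs would then be used to look up the matching rows of table B, which is the part the inner join handles in the SQL.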
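Translated to ES:
The parent document is the user's basic information and the child documents are the articles that user published; we query the children and return the parents (parent to child is one-to-many).
The example below uses this hierarchy: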
user_base
|
|
article
|
|
vote
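This kind of requirement can be handled with has_child:
Note: a parent and its children must be indexed with the same routing value so that they land on the same shard; otherwise the documents may not be found, or the results may not be unique.
So the usual approach is to route by the join key (routing=<join key id>), which places related documents on the same shard.
About routing: https://www.cnblogs.com/tgzhu/p/9167589.html (author: 天戈朱)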
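The sketch below illustrates why sharing a routing value keeps a family of documents together. It assumes the commonly documented formula shard = hash(routing) % number_of_primary_shards. Elasticsearch uses a Murmur3 hash for this; `String.hashCode()` is swapped in here purely to keep the example dependency-free, so the shard numbers it prints will not match a real cluster. `shardFor` is a hypothetical helper name.

```java
public class RoutingSketch {
    /**
     * Simplified stand-in for shard selection: hash(routing) mod number of primary shards.
     * ASSUMPTION: String.hashCode() replaces the Murmur3 hash that Elasticsearch really uses.
     */
    static int shardFor(String routing, int numberOfPrimaryShards) {
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        int shards = 5;
        // User 1, its article (id 3) and the vote on that article (id 5) are all
        // indexed with routing=1, so the same input is hashed every time.
        System.out.println("user 1    -> shard " + shardFor("1", shards));
        System.out.println("article 3 -> shard " + shardFor("1", shards));
        System.out.println("vote 5    -> shard " + shardFor("1", shards));
        // If the article were routed by its own id instead, it could land elsewhere.
        System.out.println("article 3 routed by its own id -> shard " + shardFor("3", shards));
    }
}
```

Because the shard depends only on the routing value, using the root parent's ID as the routing for every descendant is what keeps all levels of the hierarchy on one shard.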
## Create a 3-level index
PUT three_tree_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "user_name": {
          "type": "text"
        },
        "age": {
          "type": "keyword"
        },
        "my_join_field": {
          "type": "join",
          "relations": {
            "user_base": "article",
            "article": "vote"
          }
        },
        "stars": {
          "type": "short"
        },
        "article_desc": {
          "type": "text"
        }
      }
    }
  }
}
## Insert 6 documents
PUT three_tree_index/_doc/1?routing=1&refresh
{
  "user_name": "xiaoming",
  "age": 29,
  "my_join_field": "user_base"
}
PUT three_tree_index/_doc/2?routing=2&refresh
{
  "user_name": "xiaohong",
  "age": 32,
  "my_join_field": "user_base"
}
PUT three_tree_index/_doc/3?routing=1&refresh
{
  "article_desc": "xiaoming,article_desc_1",
  "my_join_field": {
    "name": "article",
    "parent": "1"
  }
}
PUT three_tree_index/_doc/4?routing=2&refresh
{
  "article_desc": "xiaohong,article_desc_1",
  "my_join_field": {
    "name": "article",
    "parent": "2"
  }
}
PUT three_tree_index/_doc/5?routing=1&refresh
{
  "stars": 5,
  "my_join_field": {
    "name": "vote",
    "parent": "3"
  }
}
PUT three_tree_index/_doc/6?routing=2&refresh
{
  "stars": 3,
  "my_join_field": {
    "name": "vote",
    "parent": "4"
  }
}
1. Find the users whose articles contain "xiaoming"
## Find the users whose articles contain "xiaoming"
GET three_tree_index/_search
{
  "query": {
    "has_child": {
      "type": "article",
      "query": {
        "match": {
          "article_desc": "xiaoming"
        }
      }
    }
  }
}
## Adding "inner_hits": {} returns the matching child documents together with each parent
GET three_tree_index/_search
{
  "query": {
    "has_child": {
      "type": "article",
      "query": {
        "match": {
          "article_desc": "xiaoming"
        }
      },
      "inner_hits": {}
    }
  }
}
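For readers who want to see what has_child selects, here is a small in-memory simulation over the six sample documents. It is only a sketch of the semantics: the `Doc` class and `hasChild` method are hypothetical, and a plain substring check stands in for the analyzed `match` query.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class HasChildSketch {
    /** Hypothetical flattened view of a document in three_tree_index. */
    static final class Doc {
        final String id;
        final String relation;    // value of my_join_field.name
        final String parentId;    // value of my_join_field.parent, null for user_base
        final String articleDesc; // null for non-article documents

        Doc(String id, String relation, String parentId, String articleDesc) {
            this.id = id;
            this.relation = relation;
            this.parentId = parentId;
            this.articleDesc = articleDesc;
        }
    }

    /** The six documents inserted above (fields not needed here are omitted). */
    static List<Doc> sampleDocs() {
        return Arrays.asList(
            new Doc("1", "user_base", null, null),
            new Doc("2", "user_base", null, null),
            new Doc("3", "article", "1", "xiaoming,article_desc_1"),
            new Doc("4", "article", "2", "xiaohong,article_desc_1"),
            new Doc("5", "vote", "3", null),
            new Doc("6", "vote", "4", null));
    }

    /** Returns the IDs of parents that have at least one child of childType containing term. */
    static Set<String> hasChild(List<Doc> docs, String childType, String term) {
        Set<String> parentIds = new LinkedHashSet<>();
        for (Doc d : docs) {
            boolean matches = childType.equals(d.relation)
                && d.articleDesc != null
                && d.articleDesc.contains(term); // substring check stands in for "match"
            if (matches) {
                parentIds.add(d.parentId); // the parent is returned, not the child
            }
        }
        return parentIds;
    }

    public static void main(String[] args) {
        System.out.println(hasChild(sampleDocs(), "article", "xiaoming")); // prints [1]
    }
}
```

The query condition is evaluated against the children, but the hits are the parents; inner_hits is what additionally surfaces the children that caused each parent to match.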
## Find the users whose articles have a five-star vote (two nested has_child queries)
GET three_tree_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "has_child": {
            "type": "article",
            "query": {
              "has_child": {
                "type": "vote",
                "query": {
                  "bool": {
                    "should": [
                      {
                        "term": {
                          "stars": 5
                        }
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
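2. has_parent does exactly the opposite: the condition is on the parent and the children are returned. For example, to get the average star rating of all articles that contain "xiaoming":
GET three_tree_index/_search
{
  "query": {
    "has_parent": {
      "parent_type": "article",
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "article_desc": "xiaoming"
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "avg_star": {
      "avg": {
        "field": "stars"
      }
    }
  }
}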
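The following in-memory sketch mirrors what the has_parent query plus the avg aggregation computes on the sample data: select the vote documents whose parent article matches, then average their stars. The `Doc` class and `avgStars` method are hypothetical, and a substring check again stands in for the analyzed `match` query.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.OptionalDouble;
import java.util.Set;

public class HasParentAvgSketch {
    /** Hypothetical flattened view of the article and vote documents. */
    static final class Doc {
        final String id;
        final String relation;
        final String parentId;
        final String articleDesc; // only set on articles
        final Integer stars;      // only set on votes

        Doc(String id, String relation, String parentId, String articleDesc, Integer stars) {
            this.id = id;
            this.relation = relation;
            this.parentId = parentId;
            this.articleDesc = articleDesc;
            this.stars = stars;
        }
    }

    /** The article and vote documents inserted earlier. */
    static List<Doc> sampleDocs() {
        return Arrays.asList(
            new Doc("3", "article", "1", "xiaoming,article_desc_1", null),
            new Doc("4", "article", "2", "xiaohong,article_desc_1", null),
            new Doc("5", "vote", "3", null, 5),
            new Doc("6", "vote", "4", null, 3));
    }

    /** Averages the stars of votes whose parent article contains term; empty if none match. */
    static OptionalDouble avgStars(List<Doc> docs, String term) {
        // Step 1 (has_parent condition): collect the IDs of matching articles.
        Set<String> matchingArticleIds = new HashSet<>();
        for (Doc d : docs) {
            if ("article".equals(d.relation) && d.articleDesc.contains(term)) {
                matchingArticleIds.add(d.id);
            }
        }
        // Step 2 (hits + avg aggregation): keep their vote children and average the stars.
        return docs.stream()
            .filter(d -> "vote".equals(d.relation) && matchingArticleIds.contains(d.parentId))
            .mapToInt(d -> d.stars)
            .average();
    }

    public static void main(String[] args) {
        System.out.println(avgStars(sampleDocs(), "xiaoming").getAsDouble()); // prints 5.0
    }
}
```

With the sample data only article 3 matches, and its single vote has 5 stars, so the average is 5.0.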
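3. Java API:
See the official docs: https://www.elastic.co/guide/en/elasticsearch/client/java-api/6.3/java-joining-queries.html
// Imports needed by the two methods below (ES 6.3 transport client)
import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.join.query.HasChildQueryBuilder;
import org.elasticsearch.search.sort.SortOrder;

/**
 * Only part of the Java code is shown here, as a starting point; leave a comment if you get stuck.
 * The Java API itself is fairly easy to write; the hard part is finding material on it.
 * printSearchResponse is a helper that is not shown in this post.
 */
public static void testChildQuery(TransportClient client) {
    // has_child on "article": returns the parent users, scored by the average child score
    HasChildQueryBuilder hasChildQueryBuilder =
        new HasChildQueryBuilder("article", QueryBuilders.matchQuery("article_desc", "xiaoming"), ScoreMode.Avg);
    SearchResponse searchResponse = client.prepareSearch("three_tree_index")
        .setTypes("_doc")
        .addSort("_score", SortOrder.DESC)
        .setQuery(hasChildQueryBuilder)
        .setFrom(0).setSize(50).execute().actionGet();
    printSearchResponse(searchResponse);
}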
public static void testMultiChildQuery(TransportClient client) {
    // Inner has_child: articles that have a vote with stars = 5
    HasChildQueryBuilder hasSecondChildQueryBuilder =
        new HasChildQueryBuilder("vote", QueryBuilders.boolQuery()
            .should(QueryBuilders.termQuery("stars", 5)), ScoreMode.None);
    // Outer has_child: users that have such an article
    HasChildQueryBuilder hasFirstChildQueryBuilder =
        new HasChildQueryBuilder("article", hasSecondChildQueryBuilder, ScoreMode.None);
    SearchResponse searchResponse = client.prepareSearch("three_tree_index")
        .setTypes("_doc")
        .addSort("_score", SortOrder.DESC)
        .setQuery(hasFirstChildQueryBuilder)
        .setFrom(0).setSize(50).execute().actionGet();
    printSearchResponse(searchResponse);
}
4. The parent-child relationship can be more complex than this, for example:
Official docs: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/parent-join.html
   question
    /    \
   /      \
comment  answer
           |
           |
          vote
## Define it like this; this is the example from the official docs
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
            "question": ["answer", "comment"],
            "answer": "vote"
          }
        }
      }
    }
  }
}
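One way to read a relations definition like the one above is as a tree in which a relation may have several children but only one parent. The sketch below inverts the parent-to-children map and walks from a relation up to the root. The class and method names are hypothetical, and the check is only an illustration of the rule, not how Elasticsearch validates a mapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JoinRelationsSketch {
    /** Inverts a parent -> children map into child -> parent, rejecting a second parent. */
    static Map<String, String> parentOf(Map<String, List<String>> relations) {
        Map<String, String> parents = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : relations.entrySet()) {
            for (String child : entry.getValue()) {
                String previous = parents.put(child, entry.getKey());
                if (previous != null) {
                    throw new IllegalArgumentException(child + " has more than one parent");
                }
            }
        }
        return parents;
    }

    /** Walks from a relation name up to the root of the hierarchy. */
    static List<String> pathToRoot(String name, Map<String, String> parents) {
        List<String> path = new ArrayList<>();
        for (String current = name; current != null; current = parents.get(current)) {
            path.add(current);
        }
        return path;
    }

    /** The relations from the mapping above: question -> [answer, comment], answer -> vote. */
    static Map<String, List<String>> officialExample() {
        Map<String, List<String>> relations = new LinkedHashMap<>();
        relations.put("question", Arrays.asList("answer", "comment"));
        relations.put("answer", Collections.singletonList("vote"));
        return relations;
    }

    public static void main(String[] args) {
        Map<String, String> parents = parentOf(officialExample());
        System.out.println(pathToRoot("vote", parents));    // prints [vote, answer, question]
        System.out.println(pathToRoot("comment", parents)); // prints [comment, question]
    }
}
```

The root of each path (question here) is the document whose ID would typically serve as the routing value for everything beneath it.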
5. Differences between the two kinds of join (the comparison figure is borrowed from another post):
Source: https://blog.csdn.net/laoyang360/article/details/82950393