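Elasticsearch 7.2 Parent-Child Documents

Syntax for Creating Parent-Child Documents

First, let's look at how to create parent-child documents. The syntax clearly differs from the "_parent" approach found in many older articles online, which shows that later Elasticsearch versions changed it.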

PUT my_index
{
  "mappings": {
    "properties": {
      "my_join_field": { 
        "type": "join",
        "relations": {
          "question": "answer" 
        }
      }
    }
  }
}

 

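This creates an index named my_index in which my_join_field is a field of type join. Its relations setting declares question as the parent and answer as the child.
To define one parent with multiple children, use an array instead: "question": ["answer", "comment"]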
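
The mapping body can also be generated programmatically. The following is a minimal Python sketch; the helper name join_mapping is hypothetical and not part of any Elasticsearch client, and the sketch only builds the JSON body without sending it.

```python
import json


def join_mapping(field, relations):
    """Build the body of a create-index request with a single join field.

    `relations` maps each parent name to one child name or to a list of
    child names. (Hypothetical helper, for illustration only.)
    """
    return {
        "mappings": {
            "properties": {
                field: {"type": "join", "relations": relations}
            }
        }
    }


# One parent (question) with two kinds of children (answer, comment).
body = join_mapping("my_join_field", {"question": ["answer", "comment"]})
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of PUT my_index.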

Inserting Data

A parent document is indexed with the following syntax:

PUT my_index/_doc/1?refresh
{
  "text": "This is a question",
  "my_join_field": {
    "name": "question" 
  }
}

 

For a parent document, the name key can also be omitted by using the shorthand form:

PUT my_index/_doc/1?refresh
{
  "text": "This is a question",
  "my_join_field": "question"
}

 

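Inserting a Child Document

A child document is indexed as follows. Note that routing is set to the parent document's ID; for ordinary documents, routing defaults to the document's own ID.
Here name is answer, which marks the document as a child.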

PUT /my_index/_doc/3?routing=1
{
  "text": "This is an answer",
  "my_join_field": {
    "name": "answer", 
    "parent": "1" 
  }
}
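
To make the relationship between routing and parent explicit, here is a small Python sketch that assembles the request for a direct child of a top-level parent. The function child_index_request and its parameters are hypothetical; it only builds the method, path, and body and does not contact a cluster.

```python
from urllib.parse import urlencode


def child_index_request(index, doc_id, parent_id, child_name, source,
                        join_field="my_join_field"):
    """Return (method, path, body) for indexing a direct child document.

    Assumption: the parent is a top-level document, so the routing value
    is simply the parent's ID.
    """
    path = f"/{index}/_doc/{doc_id}?" + urlencode({"routing": parent_id})
    body = dict(source)  # copy so the caller's dict is not modified
    body[join_field] = {"name": child_name, "parent": parent_id}
    return "PUT", path, body


method, path, body = child_index_request(
    "my_index", "3", "1", "answer", {"text": "This is an answer"}
)
print(method, path)
print(body)
```

Using one value for both routing and parent keeps the child on the same shard as its parent.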

 

Querying Child Documents with parent_id

Pass the parent document's ID to the parent_id query:

GET my_index/_search
{
  "query": {
    "parent_id": { 
      "type": "answer",
      "id": "1"
    }
  }
}
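
The same request body can be built from code. A minimal sketch, assuming a hypothetical helper named parent_id_query:

```python
import json


def parent_id_query(child_type, parent_id):
    """Build a search body that returns the children of one parent.

    `child_type` is the child relation name from the join mapping,
    `parent_id` is the _id of the parent document.
    """
    return {"query": {"parent_id": {"type": child_type, "id": parent_id}}}


search_body = parent_id_query("answer", "1")
print(json.dumps(search_body, indent=2))
```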

 

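Performance and Limitations of Parent-Child Documents

Parent-child documents are mainly suited to one-to-many relationships between entities; in most other cases it is better to denormalize the data into the documents themselves.

Parent-child documents have the following main characteristics: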

  • Only one join field mapping is allowed per index.
  • Parent and child documents must be indexed on the same shard. This means that the same routing value as the parent (the parent's ID by default) needs to be provided when getting, deleting, or updating a child document.
  • An element can have multiple children but only one parent.
  • It is possible to add a new relation to an existing join field.
  • It is also possible to add a child to an existing element, but only if the element is already a parent.
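
The "only one parent" rule can be checked locally before a mapping is submitted. The sketch below is a client-side check written for illustration (the function name parent_of is hypothetical); it is not Elasticsearch's own validation.

```python
def parent_of(relations):
    """Invert a join `relations` mapping into {child_name: parent_name}.

    Raises ValueError if a relation name is listed under two parents.
    """
    parents = {}
    for parent, children in relations.items():
        if isinstance(children, str):
            children = [children]  # a single child may be given as a string
        for child in children:
            if child in parents:
                raise ValueError(
                    f"{child!r} has two parents: {parents[child]!r} and {parent!r}"
                )
            parents[child] = parent
    return parents


relations = {"question": ["answer", "comment"], "answer": "vote"}
print(parent_of(relations))
```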

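Summary

Elasticsearch implements joins through parent-child documents, but each index can contain only one join field; every relation, including one parent with multiple children, must be declared in that single field.

The Relation Field

Elasticsearch automatically creates an additional field for each parent relation, named <join field>#<parent name> (here my_join_field#question).
It can be inspected as follows: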

POST my_index/_search
{
 "script_fields": {
    "parent": {
      "script": {
         "source": "doc['my_join_field#question']" 
      }
    }
  }
}

 

Part of the response:

{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 1.0,
"fields" : {
  "parent" : [
    "8"
  ]
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_routing" : "10",
"fields" : {
  "parent" : [
    "10"
  ]
}
}

 

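In this response, the hit with a _routing field is a child document, and its parent value is the ID of its parent document. The hit without _routing is a parent document, and its parent value is its own ID.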
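
The reading of the response can be expressed as a few lines of Python. This is a sketch over the two hits shown in this section, trimmed to the relevant keys; the function describe is hypothetical. It compares the parent value with the hit's own _id instead of looking for _routing, on the assumption that a parent indexed with an explicit routing value would also carry a _routing field.

```python
# The two hits from the partial response, reduced to the keys used here.
hits = [
    {"_index": "my_index", "_id": "8", "fields": {"parent": ["8"]}},
    {"_index": "my_index", "_id": "4", "_routing": "10",
     "fields": {"parent": ["10"]}},
]


def describe(hit):
    """Classify a hit as parent or child using the script field `parent`."""
    parent = hit["fields"]["parent"][0]
    if parent == hit["_id"]:
        return f"{hit['_id']}: parent document"
    return f"{hit['_id']}: child of {parent}"


for hit in hits:
    print(describe(hit))
```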

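Global Ordinals

Joins on parent-child documents use a technique called global ordinals to speed up queries. By default global ordinals are built eagerly, which avoids a long delay on the first query or aggregation. However, they must be rebuilt after any change to a shard, and the more parent IDs a shard stores, the longer the rebuild takes. With eager loading the rebuild happens as part of refresh, which can add significant time to each refresh.
If the join field is rarely used and writes are frequent, eager loading can be disabled in the mapping: "eager_global_ordinals": false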
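
For context, this is where the setting sits inside the join field mapping. The sketch below only prints the request body for PUT my_index; the relation names are the ones used earlier in this article.

```python
import json

# Join field mapping with eager loading of global ordinals turned off.
mapping = {
    "mappings": {
        "properties": {
            "my_join_field": {
                "type": "join",
                "relations": {"question": "answer"},
                "eager_global_ordinals": False,  # serialized as JSON false
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```

With this setting the global ordinals are built lazily, so the cost moves from refresh to the first join query or aggregation that needs them.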

Monitoring Global Ordinals

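# Per index (the "#" in the field name is URL-encoded as %23 so that curl does not treat it as a fragment)
curl -X GET "localhost:9200/_stats/fielddata?human&fields=my_join_field%23question&pretty"
# Per index on each node
curl -X GET "localhost:9200/_nodes/stats/indices/fielddata?human&fields=my_join_field%23question&pretty"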
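
A "#" inside a URL starts the fragment part, so a field name such as my_join_field#question should be percent-encoded when it is used as a query parameter value outside Kibana Console. The sketch below uses only the Python standard library to show the difference; the host and port are the ones from the curl commands.

```python
from urllib.parse import quote, urlsplit

field = "my_join_field#question"
base = "http://localhost:9200/_stats/fielddata?human&fields="

raw_url = base + field + "&pretty"
encoded_url = base + quote(field, safe="") + "&pretty"

# Unencoded: everything after "#" is parsed as a fragment, not as query.
print(urlsplit(raw_url).query)      # human&fields=my_join_field
print(urlsplit(raw_url).fragment)   # question&pretty

# Encoded: the full field name and the pretty flag stay in the query.
print(urlsplit(encoded_url).query)  # human&fields=my_join_field%23question&pretty
```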

Multi-Level Structure: One Parent, Multiple Children, and Grandchildren

Consider the following structure:

   question
    /    \
   /      \
comment  answer
           |
           |
          vote

 

Creating the Index

PUT my_index
{
  "mappings": {
    "properties": {
      "my_join_field": {
        "type": "join",
        "relations": {
          "question": ["answer", "comment"],  
          "answer": "vote" 
        }
      }
    }
  }
}

 

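Inserting a Grandchild

Note that routing and parent have different values here. routing is the ID of the grandparent (the question document), because a grandchild must be stored on the same shard as its parent and grandparent, while parent is the ID of the direct parent (the answer document).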

PUT my_index/_doc/3?routing=1&refresh 
{
  "text": "This is a vote",
  "my_join_field": {
    "name": "vote",
    "parent": "2" 
  }
}
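
One way to think about the routing value at any depth: it is the ID of the top-level ancestor. The Python sketch below walks a child-to-parent table to find it. The table and the function routing_for are hypothetical and kept on the client side; the IDs follow the example above (vote 3 under answer 2 under question 1), and the table is assumed to contain no cycles.

```python
def routing_for(doc_id, parent_of):
    """Return the ID of the root ancestor, which serves as the routing value.

    `parent_of` maps a document ID to its direct parent's ID; top-level
    documents do not appear as keys. Assumes the table has no cycles.
    """
    current = doc_id
    while current in parent_of:
        current = parent_of[current]
    return current


# answer 2 belongs to question 1, vote 3 belongs to answer 2
parent_of = {"2": "1", "3": "2"}

print(routing_for("3", parent_of))  # routing for the vote
print(routing_for("2", parent_of))  # routing for the answer
print(routing_for("1", parent_of))  # a top-level document routes to itself
```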

 

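The has_child Query

The has_child query returns parent documents that have child documents matching a given query. Because it performs a join, it is an expensive query and should be used sparingly. Its general form is as follows: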

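# "type" is the child relation name; "answer" matches the mappings used in this article
# max_children (optional): maximum number of matching children a parent may have and still be returned
# min_children (optional): minimum number of matching children a parent must have to be returned
GET my_index/_search
{
    "query": {
        "has_child" : {
            "type" : "answer",
            "query" : {
                "match_all" : {}
            },
            "max_children": 10,
            "min_children": 2,
            "score_mode" : "min"
        }
    }
}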
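
Since min_children and max_children are optional, a request builder can leave them out unless they are set. A minimal sketch, assuming a hypothetical helper named has_child_query that only builds the search body:

```python
import json


def has_child_query(child_type, query, min_children=None, max_children=None,
                    score_mode="none"):
    """Build a has_child search body, omitting optional keys that are unset."""
    clause = {"type": child_type, "query": query, "score_mode": score_mode}
    if min_children is not None:
        clause["min_children"] = min_children
    if max_children is not None:
        clause["max_children"] = max_children
    return {"query": {"has_child": clause}}


# Parents with between 2 and 10 matching "answer" children.
bounded = has_child_query("answer", {"match_all": {}},
                          min_children=2, max_children=10, score_mode="min")
# Parents with at least one matching child, no explicit bounds.
unbounded = has_child_query("answer", {"match_all": {}})

print(json.dumps(bounded, indent=2))
print(json.dumps(unbounded, indent=2))
```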

 

Test Code

Part of the test code is shown below:

DELETE my_index

PUT /my_index?pretty
{
  "mappings": {
    "properties": {
      "my_join_field": { 
        "type": "join",
        "relations": {
          "question": "answer" 
        }
      }
    }
  }
}


# Index parent documents
PUT /my_index/_doc/8?refresh&pretty
{
  "text": "This is a question",
  "my_join_field": {
    "name": "question" 
  }
}

PUT /my_index/_doc/10?refresh&pretty
{
  "text": "This is a new question",
  "my_join_field": {
    "name": "question"
  }
}

PUT /my_index/_doc/12?refresh&pretty
{
  "text": "This is a new question",
  "my_join_field": {
    "name": "question"
  }
}

# Index child documents
PUT /my_index/_doc/3?routing=8&refresh&pretty
{
  "text": "This is an answer",
  "my_join_field": {
    "name": "answer", 
    "parent": "8" 
  }
}


PUT /my_index/_doc/4?routing=10&refresh&pretty
{
  "text": "This is another answer",
  "my_join_field": {
    "name": "answer",
    "parent": "10"
  }
}

# Query child documents by parent_id
GET my_index/_search
{
  "query": {
    "parent_id": { 
      "type": "answer",
      "id": "8"
    }
  }
}

# Inspect the relation field
POST my_index/_search
{
 "script_fields": {
    "parent": {
      "script": {
         "source": "doc['my_join_field#question']" 
      }
    }
  }
}