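I. Introduction to ES Parent-Child Documents
ES provides something similar to a database join: a field of type join records the parent-child relationship between documents, and parent and child documents can each be indexed and updated independently.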
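II. Creating the Index and Inserting Data for Parent-Child Documents
Setting up parent-child documents in ES takes three steps:
- Create the index mapping, declaring a field of type join and naming the parent and child relations
- Insert the parent documents
- Insert the child documents
Each step is demonstrated below.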
1. Create the index
Suppose we have a blog system in which each blog post has a number of comments. blog and comment then form a parent-child relationship.
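To define the parent-child relationship in the mapping:
- Set the field type to join
- Declare the parent-child relationship with relations
For example:
# blog is the parent, comment is the child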
PUT blog_index
{
"mappings": {
"properties": {
"blog_comment_join": {
"type": "join",
"relations": {
"blog": "comment"
}
}
}
}
}
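2. Insert the parent documents
A parent document only needs to declare its relation name. The two requests below use the object form and the shorthand string form of the join field, which are equivalent.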
PUT blog_index/_doc/1
{
"title": "First Blog",
"author": "Ahri",
"content": "This is my first blog",
"blog_comment_join": {
"name": "blog"
}
}
PUT blog_index/_doc/2
{
"title": "Second Blog",
"author": "EZ",
"content": "This is my second blog",
"blog_comment_join": "blog"
}
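3. Insert the child documents
One thing to watch when inserting child documents is the routing setting: a child document must be stored on the same shard as its parent, so the child's routing value should be the parent document's ID, or the same routing value the parent was indexed with.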
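Why the routing value matters can be illustrated with a small simulation. This is only a sketch: Elasticsearch hashes the routing value with Murmur3 internally, while the code below substitutes CRC32 so it runs on the standard library alone, and the shard count and the helper names (`shard_for`, `effective_routing`) are assumptions made for illustration.

```python
import zlib
from typing import Optional

NUM_PRIMARY_SHARDS = 3  # hypothetical shard count, for illustration only


def shard_for(routing_value: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Map a routing value to a shard number.

    CRC32 is a stand-in for the hash Elasticsearch really uses; the point is
    only that equal routing values always map to the same shard.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def effective_routing(doc_id: str, routing: Optional[str] = None) -> str:
    """When no explicit routing is given, a document is routed by its own ID."""
    return routing if routing is not None else doc_id


parent_shard = shard_for(effective_routing("1"))
child_shard_routed = shard_for(effective_routing("comment-1", routing="1"))
child_shard_by_own_id = shard_for(effective_routing("comment-1"))

# Same routing value as the parent, so the shard is guaranteed to match.
print(parent_shard == child_shard_routed)
# Routed by its own ID, the child may or may not share the parent's shard.
print(parent_shard, child_shard_by_own_id)
```

Because a child routed by its own ID could end up on a different shard from its parent, Elasticsearch treats the routing value as mandatory when indexing a child document.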
Sample requests:
PUT blog_index/_doc/comment-1?routing=1&refresh
{
"user": "Tom",
"content": "Good blog",
"comment_date": "2020-01-01 10:00:00",
"blog_comment_join": {
"name": "comment",
"parent": 1
}
}
PUT blog_index/_doc/comment-2?routing=1&refresh
{
"user": "Jhon",
"content": "Good Job",
"comment_date": "2020-02-01 10:00:00",
"blog_comment_join": {
"name": "comment",
"parent": 1
}
}
PUT blog_index/_doc/comment-3?routing=2&refresh
{
"user": "Jack",
"content": "Great job",
"comment_date": "2020-01-01 10:00:00",
"blog_comment_join": {
"name": "comment",
"parent": 2
}
}
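The three child requests above share one shape: the join field names the child relation and the parent ID, and the same parent ID is repeated as the routing parameter. A minimal sketch of a helper that builds such a request is shown below. The function name `child_index_request` and the sample comment (`comment-4` by `Ann`) are hypothetical and not part of the original data set; the helper only builds the request, it does not send it.

```python
import json
from urllib.parse import quote


def child_index_request(index, doc_id, parent_id, source,
                        join_field="blog_comment_join", child_type="comment"):
    """Return (method, path, body) for indexing one child document."""
    body = dict(source)
    # The join field carries both the relation name and the parent's ID.
    body[join_field] = {"name": child_type, "parent": parent_id}
    # Route by the parent ID so the child lands on the parent's shard.
    path = "/{}/_doc/{}?routing={}".format(
        index, quote(str(doc_id), safe=""), quote(str(parent_id), safe="")
    )
    return "PUT", path, body


# Hypothetical fourth comment, attached to blog 2.
method, path, body = child_index_request(
    "blog_index",
    "comment-4",
    "2",
    {"user": "Ann", "content": "Nice post", "comment_date": "2020-03-01 10:00:00"},
)

print(method, path)
print(json.dumps(body, indent=2))
```

Deriving the routing value from the parent ID in one place keeps the two from drifting apart, which is the mistake the routing note above warns about.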
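4. Other relation layouts
Besides the common single parent and single child type shown above, the ES join field also supports multiple child types and multi-level parent-child relations, as follows.
Multiple child types
A parent in a join field can be configured with several child types: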
PUT my_index
{
"mappings": {
"properties": {
"my_join_field": {
"type": "join",
"relations": {
"question": ["answer", "comment"]
}
}
}
}
}
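Multi-level parent-child relations
The request below defines my_index from scratch; if the index from the previous example still exists, delete it first.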
PUT my_index
{
"mappings": {
"properties": {
"my_join_field": {
"type": "join",
"relations": {
"question": ["answer", "comment"],
"answer": "vote"
}
}
}
}
}
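The mapping above produces the following hierarchy: question is the parent of both answer and comment, and answer is in turn the parent of vote.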
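The hierarchy implied by a relations mapping can also be derived programmatically. The sketch below inverts the mapping into a child-to-parent table and walks it upward; the helper names `parent_of` and `lineage` are made up for this example.

```python
def parent_of(relations):
    """Invert a join `relations` mapping into a child -> parent table."""
    parents = {}
    for parent, children in relations.items():
        # A single child may be written as a bare string instead of a list.
        if isinstance(children, str):
            children = [children]
        for child in children:
            parents[child] = parent
    return parents


def lineage(relation, parents):
    """Return the chain from the top-level ancestor down to `relation`."""
    chain = [relation]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return list(reversed(chain))


relations = {"question": ["answer", "comment"], "answer": "vote"}
parents = parent_of(relations)

print(parents)                   # {'answer': 'question', 'comment': 'question', 'vote': 'answer'}
print(lineage("vote", parents))  # ['question', 'answer', 'vote']
```

The lineage matters for routing in multi-level setups: a vote document should be routed by the ID of its top-level question, not of its direct parent answer, because the whole lineage has to live on one shard.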
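III. Querying Parent-Child Documents
There are three main queries for parent-child documents:
- parent_id: returns all child documents of a parent, given the parent's ID
- has_parent: returns all child documents whose parent matches a query
- has_child: returns all parent documents that have a child matching a query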
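For readers who issue these queries from application code, the three request bodies can be built with small helpers. This is a sketch only: the function names are invented for illustration, and the resulting dictionaries serialize to the JSON that goes in the body of a _search request.

```python
import json


def parent_id_query(child_type, parent_id):
    """All children of `child_type` that belong to the given parent ID."""
    return {"query": {"parent_id": {"type": child_type, "id": parent_id}}}


def has_parent_query(parent_type, inner_query):
    """All children whose parent of `parent_type` matches `inner_query`."""
    return {"query": {"has_parent": {"parent_type": parent_type,
                                     "query": inner_query}}}


def has_child_query(child_type, inner_query):
    """All parents that have a child of `child_type` matching `inner_query`."""
    return {"query": {"has_child": {"type": child_type,
                                    "query": inner_query}}}


by_id = parent_id_query("comment", "1")
by_parent = has_parent_query("blog", {"match": {"title": "first"}})
by_child = has_child_query("comment", {"match": {"user": "Jack"}})

for body in (by_id, by_parent, by_child):
    print(json.dumps(body, indent=2))
```

Note the asymmetry in parameter names: has_parent takes parent_type, while parent_id and has_child take type.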
Concrete examples of each query follow:
[1] parent_id query
# Find all child documents of the parent with ID 1
GET blog_index/_search
{
"query": {
"parent_id": {
"type": "comment",
"id": 1
}
}
}
# Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.44183275,
"hits" : [
{
"_index" : "blog_index",
"_type" : "_doc",
"_id" : "comment-1",
"_score" : 0.44183275,
"_routing" : "1",
"_source" : {
"user" : "Tom",
"content" : "Good blog",
"comment_date" : "2020-01-01 10:00:00",
"blog_comment_join" : {
"name" : "comment",
"parent" : 1
}
}
},
{
"_index" : "blog_index",
"_type" : "_doc",
"_id" : "comment-2",
"_score" : 0.44183275,
"_routing" : "1",
"_source" : {
"user" : "Jhon",
"content" : "Good Job",
"comment_date" : "2020-02-01 10:00:00",
"blog_comment_join" : {
"name" : "comment",
"parent" : 1
}
}
}
]
}
}
[2] has_parent query
# Find all child documents of parents whose title contains first
GET blog_index/_search
{
"query": {
"has_parent": {
"parent_type": "blog",
"query": {
"match": {
"title": "first"
}
}
}
}
}
# Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "blog_index",
"_type" : "_doc",
"_id" : "comment-1",
"_score" : 1.0,
"_routing" : "1",
"_source" : {
"user" : "Tom",
"content" : "Good blog",
"comment_date" : "2020-01-01 10:00:00",
"blog_comment_join" : {
"name" : "comment",
"parent" : 1
}
}
},
{
"_index" : "blog_index",
"_type" : "_doc",
"_id" : "comment-2",
"_score" : 1.0,
"_routing" : "1",
"_source" : {
"user" : "Jhon",
"content" : "Good Job",
"comment_date" : "2020-02-01 10:00:00",
"blog_comment_join" : {
"name" : "comment",
"parent" : 1
}
}
}
]
}
}
[3] has_child query
# Find the parents of all child documents whose user contains Jack
GET blog_index/_search
{
"query": {
"has_child": {
"type": "comment",
"query": {
"match": {
"user": "Jack"
}
}
}
}
}
# Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "blog_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"title" : "Second Blog",
"author" : "EZ",
"content" : "This is my second blog",
"blog_comment_join" : "blog"
}
}
]
}
}
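As a cross-check of the three result sets above, the semantics of the queries can be mimicked in memory over the same five sample documents. This is a toy model, not how Elasticsearch executes the queries: the match clauses are replaced by plain Python predicates, and the join field is shortened to a hypothetical `join` key.

```python
DOCS = {
    "1": {"title": "First Blog", "join": {"name": "blog"}},
    "2": {"title": "Second Blog", "join": {"name": "blog"}},
    "comment-1": {"user": "Tom", "join": {"name": "comment", "parent": "1"}},
    "comment-2": {"user": "Jhon", "join": {"name": "comment", "parent": "1"}},
    "comment-3": {"user": "Jack", "join": {"name": "comment", "parent": "2"}},
}


def parent_id(child_type, pid):
    """IDs of children of `child_type` whose parent is `pid`."""
    return sorted(i for i, d in DOCS.items()
                  if d["join"]["name"] == child_type
                  and d["join"].get("parent") == pid)


def has_parent(parent_type, predicate):
    """IDs of children whose parent of `parent_type` satisfies `predicate`."""
    matching = {i for i, d in DOCS.items()
                if d["join"]["name"] == parent_type and predicate(d)}
    return sorted(i for i, d in DOCS.items()
                  if d["join"].get("parent") in matching)


def has_child(child_type, predicate):
    """IDs of parents with at least one matching child of `child_type`."""
    return sorted({d["join"]["parent"] for d in DOCS.values()
                   if d["join"]["name"] == child_type and predicate(d)})


print(parent_id("comment", "1"))                                    # ['comment-1', 'comment-2']
print(has_parent("blog", lambda d: "first" in d["title"].lower()))  # ['comment-1', 'comment-2']
print(has_child("comment", lambda d: d["user"] == "Jack"))          # ['2']
```

The three printed lists line up with the document IDs in the three responses shown above.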
IV. Nested Objects vs. Parent-Child Documents
The comparison below comes from the Geek Time (極客時間) course 《Elasticsearch核心技術與實戰》:
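Nested objects are stored together with their parent document, which makes reads fast but means that changing one nested object reindexes the whole document. Parent-child documents can be updated independently, at the cost of slower queries. Most data is read far more often than it is written, so Nested objects are the better default in most cases.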
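For comparison, here is a rough sketch of how the same blog and comment data might be modeled with a Nested field instead of a join field. The field name `comments` and the sub-field types are assumptions chosen for this example, not taken from the course or from the index defined earlier.

```python
import json

# Hypothetical layout: comments embedded in each blog document as nested objects.
nested_mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "comments": {
                "type": "nested",
                "properties": {
                    "user": {"type": "keyword"},
                    "content": {"type": "text"},
                },
            },
        }
    }
}

# Counterpart of the has_child example: blogs that have a comment by Jack.
nested_search = {
    "query": {
        "nested": {
            "path": "comments",
            "query": {"term": {"comments.user": "Jack"}},
        }
    }
}

print(json.dumps(nested_mapping, indent=2))
print(json.dumps(nested_search, indent=2))
```

With this layout a blog and its comments are written as one document, so no routing parameter is needed, but adding a single comment means reindexing the whole blog document.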
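About the author: I'm AhriJ (鄒同學), an engineer who works across front end, back end, mini programs, and DevOps. The blog is updated regularly, and corrections are welcome so we can learn from each other.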