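Detailed documentation: https://yuque.antfin-inc.com/aligamesmw/es/bkaie0
Overview:
reindex is a tool that ships with ES and supports cross-cluster data migration. It can also copy data between indices in the same cluster (comparable to syncing tables within one database), and it is simple to configure.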
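Characteristics:
1. reindex reads from a snapshot of the source data taken when the copy starts, so it cannot migrate data in real time. Run a full copy first, then an incremental update; the business must stop writing during the migration, while reads are unaffected;
2. As a rough guide, 1 GB of data takes about 10 minutes, which corresponds to roughly 500,000 documents per 10 minutes;
3. Migration is index-to-index only;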
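The throughput figure in item 2 can be turned into a quick planning estimate. This is a minimal sketch that assumes throughput scales linearly with document count, which real clusters will only approximate; the function name and constant are illustrative and not part of any ES API.

```python
# Rough planning estimate based on the figure quoted in the note
# (about 500,000 documents per 10 minutes). Linear scaling is an assumption.
DOCS_PER_10_MIN = 500_000


def estimate_reindex_minutes(doc_count: int) -> float:
    """Estimate reindex duration in minutes for doc_count documents."""
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    return doc_count / DOCS_PER_10_MIN * 10


if __name__ == "__main__":
    # Example: size the write-freeze window for a 3 million document index.
    print(estimate_reindex_minutes(3_000_000))
```

Since writes have to stop during the copy, an estimate like this mainly helps decide how long a maintenance window to request.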
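Migration command (this one is cross-cluster; run it on the destination cluster, and set host to a coordinating node of the source cluster):
# Run on the destination cluster
POST _reindex?wait_for_completion=false
{
  "source": {
    "remote": { "host": "http://10.26.33.57:9200" },
    "index": "redis_stat_es"
  },
  "dest": { "index": "ngrade-data_index_9" }
}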
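Cross-cluster reindex requires a whitelist: add the "IP:PORT" of one coordinating node of the source cluster to the configuration (elasticsearch.yml) of the destination cluster's coordinating node, then restart the process:
# 20190627 add reindex whitelist
reindex.remote.whitelist: ["11.5.23.14:9200", "11.156.9.171:9200"]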
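Before submitting a cross-cluster request, it can save a round trip to check that the remote host is actually covered by the whitelist. The following is a sketch under stated assumptions: it does exact "host:port" matching only (no wildcard handling), and the function name is hypothetical.

```python
from typing import Iterable
from urllib.parse import urlsplit


def remote_in_whitelist(remote_host: str, whitelist: Iterable[str]) -> bool:
    """Check whether a remote.host URL matches a whitelist entry exactly.

    remote_host is the value used in source.remote.host, e.g.
    "http://11.5.23.14:9200". Only the host and port are compared;
    the scheme is ignored. Exact matching is a simplification.
    """
    parts = urlsplit(remote_host)
    if parts.hostname is None or parts.port is None:
        raise ValueError("remote host must include scheme, host and port")
    return f"{parts.hostname}:{parts.port}" in set(whitelist)


if __name__ == "__main__":
    allowed = ["11.5.23.14:9200", "11.156.9.171:9200"]
    print(remote_in_whitelist("http://11.5.23.14:9200", allowed))
    print(remote_in_whitelist("http://10.26.33.57:9200", allowed))
```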
If you hit the following error,
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_state_exception",
        "reason": "scripts of type [inline], operation [update] and lang [painless] are disabled"
      }
    ],
    "type": "illegal_state_exception",
    "reason": "scripts of type [inline], operation [update] and lang [painless] are disabled"
  },
  "status": 500
}
the configuration needs to be changed so that inline painless scripts are enabled for the operation named in the error; see:
---------------
Within the same cluster, omit remote; only source and dest are needed:
POST _reindex?wait_for_completion=false
{
  "source": { "index": "redis_stat_es" },
  "dest": { "index": "ngrade-data_index_9" }
}
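URL parameters
wait_for_completion=false: the request returns as soon as it has been submitted, and the task runs in the background. reindex is usually slow, so running it in the background is recommended.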
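With wait_for_completion=false, the response body carries the task identifier in a "task" field, in the form nodeId:taskNumber; that identifier is what the _tasks endpoints expect. A small sketch of handling it follows; the helper names are hypothetical and no HTTP call is made here.

```python
def split_task_id(submit_response: dict) -> tuple:
    """Return (node_id, task_number) from a background reindex response."""
    task_id = submit_response["task"]  # e.g. "BASGS3wUReOITmwfZtbmag:440973"
    node_id, _, number = task_id.rpartition(":")
    if not node_id or not number.isdigit():
        raise ValueError(f"unexpected task id: {task_id!r}")
    return node_id, int(number)


def task_status_path(submit_response: dict) -> str:
    """Build the request path used to look up the submitted task."""
    return "_tasks/" + submit_response["task"]


if __name__ == "__main__":
    response = {"task": "BASGS3wUReOITmwfZtbmag:440973"}
    print(split_task_id(response))
    print(task_status_path(response))
```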
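Body fields
source: information about the source cluster;
  remote: the HTTP address of the source cluster, for example the source-cluster coordinating-node address configured in the whitelist above;
  index: the index name in the source cluster;
dest: information about the destination cluster;
  index: the index name in the destination cluster;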
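The field layout described above can be assembled programmatically, which makes the difference between the same-cluster and cross-cluster forms explicit. This is a minimal sketch; build_reindex_body is an illustrative name, not an ES client function.

```python
import json
from typing import Optional


def build_reindex_body(source_index: str, dest_index: str,
                       remote_host: Optional[str] = None) -> dict:
    """Build a _reindex request body.

    When remote_host is given (e.g. "http://10.26.33.57:9200"), the body
    targets a remote source cluster; otherwise source and dest are in the
    same cluster and "remote" is omitted.
    """
    source = {"index": source_index}
    if remote_host is not None:
        source["remote"] = {"host": remote_host}
    return {"source": source, "dest": {"index": dest_index}}


if __name__ == "__main__":
    body = build_reindex_body("redis_stat_es", "ngrade-data_index_9",
                              remote_host="http://10.26.33.57:9200")
    print(json.dumps(body, indent=2))
```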
For more parameters, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/5.3/docs-reindex.html
------
Example of submitting with curl:
curl -XPOST 'http://localhost:9200/_reindex?wait_for_completion=false&pretty' -H 'Content-Type: application/json' -d '{"source": {"index": "cyapp-song-import_usercustomclip_perf"},"dest": {"index": "cyapp-song-import_usercustomclip_readonly"}}'
How to check:
curl -XGET 'http://localhost:9200/_tasks?detailed=true&actions=*reindex'
curl -XGET 'http://localhost:9200/_tasks/2iF4iwH9Qp-EDiRMGJZtuQ:1781186699'
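A top-level "size" can be added to limit how many documents are copied. It caps the document count and does not limit speed; to throttle the copy rate, use the requests_per_second URL parameter instead.
POST _reindex
{
  "size": 10000,
  "source": {
    "index": "twitter",
    "sort": { "date": "desc" }
  },
  "dest": { "index": "new_twitter" }
}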
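Throttling is a separate control from the document cap: the reindex API accepts a requests_per_second URL parameter for that purpose. The sketch below only builds the request path with both URL parameters; the function name is hypothetical and the value 500 is an arbitrary example, not a recommendation.

```python
from typing import Optional
from urllib.parse import urlencode


def reindex_path(wait_for_completion: bool = False,
                 requests_per_second: Optional[int] = None) -> str:
    """Build the _reindex request path with optional throttling."""
    params = {"wait_for_completion": str(wait_for_completion).lower()}
    if requests_per_second is not None:
        params["requests_per_second"] = requests_per_second
    return "_reindex?" + urlencode(params)


if __name__ == "__main__":
    print(reindex_path())
    print(reindex_path(requests_per_second=500))
```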
-----------
Auxiliary migration commands:
GET _tasks?detailed=true&actions=*reindex #list all running reindex tasks
GET .tasks/task/_search #with wait_for_completion=false, ES creates a .tasks index to store task results, which can be queried
GET /_tasks/BASGS3wUReOITmwfZtbmag:440973 #view task details by taskId
POST _tasks/BASGS3wUReOITmwfZtbmag:440973/_cancel #cancel the task
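The task detail response includes a status object with counters such as total, created, updated and deleted, which is enough to derive a rough progress figure. A sketch follows, assuming the response has already been fetched and parsed into a dict; it deliberately ignores noops and version conflicts, so it can under-report progress when those occur.

```python
def reindex_progress(task_response: dict) -> float:
    """Return the fraction of documents processed, between 0.0 and 1.0.

    task_response is the parsed JSON from GET _tasks/<taskId>.
    """
    status = task_response["task"]["status"]
    total = status.get("total", 0)
    if total == 0:
        return 0.0
    done = (status.get("created", 0)
            + status.get("updated", 0)
            + status.get("deleted", 0))
    return done / total


if __name__ == "__main__":
    # Hand-written sample for illustration, not output captured from a cluster.
    sample = {
        "completed": False,
        "task": {"status": {"total": 1000, "created": 200,
                            "updated": 50, "deleted": 0}},
    }
    print(f"{reindex_progress(sample):.0%}")
```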
------------
Appendix: a migration DSL example from a relatively complex business case:
{
  "conflicts": "proceed",
  "source": {
    "remote": { "host": "http://11.5.23.14:9200" },
    "size": 5000,
    "index": "wpk2-server_gymbo",
    "query": {
      "bool": {
        "filter": [
          {
            "range": {
              "uploadDate": {
                "from": "20190101",
                "to": "20190201",
                "include_upper": false,
                "include_lower": true
              }
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "wpk-server_app_wbr0_201901",
    "op_type": "create"
  },
  "script": {
    "inline": "ctx._source.wpk_appid = 'gymbo'",
    "lang": "painless"
  }
}
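If the same migration has to be repeated for other months, the body can be generated rather than edited by hand. This sketch assumes the destination index keeps the prefix_YYYYMM naming seen in the example and that each run covers one calendar month; both are assumptions, as are the function names. It expresses the date range with gte/lt, which is the equivalent of an inclusive lower bound and an exclusive upper bound.

```python
from datetime import date


def month_bounds(year_month: str) -> tuple:
    """Return (first day of month, first day of next month) as YYYYMMDD."""
    year, month = int(year_month[:4]), int(year_month[4:6])
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.strftime("%Y%m%d"), end.strftime("%Y%m%d")


def monthly_reindex_body(remote_host: str, source_index: str,
                         dest_prefix: str, year_month: str,
                         app_id: str) -> dict:
    """Build a one-month cross-cluster reindex body.

    app_id is interpolated into the script text, so it is assumed to be a
    trusted, plain identifier without quotes.
    """
    lower, upper = month_bounds(year_month)
    return {
        "conflicts": "proceed",
        "source": {
            "remote": {"host": remote_host},
            "size": 5000,  # batch size, as in the example above
            "index": source_index,
            "query": {"bool": {"filter": [
                {"range": {"uploadDate": {"gte": lower, "lt": upper}}}
            ]}},
        },
        "dest": {"index": f"{dest_prefix}_{year_month}", "op_type": "create"},
        "script": {
            "inline": f"ctx._source.wpk_appid = '{app_id}'",
            "lang": "painless",
        },
    }


if __name__ == "__main__":
    body = monthly_reindex_body("http://11.5.23.14:9200", "wpk2-server_gymbo",
                                "wpk-server_app_wbr0", "201901", "gymbo")
    print(body["dest"]["index"])
```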