ES-Commands

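Overview

A short collection of ES commands that are commonly used or hard to find in the official documentation, gathered for quick lookup and updated as new ones come to mind.

Note: most commands were written against an ES 5.4.2 cluster.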

Commands

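Shrinking an index (reducing the number of primary shards)

ES 5.4 official docs: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/indices-shrink-index.html

First, update the settings of the source index test to block writes. The official docs also require a copy of every shard to sit on a single node, and the index to be green, before shrinking; this example shows only the write block.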

PUT /test/_settings
{
  "settings": {
    "index.blocks.write": true 
  }
}

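Then create the target index shrink_test through the shrink API, setting its shard and replica counts. The target's primary shard count must be smaller than that of the source index test and must be a factor of it: an index with 8 shards can shrink to 4, 2, or 1, while an index with a prime number of shards (such as 7 or 11) can only shrink to 1.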

POST test/_shrink/shrink_test
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1, 
    "index.codec": "best_compression" 
  },
  "aliases": {
    "test_alias": {}
  }
}

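After running the command, check the cluster health to tell whether the shrink has finished. Once it succeeds, the source index test still exists and remains write-blocked; it can then be deleted.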
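The factor rule for shrink targets is easy to check before sending the request. The following is a minimal Python sketch, not part of any ES client; the function name valid_shrink_targets is hypothetical and it encodes only the rule stated above (strictly smaller, and a factor of the source shard count).

```python
from typing import List


def valid_shrink_targets(source_shards: int) -> List[int]:
    """Return every primary shard count a source index could shrink to, largest first."""
    if source_shards < 1:
        raise ValueError("source_shards must be a positive integer")
    # A target is valid when it is strictly smaller than the source count
    # and divides it evenly.
    return [n for n in range(source_shards - 1, 0, -1) if source_shards % n == 0]


if __name__ == "__main__":
    print(valid_shrink_targets(8))  # [4, 2, 1]
    print(valid_shrink_targets(7))  # [1]
```

An empty list means the index already has a single primary shard and cannot shrink further.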

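Expanding an index (increasing the number of primary shards)

Notes:

ES 5.x cannot increase the number of primary shards of an index, short of rebuilding the index with the Reindex API.

Starting with ES 6.x, the primary shard count of an index can be increased, with one restriction: incremental resharding is not supported. The examples here use ES 6.7.2.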

Indices APIs » Split Index

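ES 6.7 official docs: https://www.elastic.co/guide/en/elasticsearch/reference/6.7/indices-split-index.html

Notes

1. To split an index, index.number_of_routing_shards must be set when the index is created. On an index that already exists, the setting can be neither added nor changed through a settings update, so such an index cannot be split (that is, ES does not support incremental resharding).

2. Writes to the source index must be blocked while it is being split; writing can resume only after the split has finished.

Purpose: to avoid data loss if something goes wrong during the split.

3. The number of times an index can be split (and the number of shards each original shard can be split into) is determined by the index.number_of_routing_shards setting. The number of routing shards specifies the hashing space used internally to distribute documents across shards with consistent hashing. For example, a 5-shard index with number_of_routing_shards set to 30 (5 x 2 x 3) can be split by a factor of 2 or 3. In other words, it can be split as follows: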

5 → 10 → 30 (split by 2, then by 3)
5 → 15 → 30 (split by 3, then by 2)
5 → 30 (split by 6)
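The reachable shard counts can be enumerated mechanically. This is a small Python sketch under the assumption that a split target must be a multiple of the current shard count and a factor of number_of_routing_shards; the function name valid_split_targets is hypothetical.

```python
from typing import List


def valid_split_targets(source_shards: int, routing_shards: int) -> List[int]:
    """List the primary shard counts a single split could produce, smallest first."""
    if source_shards < 1 or routing_shards < 1:
        raise ValueError("shard counts must be positive integers")
    if routing_shards % source_shards != 0:
        raise ValueError("routing_shards must be a multiple of source_shards")
    # A target must be larger than the source, a multiple of the source count,
    # and a factor of the routing shard count.
    return [
        n
        for n in range(source_shards + 1, routing_shards + 1)
        if n % source_shards == 0 and routing_shards % n == 0
    ]


if __name__ == "__main__":
    print(valid_split_targets(5, 30))  # [10, 15, 30]
```

Calling it again on an intermediate result (for example 10 with 30 routing shards) gives the next step of a multi-stage split.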

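How it works

Splitting works as follows (from the official docs):

First, it creates a new target index with the same definition as the source index, but with a larger number of primary shards.
Then it hard-links segments from the source index into the target index (if the file system does not support hard-linking, all segments are copied into the new index, which is a much more time-consuming process).
Once the low-level files are created, all documents are hashed again to delete documents that belong to a different shard.
Finally, it recovers the target index as though it were a closed index which had just been re-opened.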

Usage

Create the source index & block writes

PUT my_source_index
{
  "settings": {
    "index.number_of_shards": 2,
    "index.number_of_routing_shards": 30,
    "index.blocks.write": true
  }
}

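Split the index & set an alias

The target index's shard count must be a factor of number_of_routing_shards and a multiple of (and greater than) the source index's shard count. The routing shard count is inherited from the source index, so it is not repeated in the request.

POST my_source_index/_split/my_target_index?copy_settings=true
{
  "settings": {
    "index.number_of_shards": 10
  },
  "aliases": {
    "split_index": {}
  }
}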

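Preconditions for splitting an index

The target index must not exist.
The source index must have fewer primary shards than the target index.
The number of primary shards in the target index must be a factor of number_of_routing_shards in the source index.
The node handling the split process must have enough free disk space to hold a second copy of the existing index.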

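Data migration (Reindex)

Requirements for migrating index data

1. Ideally the index sees no data changes (inserts, deletes, updates) during the migration; otherwise the changed data will not reach the new index.

When the migration starts, ES takes a point-in-time snapshot of the index. Changes made afterwards are not reflected in that snapshot and therefore are not migrated.

2. If the data does change during the migration, but the business side can replay those changes against the new index by rerunning its jobs, the migration can still go ahead.

3. If the data changes during the migration and requirement 2 cannot be met, migration is still possible when the index is append-only and has a field that supports range-based migration (for example a time field, so that only the data for 2020 is migrated to the new index). Migrate the bulk of the data first; then have the business stop its write jobs, migrate the small remainder, and switch writes directly to the new index.

4. If none of the above can be met, the data cannot be migrated.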
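The range-based approach in requirement 3 amounts to issuing one reindex request per time window. Below is a minimal Python sketch that only builds the request bodies; the function name windowed_reindex_bodies, the index names, and the field name time are assumptions for illustration, and each window is half-open (gte, lt) so that no document is copied twice.

```python
from datetime import date, timedelta
from typing import Any, Dict, List


def windowed_reindex_bodies(
    source: str, dest: str, field: str, start: date, end: date, step_days: int
) -> List[Dict[str, Any]]:
    """Build one _reindex body per [cursor, cursor + step_days) window between start and end."""
    if step_days < 1:
        raise ValueError("step_days must be at least 1")
    bodies = []
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=step_days), end)
        bodies.append(
            {
                "source": {
                    "index": source,
                    # Half-open range: the upper bound belongs to the next window.
                    "query": {
                        "range": {
                            field: {"gte": cursor.isoformat(), "lt": upper.isoformat()}
                        }
                    },
                },
                "dest": {"index": dest},
            }
        )
        cursor = upper
    return bodies


if __name__ == "__main__":
    for body in windowed_reindex_bodies(
        "index_one", "index_two", "time", date(2020, 1, 1), date(2021, 1, 1), 30
    ):
        print(body["source"]["query"]["range"]["time"])
```

Each body can be sent with POST _reindex in turn, leaving only the most recent window for the final pass after writes stop.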

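Usage

# Migrate data with multiple concurrent slices
# slices can follow the number of primary shards, e.g. for an index with 5 primary shards, slices=5 runs 5 slices concurrently
# index_one is the source index, index_two is the target index
# The top-level size limits the migration to 5,000 documents and is usually omitted; delete by query accepts size in the same way (deleting a given number of documents)
# Note that size inside source is a different setting: the scroll batch size
# query is the migration filter; without it all documents are migrated
POST _reindex?slices=2
{
  "size": 5000,
  "source": {
    "index": "index_one",
    "query": {
      "range": {
        "time": {
          "gte": "xxxx",
          "lte": "xxxx"
        }
      }
    }
  },
  "dest": {
    "index": "index_two"
  }
}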

# List all running reindex tasks (detailed=true returns task details)
GET _tasks?detailed=true&actions=*reindex

# Cancel a single reindex task
POST _tasks/node_id:task_id/_cancel

# Cancel all reindex tasks
POST _tasks/_cancel?actions=*reindex

# Cancel all delete-by-query tasks
POST _tasks/_cancel?actions=*delete/byquery
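When only some tasks should be cancelled, the task IDs have to be picked out of the GET _tasks response first. The Python sketch below works on an already-parsed response; the function name cancel_paths and the sample data are hypothetical, and it assumes the usual nodes → tasks layout with sliced sub-tasks carrying a parent_task_id.

```python
from typing import Any, Dict, List


def cancel_paths(tasks_response: Dict[str, Any], action_suffix: str = "reindex") -> List[str]:
    """Return a cancel path for every top-level task whose action ends with action_suffix."""
    paths = []
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            if not task.get("action", "").endswith(action_suffix):
                continue
            # Skip sliced sub-tasks; cancelling the parent is enough.
            if "parent_task_id" in task:
                continue
            paths.append("_tasks/%s/_cancel" % task_id)
    return sorted(paths)


# Hypothetical, trimmed-down response used only for illustration.
SAMPLE_TASKS = {
    "nodes": {
        "nodeA": {
            "tasks": {
                "nodeA:101": {"action": "indices:data/write/reindex"},
                "nodeA:102": {
                    "action": "indices:data/write/reindex",
                    "parent_task_id": "nodeA:101",
                },
                "nodeA:200": {"action": "indices:data/write/delete/byquery"},
            }
        }
    }
}

if __name__ == "__main__":
    for path in cancel_paths(SAMPLE_TASKS):
        print("POST " + path)
```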

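Adjusting the shard allocation balance weights of an ES cluster

A higher index weight favors spreading the shards of each index evenly across nodes.
A higher shard weight favors evening out the total shard count per node, so shards go to nodes that hold fewer of them.
Keep the two summing to 1 so they read as proportions; the balancer normalizes them by their sum.

ES defaults: index 0.55, shard 0.45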

PUT _cluster/settings
{
  "transient": {
    "cluster": {
      "routing": {
        "allocation.balance.index":0.9,
        "allocation.balance.shard":0.1
      }
    }
  }
}
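To see what share each factor contributes, divide each weight by the sum of the two. The Python sketch below does only that arithmetic; the function name balance_proportions is hypothetical, and for weights that already sum to 1 the output simply equals the input.

```python
from typing import Tuple


def balance_proportions(index_balance: float, shard_balance: float) -> Tuple[float, float]:
    """Return (index share, shard share) so that the two shares add up to 1."""
    total = index_balance + shard_balance
    if total <= 0:
        raise ValueError("the two balance factors must sum to a value greater than 0")
    return index_balance / total, shard_balance / total


if __name__ == "__main__":
    index_share, shard_share = balance_proportions(0.9, 0.1)
    print("index share: %.2f, shard share: %.2f" % (index_share, shard_share))
```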

same_shard.host

PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation.same_shard.host": true
      }
    }
  }
}

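This setting is for hosts that run multiple ES instances: it prevents copies of the same shard (a primary and its replicas) from being allocated to instances on the same host.

That way, a host running multiple instances can also be shut down and taken offline directly.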

Delayed allocation of index shards

PUT _all/_settings
{
  "index.unassigned.node_left.delayed_timeout":"30m"
}

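This setting delays the recovery of shards left unassigned when a node leaves; the default is one minute.

During a rolling restart of cluster nodes, extend the timeout as appropriate to prevent shard copying and avoid wasting cluster resources.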

Logging the time taken by metadata changes

PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "service":{
        "slow_task_logging_threshold":"10ms"
      }
    }
  }
}

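This is a master-side setting: with it in place, the log shows how long metadata (cluster state) updates take. After a master re-election, the time needed to publish the metadata is roughly the duration printed by this log entry.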

Querying specific shards of an index

Query the data of a specific shard of an index: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-request-preference.html

GET index/_search?preference=_prefer_nodes:node1

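_primary : execute only on primary shards
_primary_first : execute on primary shards; if a primary is unavailable, execute on other shards
_replica : execute only on replica shards
_replica_first : execute on replica shards; if none is available, execute on other shards
_local : prefer shards allocated on the local node
_prefer_nodes : prefer the nodes with the given node IDs, if applicable
_shards : restrict the operation to the specified shards; can be combined with other options but must appear first, e.g. _shards:2,3|_primary
_only_nodes : restrict the operation to the nodes matching the node specification that follows

GET _nodes : returns node IDs along with other node information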
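The ordering rule (the _shards part first, then the other option after a |) is easy to get wrong when the string is assembled by hand. Here is a small Python sketch that builds the preference value from the options listed above; the function name build_preference and its parameters are hypothetical.

```python
from typing import Iterable, List, Optional

# Options from the list above that take a comma-separated value list.
NEEDS_VALUES = {"_prefer_nodes", "_only_nodes"}
# Options from the list above that stand on their own.
STANDALONE = {"_primary", "_primary_first", "_replica", "_replica_first", "_local"}


def build_preference(
    mode: Optional[str] = None, values: Iterable[str] = (), shards: Iterable[int] = ()
) -> str:
    """Assemble a preference string, always placing the _shards part first."""
    parts: List[str] = []
    shard_list = list(shards)
    if shard_list:
        parts.append("_shards:" + ",".join(str(s) for s in shard_list))
    if mode is not None:
        value_list = list(values)
        if mode in NEEDS_VALUES:
            if not value_list:
                raise ValueError("%s needs at least one value" % mode)
            parts.append(mode + ":" + ",".join(value_list))
        elif mode in STANDALONE:
            if value_list:
                raise ValueError("%s does not take values" % mode)
            parts.append(mode)
        else:
            raise ValueError("unknown preference option: %s" % mode)
    if not parts:
        raise ValueError("give a mode, a shard list, or both")
    return "|".join(parts)


if __name__ == "__main__":
    print("GET index/_search?preference=" + build_preference("_primary", shards=[2, 3]))
```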

Querying fielddata

GET _cat/nodes?v&h=name,sm,fm&s=sm:desc
	sm: memory used by segments  fm: memory used by fielddata
GET _cat/fielddata?v&h=id,node,field,size&s=size:desc

# fielddata memory usage of the indices and their shards
GET /_stats/fielddata?fields=*
# fielddata memory usage of each node
GET /_nodes/stats/indices/fielddata?fields=*
# fielddata memory usage of each index on each node
GET /_nodes/stats/indices/fielddata?level=indices&fields=*
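The per-node response is verbose, so it helps to fold it into one total per field. The Python sketch below ranks fields by fielddata memory across all nodes from an already-parsed GET /_nodes/stats/indices/fielddata?fields=* response; the function name fielddata_by_field and the sample numbers are hypothetical, and it assumes the nodes → indices → fielddata → fields layout with memory_size_in_bytes values.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def fielddata_by_field(nodes_stats: Dict[str, Any]) -> List[Tuple[str, int]]:
    """Sum fielddata bytes per field over all nodes, largest consumer first."""
    totals: Counter = Counter()
    for node in nodes_stats.get("nodes", {}).values():
        fields = node.get("indices", {}).get("fielddata", {}).get("fields", {})
        for name, stats in fields.items():
            totals[name] += stats.get("memory_size_in_bytes", 0)
    # Sort by size descending, then by field name for a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical response fragment used only for illustration.
SAMPLE_STATS = {
    "nodes": {
        "id1": {
            "indices": {
                "fielddata": {
                    "fields": {
                        "user_id": {"memory_size_in_bytes": 2048},
                        "tags": {"memory_size_in_bytes": 512},
                    }
                }
            }
        },
        "id2": {
            "indices": {
                "fielddata": {"fields": {"user_id": {"memory_size_in_bytes": 1024}}}
            }
        },
    }
}

if __name__ == "__main__":
    for field, size in fielddata_by_field(SAMPLE_STATS):
        print(field, size)
```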

Querying the memory used by each index

Show how much ES memory each index uses:
GET _cat/indices?v&h=i,pri.memory.total,memory.total&s=memory.total:desc
Short form: GET _cat/indices?v&h=i,pri.memory.total,tm&s=tm:desc

Help: GET _cat/indices?help

health                           | h                              | current health status
status                           | s                              | open/close status
index                            | i,idx                          | index name
uuid                             | id,uuid                        | index uuid
pri                              | p,shards.primary,shardsPrimary | number of primary shards
rep                              | r,shards.replica,shardsReplica | number of replica shards
docs.count                       | dc,docsCount                   | available docs
docs.deleted                     | dd,docsDeleted                 | deleted docs
creation.date                    | cd                             | index creation date (millisecond value)
creation.date.string             | cds                            | index creation date (as string)
store.size                       | ss,storeSize                   | store size of primaries & replicas
pri.store.size                   |                                | store size of primaries
memory.total                     | tm,memoryTotal                 | total used memory
pri.memory.total                 |                                | total user memory

GET _cat/indices?v&h=i,p,r,dc,ss,tm,cds

Force merging segments (forcemerge)

POST index/_forcemerge
POST index/_forcemerge?max_num_segments=1
POST index/_forcemerge?only_expunge_deletes=true

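The force merge API accepts the following request parameters:

max_num_segments: the number of segments to merge to. To fully merge the index, set it to 1. By default ES simply checks whether a merge needs to execute and, if so, executes it.

only_expunge_deletes:
	In Lucene, a document is not deleted from a segment, only marked as deleted. During a segment merge, a new segment is created that does not contain those deletes.
	This flag restricts the merge to segments that contain deletes. Defaults to false. Note that it does not override the index.merge.policy.expunge_deletes_allowed threshold.
	index.merge.policy.expunge_deletes_allowed: defaults to 10. It is the percentage of deleted documents used to decide whether a segment is merged when expungeDeletes runs.

flush: whether a flush should be performed after the force merge. Defaults to true.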
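The expunge_deletes_allowed threshold can be read as a simple per-segment test. The Python sketch below assumes a segment qualifies only when its share of deleted documents is strictly above the threshold; the function name eligible_for_expunge is hypothetical and this is an illustration of the rule, not the merge policy's code.

```python
def eligible_for_expunge(max_doc: int, deleted_docs: int, allowed_pct: float = 10.0) -> bool:
    """Return True when the segment's deleted-document percentage exceeds allowed_pct."""
    if max_doc < 1:
        raise ValueError("max_doc must be a positive integer")
    if not 0 <= deleted_docs <= max_doc:
        raise ValueError("deleted_docs must be between 0 and max_doc")
    deleted_pct = 100.0 * deleted_docs / max_doc
    return deleted_pct > allowed_pct


if __name__ == "__main__":
    # 15% deleted: above the default threshold of 10, so the segment would be rewritten.
    print(eligible_for_expunge(1000, 150))
    # 5% deleted: below the threshold, so only_expunge_deletes leaves it alone.
    print(eligible_for_expunge(1000, 50))
```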

Viewing the default settings of an index

GET index/_settings?include_defaults=true
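With include_defaults=true the response carries two sections per index, the explicitly set settings and the defaults, so finding the value in effect means checking both. The Python sketch below looks a dotted key up in an already-parsed response; the function name effective_setting and the sample data are hypothetical, and it assumes the nested (non-flat) response layout.

```python
from typing import Any, Dict, Optional, Tuple


def effective_setting(
    response: Dict[str, Any], index: str, key: str
) -> Tuple[Optional[Any], Optional[str]]:
    """Return (value, section) for a dotted key, preferring explicit settings over defaults."""
    entry = response[index]
    for section in ("settings", "defaults"):
        node: Any = entry.get(section, {})
        # Walk the nested dictionaries one key segment at a time.
        for part in key.split("."):
            if not isinstance(node, dict) or part not in node:
                node = None
                break
            node = node[part]
        if node is not None:
            return node, section
    return None, None


# Hypothetical response fragment used only for illustration.
SAMPLE_SETTINGS = {
    "test": {
        "settings": {"index": {"number_of_shards": "5"}},
        "defaults": {"index": {"refresh_interval": "1s"}},
    }
}

if __name__ == "__main__":
    print(effective_setting(SAMPLE_SETTINGS, "test", "index.refresh_interval"))
```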
