1. Temporarily remove the node
To temporarily remove a node, update the cluster settings to exclude it, so that Elasticsearch no longer allocates any shards to that node (its existing shards will be moved off). Run the following command to exclude nodename3:
PUT /_cluster/settings
{"persistent": {"cluster.routing.allocation.exclude._name": "nodename3"}
}
This tells Elasticsearch not to allocate any shards to nodename3 and to relocate the shards it currently holds onto the other nodes. It is the first step in decommissioning a node gracefully.
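To confirm that the exclusion has actually been applied, you can read the cluster settings back. A minimal check (with flat_settings the keys are returned in dotted form):
GET /_cluster/settings?flat_settings=true
The persistent section of the response should contain "cluster.routing.allocation.exclude._name": "nodename3".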
2. Increase shard recovery concurrency (optional)
If you want the shards to be relocated faster, you can temporarily raise the cluster's recovery concurrency limits:
PUT /_cluster/settings
{"transient": {"cluster.routing.allocation.node_concurrent_recoveries": 4,"cluster.routing.allocation.node_initial_primaries_recoveries": 4}
}
These settings raise the number of shard recoveries each node may run concurrently, which speeds up reallocation. Note that this also increases the load on the nodes.
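If relocation is still slow, the per-node recovery bandwidth limit can also be raised temporarily. This is a hedged sketch, assuming your disks and network can absorb the extra traffic; the value 100mb is only an example (the default on 7.x is 40mb):
PUT /_cluster/settings
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "100mb"
  }
}
Like the settings above, this can be reset to null once the cluster is healthy again.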
Once the cluster status is back to green or yellow, you can revert the transient settings above to their defaults, as follows:
PUT /_cluster/settings
{"transient": {"cluster.routing.allocation.node_concurrent_recoveries": null,"cluster.routing.allocation.node_initial_primaries_recoveries": null}
}
Setting a value to null removes the transient setting and restores the default.
- Wait for shard reallocation to complete
After excluding the node, monitor the cluster to make sure the shards have been moved off nodename3 and onto the other nodes. Use the following command to check cluster health:
GET /_cluster/health?pretty
{"cluster_name" : "clustername","status" : "green","timed_out" : false,"number_of_nodes" : 2,"number_of_data_nodes" : 2,"active_primary_shards" : 24,"active_shards" : 48,"relocating_shards" : 0,"initializing_shards" : 0,"unassigned_shards" : 0,"delayed_unassigned_shards" : 0,"number_of_pending_tasks" : 0,"number_of_in_flight_fetch" : 0,"task_max_waiting_in_queue_millis" : 0,"active_shards_percent_as_number" : 100.0
}
Pay attention to the following fields:
unassigned_shards: the number of unassigned shards; it should drop as reallocation proceeds.
relocating_shards: the number of shards currently being moved; it should return to 0 once all shards have left the excluded node.
status: the cluster status should go from red to yellow or green, indicating that the shards have been allocated successfully.
In addition, you can use the following command to see the state and location of every shard:
GET /_cat/shards?v
This lists every shard and its current state; once reallocation has finished, nodename3 should no longer appear in the node column. A quick check for in-flight relocations is sketched after the sample output below.
index shard prirep state docs store ip node
.transform-internal-007 0 r STARTED 3 25.9kb 192.168.43.63 nodename3
.transform-internal-007 0 p STARTED 3 25.9kb 192.168.43.50 nodename2
.kibana_7.13.2_001 0 p STARTED 1456 4.4mb 192.168.43.185 nodename1
.kibana_7.13.2_001 0 r STARTED 1456 4.4mb 192.168.43.63 nodename3
.kibana-event-log-7.13.2-000001 0 r STARTED 7 22.7kb 192.168.43.185 nodename1
.kibana-event-log-7.13.2-000001 0 p STARTED 7 22.7kb 192.168.43.50 nodename2
.security-7 0 p STARTED 55 269.2kb 192.168.43.185 nodename1
.security-7 0 r STARTED 55 269.2kb 192.168.43.50 nodename2
.ds-ilm-history-5-2024.08.10-000001 0 r STARTED 192.168.43.185 nodename1
.ds-ilm-history-5-2024.08.10-000001 0 p STARTED 192.168.43.63 nodename3
.kibana_task_manager_7.13.2_001 0 r STARTED 11 290.9kb 192.168.43.185 nodename1
.kibana_task_manager_7.13.2_001 0 p STARTED 11 241.9kb 192.168.43.50 nodename2
.apm-custom-link 0 r STARTED 0 208b 192.168.43.63 nodename3
.apm-custom-link 0 p STARTED 0 208b 192.168.43.50 nodename2
.ds-.slm-history-5-2024.08.24-000001 0 p STARTED 192.168.43.185 nodename1
.ds-.slm-history-5-2024.08.24-000001 0 r STARTED 192.168.43.63 nodename3
.tasks 0 r STARTED 8 42.6kb 192.168.43.185 nodename1
.tasks 0 p STARTED 8 42.6kb 192.168.43.50 nodename2
.fleet-policies-7 0 p STARTED 2 8.5kb 192.168.43.185 nodename1
.fleet-policies-7 0 r STARTED 2 8.5kb 192.168.43.63 nodename3
.apm-agent-configuration 0 r STARTED 0 208b 192.168.43.63 nodename3
.apm-agent-configuration 0 p STARTED 0 208b 192.168.43.50 nodename2
abc 3 p STARTED 0 208b 192.168.43.63 nodename3
abc 3 r STARTED 0 208b 192.168.43.50 nodename2
abc 5 r STARTED 0 208b 192.168.43.185 nodename1
abc 5 p STARTED 0 208b 192.168.43.63 nodename3
abc 8 r STARTED 1 3.4kb 192.168.43.63 nodename3
abc 8 p STARTED 1 3.4kb 192.168.43.50 nodename2
abc 2 r STARTED 0 208b 192.168.43.185 nodename1
abc 2 p STARTED 0 208b 192.168.43.50 nodename2
abc 4 r STARTED 0 208b 192.168.43.63 nodename3
abc 4 p STARTED 0 208b 192.168.43.50 nodename2
abc 1 p STARTED 0 208b 192.168.43.185 nodename1
abc 1 r STARTED 0 208b 192.168.43.50 nodename2
abc 9 p STARTED 0 208b 192.168.43.185 nodename1
abc 9 r STARTED 0 208b 192.168.43.63 nodename3
abc 6 p STARTED 0 208b 192.168.43.63 nodename3
abc 6 r STARTED 0 208b 192.168.43.50 nodename2
abc 7 r STARTED 1 3.4kb 192.168.43.185 nodename1
abc 7 p STARTED 1 3.4kb 192.168.43.63 nodename3
abc 0 p STARTED 0 208b 192.168.43.185 nodename1
abc 0 r STARTED 0 208b 192.168.43.50 nodename2
.transform-notifications-000002 0 p STARTED 192.168.43.63 nodename3
.transform-notifications-000002 0 r STARTED 192.168.43.50 nodename2
.kibana_security_session_1 0 r STARTED 192.168.43.185 nodename1
.kibana_security_session_1 0 p STARTED 192.168.43.50 nodename2
metrics-endpoint.metadata_current_default 0 r STARTED 0 208b 192.168.43.185 nodename1
metrics-endpoint.metadata_current_default 0 p STARTED 0 208b 192.168.43.63 nodename3
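To watch relocations while they are still running, the recovery cat API can be filtered to active recoveries only (a minimal sketch; an empty result means no shard is currently relocating or initializing):
GET /_cat/recovery?v&active_only=true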
!!! If the cluster status stays red, it is very likely that some indices are themselves red. The usual cause is that those indices were created without replicas, so shard copies that lived only on the lost node cannot be reallocated. In that case the only option is to delete those indices and reindex the data; only then will the cluster return to yellow or green. This is why it is best to create every index with at least one replica, which makes this kind of maintenance much easier later on.
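To find out which indices are red and why their shards cannot be assigned, the following checks may help (a sketch; when called without a body, the allocation explain API picks an unassigned shard on its own and returns an error if none exist):
GET /_cat/indices?v&health=red
GET /_cluster/allocation/explain
The first call lists only the red indices; the explain output states the reason the unassigned shard cannot be allocated.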
3. Once the cluster status has gone from red back to yellow or green, you can shut down the node being decommissioned. Then update elasticsearch.yml on the remaining healthy nodes, one node at a time (typically removing the old node from the node lists there; a sketch follows below). You can restart each node right away, or simply let the change take effect the next time the node is restarted.
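A minimal sketch of the elasticsearch.yml change, assuming the decommissioned node was listed in discovery.seed_hosts (the IPs are taken from the sample output above and are only an example):
# Before: all three nodes listed
# discovery.seed_hosts: ["192.168.43.185", "192.168.43.50", "192.168.43.63"]
# After: the decommissioned nodename3 (192.168.43.63) removed
discovery.seed_hosts: ["192.168.43.185", "192.168.43.50"]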
4. If you later want a node with the failed node's original IP to rejoin the cluster, and that node no longer holds any of its old data, start it and wait for it to join. Then revoke the earlier exclusion so that Elasticsearch is allowed to allocate shards to the new nodename3 again:
PUT /_cluster/settings
{"persistent": {"cluster.routing.allocation.exclude._name": null}
}
Setting the exclusion back to null removes it, and Elasticsearch will start allocating shards to the new nodename3.
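To verify that the node has rejoined and is receiving shards again, the cat APIs used earlier can be reused (a quick check, nothing version-specific):
GET /_cat/nodes?v
GET /_cat/shards?v
nodename3 should appear in the node list, and after rebalancing some shards should show it in the node column again.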