I have been using SkyWalking for the past couple of days and ran into the following error:
2020-08-03 14:53:38 2020-08-03 06:53:38,984 - org.apache.skywalking.oap.server.starter.OAPServerBootstrap -10421 [main] ERROR [] - Elasticsearch exception [type=validation_exception, reason=Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2999]/[3000] maximum shards open;]
2020-08-03 14:53:38 Suppressed: org.elasticsearch.client.ResponseException: method [PUT], host [http://XXXXX:9202], URI [/alarm_record-20200803?master_timeout=30s&timeout=30s], status line [HTTP/1.1 400 Bad Request]
2020-08-03 14:53:38 {"error":{"root_cause":[{"type":"validation_exception","reason":"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2999]/[3000] maximum shards open;"}],"type":"validation_exception","reason":"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2999]/[3000] maximum shards open;"},"status":400}
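Judging from the log, the ES cluster has reached its limit on open shards (2999 of 3000), so the 2 shards needed for the new index alarm_record-20200803 cannot be created. The limit comes from the cluster.max_shards_per_node setting, multiplied by the number of data nodes.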
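To make the numbers in that message concrete, here is a minimal sketch that pulls the three values out of the error text and prints the remaining shard budget. The function name shard_headroom is a hypothetical helper of my own, not part of ES or SkyWalking, and it assumes the exact message format shown in the log above.

```shell
#!/bin/sh
# Hypothetical helper: parse the validation error and report the shard budget.
# Assumes the message format shown in the log above.
shard_headroom() {
  msg=$1
  # Extract the three bracketed numbers: shards to add, currently open, maximum.
  set -- $(printf '%s\n' "$msg" | sed -n 's/.*would add \[\([0-9]*\)\] total shards.*has \[\([0-9]*\)\]\/\[\([0-9]*\)\] maximum.*/\1 \2 \3/p')
  [ $# -eq 3 ] || { echo "unrecognized message" >&2; return 1; }
  add=$1 open=$2 max=$3
  echo "requested=$add open=$open max=$max free=$((max - open))"
}

msg='this action would add [2] total shards, but this cluster currently has [2999]/[3000] maximum shards open;'
shard_headroom "$msg"
```

With the message from the log, only one shard slot is free while the new index needs two, which is why the PUT request is rejected with a 400.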
After searching online and some trial and error, the fix that reliably worked is:
curl -XPUT -H "Content-Type:application/json" --user user:password -d '{"persistent":{"cluster":{"max_shards_per_node":10000}}}' 'http://ES-host:9202/_cluster/settings'
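The command only needs to be run once, against any one node in the ES cluster. Because authentication is enabled on my cluster, it includes the --user option with a username and password.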
Other approaches I found online, such as executing a bare JSON snippet, had no effect when I tried them; only the command above worked.
After the command runs, the following response indicates that it completed successfully:
{"acknowledged":true,"persistent":{"cluster":{"max_shards_per_node":"10000"}},"transient":{}}
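If the command is run from a script, the response can be checked rather than read by eye. The sketch below is one way to do that, under a few assumptions: is_acknowledged and read_back_setting are hypothetical helper names of my own, and ES-host, port 9202, and user:password are the same placeholders used in the command above. The read-back function relies on the filter_path query parameter and is only defined here, not called, since it needs a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the response body reports "acknowledged":true.
is_acknowledged() {
  printf '%s' "$1" | grep -q '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical helper: read the persisted value back from the cluster.
# Not called here because it needs a live ES; host and credentials are placeholders.
read_back_setting() {
  curl -s --user user:password \
    'http://ES-host:9202/_cluster/settings?filter_path=persistent.cluster.max_shards_per_node'
}

# Sample response copied from the output above.
response='{"acknowledged":true,"persistent":{"cluster":{"max_shards_per_node":"10000"}},"transient":{}}'
if is_acknowledged "$response"; then
  echo "setting applied"
else
  echo "setting NOT applied"
fi
```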
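ES does not need to be restarted, because the new limit takes effect as soon as the command is acknowledged. The indices that previously failed can then be created, so expect it to take some time before everything is back to normal.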