Describe the bug
We have encountered this bug multiple times, also before 2.16.0.
When cluster nodes are already above the low watermark, causing new indices to be distributed to other nodes, the cluster can go yellow. The cause appears to be that the default policy for system indices is auto_expand_replicas: "1-all", which tries to allocate replicas to nodes that cannot accept more data because of the watermark situation.
This seems to happen when Kubernetes reschedules OpenSearch nodes onto different k8s compute nodes.
It tries to allocate the replicas:
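(An allocation explanation like the one below is what the cluster allocation explain API returns; a minimal request for this shard, using the index and shard number shown in the response, would be something like:)

```
GET _cluster/allocation/explain
{
  "index": ".opendistro_security",
  "shard": 0,
  "primary": false
}
```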
```json
{
"index": ".opendistro_security",
"shard": 0,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
"reason": "CLUSTER_RECOVERED",
"at": "2024-09-12T14:45:27.211Z",
"last_allocation_status": "no_attempt"
},
"can_allocate": "no",
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions": [
{
"node_id": "02CeBVQKTa2lD1Qx0GAS3Q",
"node_name": "opensearch-data-nodes-hot-6",
"transport_address": "10.244.33.33:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [8.175061087167675%]"
}
]
},
{
"node_id": "Balhhxf2T2uNpUP6rq88Ag",
"node_name": "opensearch-data-nodes-hot-2",
"transport_address": "10.244.86.36:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [9.615515861288957%]"
}
]
},
{
"node_id": "DppvPjxgR0u8CVQVyAX0UA",
"node_name": "opensearch-data-nodes-hot-7",
"transport_address": "10.244.97.29:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "same_shard",
"decision": "NO",
"explanation": "a copy of this shard is already allocated to this node [[.opendistro_security][0], node[DppvPjxgR0u8CVQVyAX0UA], [R], s[STARTED], a[id=Q9PoLV1wRGumidM22EKveQ]]"
},
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [12.463841799195983%]"
}
]
},
{
"node_id": "LQSYXzHbTfqowAOj3nrU3w",
"node_name": "opensearch-data-nodes-hot-4",
"transport_address": "10.244.70.30:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [7.916677463242952%]"
}
]
},
{
"node_id": "Ls8ptyo7ROGtFeO8hY5c5Q",
"node_name": "opensearch-data-nodes-hot-9",
"transport_address": "10.244.54.37:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "same_shard",
"decision": "NO",
"explanation": "a copy of this shard is already allocated to this node [[.opendistro_security][0], node[Ls8ptyo7ROGtFeO8hY5c5Q], [R], s[STARTED], a[id=j_FrjkN7R0aCEokKa4tjCA]]"
}
]
},
{
"node_id": "O_CCkTbmRtiuJU3cV93EaA",
"node_name": "opensearch-data-nodes-hot-1",
"transport_address": "10.244.83.46:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [8.445263138130201%]"
}
]
},
{
"node_id": "OfBmEaQsSsuJtJ4TKadLnQ",
"node_name": "opensearch-data-nodes-hot-10",
"transport_address": "10.244.37.46:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [11.538695394244522%]"
}
]
},
{
"node_id": "RC5KMwpWRMCVrGaF_7oGBA",
"node_name": "opensearch-data-nodes-hot-0",
"transport_address": "10.244.99.67:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [12.185368398769644%]"
}
]
},
{
"node_id": "S_fk2yqhQQuby8HM4hJXVA",
"node_name": "opensearch-data-nodes-hot-8",
"transport_address": "10.244.45.64:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [10.432421573093784%]"
}
]
},
{
"node_id": "_vxbOtloQmapzz0DbXBsjA",
"node_name": "opensearch-data-nodes-hot-5",
"transport_address": "10.244.79.58:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "same_shard",
"decision": "NO",
"explanation": "a copy of this shard is already allocated to this node [[.opendistro_security][0], node[_vxbOtloQmapzz0DbXBsjA], [P], s[STARTED], a[id=hY9WcHR-S_6TN3kTj4NZJA]]"
}
]
},
{
"node_id": "pP5muAyTSA2Z45yO8Ws0VA",
"node_name": "opensearch-data-nodes-hot-3",
"transport_address": "10.244.101.66:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [9.424146099675534%]"
}
]
},
{
"node_id": "zRdO9ndKSbuJ97t77-OLLw",
"node_name": "opensearch-data-nodes-hot-11",
"transport_address": "10.244.113.26:9300",
"node_attributes": {
"temp": "hot",
"shard_indexing_pressure_enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "same_shard",
"decision": "NO",
"explanation": "a copy of this shard is already allocated to this node [[.opendistro_security][0], node[zRdO9ndKSbuJ97t77-OLLw], [R], s[STARTED], a[id=O7z4RvkiQXGMcfhRSPm8lQ]]"
},
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=87%], using more disk space than the maximum allowed [87.0%], actual free: [11.883587901703455%]"
}
]
}
]
}
```
So if we have 12 data nodes, it tries to allocate 11 replicas when a node restarts. That fails because several nodes are above the low watermark (why not distribute the free space more evenly?). The only solutions seem to be to lower the auto-expand setting or to manually redistribute shards across the nodes to even out disk space usage.
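A rough sketch of those two workarounds (not verified against the security plugin, which may manage settings on its own system indices; "some-large-index" and the node names in the reroute command are placeholders):

```
PUT /.opendistro_security/_settings
{
  "index": {
    "auto_expand_replicas": "1-3"
  }
}

POST _cluster/reroute
{
  "commands": [
    {
      "move": {
        "index": "some-large-index",
        "shard": 0,
        "from_node": "opensearch-data-nodes-hot-4",
        "to_node": "opensearch-data-nodes-hot-9"
      }
    }
  ]
}
```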
Cluster storage state:
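(The columns below look like `_cat/nodes` output; a request along these lines should reproduce them, though the exact column selection is our guess:)

```
GET _cat/nodes?v&h=name,id,version,node.role,ram.percent,disk.total,disk.used,disk.used_percent,heap.percent,load_1m,load_5m,load_15m
```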
```
name id version node.role ram.percent disk.total disk.used disk.used_percent heap.percent load_1m load_5m load_15m
opensearch-master-nodes-0 twM5 2.16.0 m 60 9.5gb 518.1mb 5.32 56 1.74 1.38 1.17
opensearch-data-nodes-hot-5 _vxb 2.16.0 d 96 960.1gb 649.6gb 67.66 41 1.14 1.14 1.10
opensearch-master-nodes-2 nQD7 2.16.0 m 59 9.5gb 518.1mb 5.32 37 1.15 1.06 1.09
opensearch-data-nodes-hot-11 zRdO 2.16.0 d 92 960.1gb 859gb 89.47 31 2.33 3.13 3.62
opensearch-data-nodes-hot-6 02Ce 2.16.0 d 90 960.1gb 848.5gb 88.38 62 1.40 1.40 1.60
opensearch-data-nodes-hot-4 LQSY 2.16.0 d 95 960.1gb 886.5gb 92.33 35 2.33 2.40 2.56
opensearch-data-nodes-hot-10 OfBm 2.16.0 d 96 960.1gb 861.7gb 89.75 58 3.69 4.27 4.21
opensearch-ingest-nodes-0 bx4Z 2.16.0 i 65 19gb 1016mb 5.21 73 2.31 2.60 2.54
opensearch-data-nodes-hot-3 pP5m 2.16.0 d 61 960.1gb 869.6gb 90.58 35 1.71 1.64 1.89
opensearch-data-nodes-hot-9 Ls8p 2.16.0 d 95 960.1gb 643.2gb 66.99 27 0.72 1.00 1.02
opensearch-data-nodes-hot-7 Dppv 2.16.0 d 91 960.1gb 842.4gb 87.74 53 1.29 1.87 1.74
opensearch-data-nodes-hot-2 Balh 2.16.0 d 63 960.1gb 867.8gb 90.38 31 1.93 1.73 1.45
opensearch-data-nodes-hot-8 S_fk 2.16.0 d 64 960.1gb 859.9gb 89.57 42 0.66 0.66 0.71
opensearch-data-nodes-hot-1 O_CC 2.16.0 d 89 960.1gb 884.9gb 92.17 11 1.53 1.48 1.33
opensearch-data-nodes-hot-0 RC5K 2.16.0 d 85 960.1gb 844.8gb 87.99 62 0.77 0.90 1.10
opensearch-master-nodes-1 r70_ 2.16.0 m 58 9.5gb 518.1mb 5.32 58 0.76 0.88 1.05
opensearch-ingest-nodes-1 NX1N 2.16.0 i 61 19gb 1016mb 5.21 17 0.49 1.12 1.77
```
Related component
Storage
To Reproduce
Cluster is nearing capacity (good from a storage cost perspective)
Cluster gets rebooted or individual nodes get rebooted
Cluster goes to yellow state
Expected behavior
Rebalance shards proactively based on the storage usage of nodes
System indices might take priority, ignoring the low/high watermark until cluster disk usage becomes truly critical (as a manual stopgap today, the watermarks can be raised temporarily; see the sketch below)
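A minimal sketch of that stopgap, assuming transient cluster settings are acceptable in the environment (and are reset once disk pressure is resolved):

```
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%"
  }
}
```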
Additional Details
Plugins
Default
Screenshots
N/A
Host/Environment (please complete the following information):
Default 2.16.0 docker images
Additional context
N/A