Expected Behavior
When an EDS needs to both scale and be rolled, because of an update to the EDS or a cluster update, the operator should prefer to scale the EDS rather than roll the pods, as extra capacity might be more important than, e.g., moving pods to new cluster nodes.
Actual Behavior
Currently the operator always checks whether any pods need to be drained before it considers scaling the EDS. This means that if you have an EDS with, say, 20 pods and a cluster update is ongoing, you could wait for all 20 pods to be upgraded before a potential scale-up is applied. We saw this in production, where an EDS was stuck at 35 pods although the autoscaler recommended scaling to 48.
Proposed solution
Scale Up
I propose that we always favor scale-up over draining pods for a rolling upgrade. That is, if eds.Spec.Replicas > sts.Spec.Replicas, then rescale the STS before doing anything else.
Scale Down
Scale-down should generally also be favored over a rolling upgrade, because it is pointless to upgrade pods that would be scaled down anyway. However, it might make sense to favor moving pods off draining nodes before scaling down, to ensure that a pod is moved before its node is forcefully terminated.
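To make the proposed ordering concrete, here is a rough sketch of how the decision could look. It is not the operator's actual code: the `State` struct, the `Action` values, and `nextAction` are hypothetical names made up for illustration, and only the comparison between eds.Spec.Replicas and sts.Spec.Replicas comes from the proposal itself. Putting draining nodes ahead of scale-down is just one way to read the "might make sense" part above.

```go
package main

import "fmt"

// Action is the next step the operator would take (hypothetical type).
type Action string

const (
	ActionScaleUp   Action = "scale-up"
	ActionMovePod   Action = "move-pod-off-draining-node"
	ActionScaleDown Action = "scale-down"
	ActionRollPod   Action = "roll-pod"
	ActionNone      Action = "none"
)

// State is a simplified, hypothetical view of what the operator observes.
type State struct {
	EDSReplicas         int32 // desired replicas, i.e. eds.Spec.Replicas
	STSReplicas         int32 // current replicas, i.e. sts.Spec.Replicas
	PodsOnDrainingNodes int   // pods running on nodes that are being drained
	PodsNeedingUpdate   int   // pods that do not match the current spec
}

// nextAction returns the highest-priority action for the given state.
func nextAction(s State) Action {
	switch {
	case s.EDSReplicas > s.STSReplicas:
		// Extra capacity first: never block a scale-up on a rolling upgrade.
		return ActionScaleUp
	case s.PodsOnDrainingNodes > 0:
		// Move pods before their node is forcefully terminated.
		return ActionMovePod
	case s.EDSReplicas < s.STSReplicas:
		// Scale down before rolling, so no pod is upgraded just to be removed.
		return ActionScaleDown
	case s.PodsNeedingUpdate > 0:
		return ActionRollPod
	default:
		return ActionNone
	}
}

func main() {
	// Replica counts from the report above (35 running, 48 recommended);
	// the number of pods needing an update is made up.
	s := State{EDSReplicas: 48, STSReplicas: 35, PodsNeedingUpdate: 35}
	fmt.Println(nextAction(s))
}
```

With this ordering the EDS from the production case would be rescaled to 48 replicas before any of its pods are rolled. A refinement the sketch leaves out: a pod on a draining node that would be removed by the scale-down anyway does not need to be moved first.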