Earlier changes to how we label our AWS AutoScaling Groups (ASGs), and to which labels the Kubernetes cluster-autoscaler uses to automatically detect these ASGs, caused the autoscaler to stop working properly. As a result, clusters could fail to remove unneeded nodes automatically, or fail to add nodes when more capacity was needed.
We have reviewed all our setups and made sure the labels on the ASGs match the autodetection labels the autoscaler looks for, independent of upstream changes. This fix has been rolled out across all our managed clusters.
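
To illustrate the kind of mismatch described above, here is a minimal Python sketch that reports which ASGs lack the labels the autoscaler is configured to discover. It is not our actual tooling. The function names and the ASG and cluster names are hypothetical, the label keys follow the convention documented upstream for cluster-autoscaler auto-discovery on AWS, and the sketch assumes discovery matches on label keys only, not on values.

```python
from typing import Dict, Iterable, List


def missing_discovery_tags(asg_tags: Dict[str, str],
                           required_keys: Iterable[str]) -> List[str]:
    """Return the required label keys that are absent from one ASG's labels."""
    return [key for key in required_keys if key not in asg_tags]


def undiscoverable_asgs(asgs: Dict[str, Dict[str, str]],
                        required_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Map each ASG that would not be auto-detected to its missing label keys.

    `asgs` maps an ASG name to its labels (key -> value). An ASG is treated
    as discoverable only if it carries every key in `required_keys`.
    """
    required = list(required_keys)
    report: Dict[str, List[str]] = {}
    for name, tags in asgs.items():
        missing = missing_discovery_tags(tags, required)
        if missing:
            report[name] = missing
    return report


if __name__ == "__main__":
    # Hypothetical cluster name and ASGs, for illustration only.
    required = [
        "k8s.io/cluster-autoscaler/enabled",
        "k8s.io/cluster-autoscaler/example-cluster",
    ]
    asgs = {
        "workers-a": {
            "k8s.io/cluster-autoscaler/enabled": "true",
            "k8s.io/cluster-autoscaler/example-cluster": "owned",
        },
        # This ASG carries an outdated key, so the autoscaler would skip it.
        "workers-b": {
            "k8s.io/cluster-autoscaler/enabled": "true",
            "kubernetes.io/cluster/example-cluster": "owned",
        },
    }
    for name, missing in undiscoverable_asgs(asgs, required).items():
        print(f"{name} is missing: {', '.join(missing)}")
```

In practice the label data would come from the AWS API rather than a hard-coded dictionary. The check itself stays the same: every key the autoscaler searches for must be present on every ASG it is meant to manage.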
As a next step, we’re working on additional monitoring to ensure that issues with the autoscaler are detected promptly.