We’ve improved the looks and the content of the EC2 instance interruption notifications that we receive in Slack.
On clusters that run on spot instances, AWS can reclaim an instance at any moment, and when that happens the Pods running on it need to be rescheduled onto other available nodes. To do this cleanly, a DaemonSet (aws-node-termination-handler) runs on the spot nodes, handles the spot termination events, and tries to get the Pods on a reclaimed node rescheduled before the node is terminated. This DaemonSet is also responsible for posting notifications about these events to Slack.
The default notifications from aws-node-termination-handler are not very structured and don’t identify the affected cluster. We’ve updated the notification format to make the messages more informative and more useful.
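To give a rough idea of what a more structured notification that names the cluster could look like, here is a minimal Python sketch that builds a Slack incoming-webhook payload. It is an illustration only, not the format we actually deployed and not part of aws-node-termination-handler: the function name `build_slack_payload`, the event field names (`instance_id`, `instance_type`, `kind`, `start_time`) and the sample values are all hypothetical.

```python
import json


def build_slack_payload(cluster: str, event: dict) -> dict:
    """Build a Slack incoming-webhook payload for an interruption event.

    `event` is a hypothetical dict; its keys are assumptions made for
    this sketch, not fields defined by aws-node-termination-handler.
    """
    details = "\n".join([
        f"*Cluster:* {cluster}",
        f"*Instance:* {event['instance_id']} ({event['instance_type']})",
        f"*Event:* {event['kind']}",
        f"*Start time:* {event['start_time']}",
    ])
    return {
        # Plain-text fallback shown in notifications and older clients.
        "text": f"EC2 instance interruption in cluster {cluster}",
        "blocks": [
            {
                "type": "header",
                "text": {"type": "plain_text", "text": "EC2 instance interruption"},
            },
            {
                "type": "section",
                "text": {"type": "mrkdwn", "text": details},
            },
        ],
    }


if __name__ == "__main__":
    # Sample values are made up for illustration.
    sample_event = {
        "instance_id": "i-0123456789abcdef0",
        "instance_type": "m5.large",
        "kind": "SPOT_ITN",
        "start_time": "2024-01-01T00:00:00Z",
    }
    print(json.dumps(build_slack_payload("example-cluster", sample_event), indent=2))
```

The key point of the sketch is that the cluster name appears both in the fallback text and in the structured section, so a reader can tell at a glance which cluster is affected.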
If notifications are configured for your cluster, you’ll see them pop up in the Slack channel you share with us. If you’re not seeing them and would like to, get in touch with us and we’ll configure them for your cluster(s).