Increased monitoring alerts visibility

Over the coming days we’re going to roll out some changes to how Kubernetes monitoring notifications are delivered. From now on, all notifications coming from the production k8s monitoring system will also be shown in our shared Slack channel, that is, the channel we share with each of our customers. The existing notification channels will continue to work as before. Here’s an overview of how notifications will work:

  • critical infrastructure notifications will be delivered to:
    • Skyscrapers on-call alerting system
    • Shared Slack channel with customer
  • warning infrastructure notifications will be delivered to:
    • Skyscrapers internal Slack channel
    • Shared Slack channel with customer
  • info infrastructure notifications will be muted. They’ll still show up in both the Prometheus and Alertmanager dashboards.
  • application alert notifications will be delivered to:
    • Customer internal Slack channel
    • Shared Slack channel with customer
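
To make the overview above easier to check at a glance, here is a minimal Python sketch of the same routing for production alerts. It is only an illustration, not the actual Alertmanager configuration: the receiver names and the `receivers_for` helper are hypothetical and chosen for readability.

```python
# Hypothetical receiver names; the real Alertmanager receivers may be named differently.
ON_CALL = "skyscrapers-on-call"
INTERNAL_SLACK = "skyscrapers-internal-slack"
CUSTOMER_SLACK = "customer-internal-slack"
SHARED_SLACK = "shared-slack"

# Infrastructure alerts are routed by severity, following the overview above.
INFRA_ROUTES = {
    "critical": [ON_CALL, SHARED_SLACK],
    "warning": [INTERNAL_SLACK, SHARED_SLACK],
    "info": [],  # muted; still visible in the Prometheus and Alertmanager dashboards
}

# Application alerts go to the same receivers regardless of severity.
APPLICATION_ROUTES = [CUSTOMER_SLACK, SHARED_SLACK]


def receivers_for(kind, severity=None):
    """Return the receivers notified for a production alert.

    kind is "infrastructure" or "application"; severity is only
    used for infrastructure alerts.
    """
    if kind == "application":
        return list(APPLICATION_ROUTES)
    if kind == "infrastructure":
        if severity not in INFRA_ROUTES:
            raise ValueError(f"unknown severity: {severity!r}")
        return list(INFRA_ROUTES[severity])
    raise ValueError(f"unknown alert kind: {kind!r}")


if __name__ == "__main__":
    for kind, severity in [
        ("infrastructure", "critical"),
        ("infrastructure", "warning"),
        ("infrastructure", "info"),
        ("application", None),
    ]:
        print(kind, severity, "->", receivers_for(kind, severity))
```

In this sketch the shared Slack channel appears in every route that is not muted, which is the change being rolled out.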

The goal of these changes is to increase transparency into what’s happening in the cluster, encourage discussion, and improve reaction time.

We decided not to apply these changes to staging notifications for now, as we considered they would be too verbose and would clutter the shared Slack channel.

As always, we’re open to suggestions and feedback.