Hotfix for Grafana and InfoInhibitor alert info

There are no actions to take, and all changes have been rolled out to all environments.

Grafana datasources missing

Last Monday (28/02), we pushed the latest monitoring upgrades to production environments. Soon afterwards, some customers noticed missing Grafana datasources (e.g. Prometheus and Loki). This was the result of a race condition between Grafana startup and the datasource reloader sidecar.

On Tuesday we deployed a hotfix for this issue to all clusters. As a side effect, we have disabled (for now) the ability to specify custom Grafana datasources via ConfigMap. This feature was neither documented nor in use in any of our managed environments.

InfoInhibitor alert

Meanwhile, you might also have noticed new InfoInhibitor alerts in your channels. This is a special alert that fires whenever severity: info alerts are active, and it is used to inhibit them. The goal is that you do not get notified about info alerts unless other alerts with a higher severity are firing in the same namespace.
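The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the inhibition logic, not the actual Alertmanager configuration we deploy: the `Alert` class, the function names and the `shop` and `billing` namespaces are hypothetical, and the alert names are just examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    name: str
    severity: str   # "info", "warning" or "critical"
    namespace: str


def info_inhibitor_namespaces(alerts):
    """Namespaces where InfoInhibitor would be firing.

    Assumption: it fires when a namespace has info alerts and
    no warning or critical alert at the same time.
    """
    info = {a.namespace for a in alerts if a.severity == "info"}
    severe = {a.namespace for a in alerts if a.severity in ("warning", "critical")}
    return info - severe


def notified(alerts):
    """Alerts that would still reach a channel after inhibition."""
    quiet = info_inhibitor_namespaces(alerts)
    return [a for a in alerts if not (a.severity == "info" and a.namespace in quiet)]


# Hypothetical example: "shop" only has an info alert, "billing" also has a warning.
example = [
    Alert("CPUThrottlingHigh", "info", "shop"),
    Alert("CPUThrottlingHigh", "info", "billing"),
    Alert("KubePodCrashLooping", "warning", "billing"),
]
```

With this input, `notified(example)` drops the info alert in `shop` and keeps both alerts in `billing`, because the warning there makes the info alert relevant.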

We had initially “muted” this alert, as intended, but we only applied this to the infrastructure namespaces managed by us. As a result, on several environments this alert started firing for customer-managed namespaces. We have now deployed an updated config to properly silence and inhibit this alert for all namespaces.

More info can be found in the InfoInhibitor’s runbook.

Extra

Additional updates deployed as part of this release:
