Monitoring upgrades

As part of our regular upgrade cycle, the following Kubernetes cluster components will be updated in the next rollout. These updates are being rolled out to all clusters and will be finished by the end of the week.

More …

Improved monitoring alerts on Slack

We have updated the format of the monitoring Slack notifications. Warning and critical alerts are now distinguished by color, and each alert shows the full description instead of the summary, so the notification itself carries more information.

More …
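
As a rough illustration of the change, the sketch below builds a Slack message from an alert. It is not our actual notification template: the function name, the `SEVERITY_COLORS` mapping and the alert layout (a `severity` label plus `summary` and `description` annotations) are assumptions made for the example. The color names `warning` and `danger` are the predefined values accepted by Slack's legacy message attachments.

```python
# Hypothetical mapping from alert severity to a Slack attachment color.
SEVERITY_COLORS = {"warning": "warning", "critical": "danger"}


def build_slack_payload(alert: dict) -> dict:
    """Build a Slack payload for one alert (illustrative only)."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    severity = labels.get("severity", "warning")
    # Prefer the full description; fall back to the summary if it is missing.
    text = annotations.get("description") or annotations.get("summary", "")
    return {
        "attachments": [
            {
                # Unknown severities get a neutral grey.
                "color": SEVERITY_COLORS.get(severity, "#808080"),
                "title": f"[{severity.upper()}] {labels.get('alertname', 'unknown alert')}",
                "text": text,
            }
        ]
    }


example_alert = {
    "labels": {"alertname": "DiskAlmostFull", "severity": "critical"},
    "annotations": {
        "summary": "Disk almost full",
        "description": "Less than 10% disk space left on the data volume.",
    },
}
payload = build_slack_payload(example_alert)
```

With the example alert, the attachment is colored as `danger` and carries the description text rather than the shorter summary.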

Fixed regression in Elasticsearch monitoring for Prometheus

We discovered an issue with our Elasticsearch monitoring for Prometheus that was introduced a while back in a routine chart upgrade. Because of this problem, some Elasticsearch metrics, such as available storage space, were not being reported to Prometheus. As a result, there were some problematic situations in an Elasticsearch cluster that we didn’t pick up in time.

More …

Velero S3 backups replication

Our reference solution for EKS-based Velero backups on AWS S3 now supports automatic replication to an additional S3 bucket in an AWS region of your choice. The feature is disabled by default; contact your lead engineer to discuss enabling it for your cluster(s) if needed.

Monitoring upgrades

As part of our regular upgrade cycle, the following Kubernetes cluster components have been updated. These updates are being rolled out to all clusters and will be finished by the end of the week.

More …

Grafana main dashboard updated

We have fixed a problem in our main Grafana dashboard. Previously, we counted the resources of all pods, including those that had already completed, which gave an incorrect picture of cluster usage. We now filter out failed and succeeded pods, so the dashboard shows cluster usage more accurately.

More …
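
The effect of the fix can be sketched in a few lines. This is only an illustration of the filtering idea, not the dashboard query itself: the pod records, field names and function name are hypothetical. `Succeeded` and `Failed` are the Kubernetes pod phases for pods that have terminated.

```python
# Pod phases that mean the pod has terminated and no longer uses resources.
FINISHED_PHASES = {"Succeeded", "Failed"}


def active_cpu_requests(pods: list[dict]) -> int:
    """Sum CPU requests (in millicores), skipping finished pods."""
    return sum(
        pod["cpu_request_millicores"]
        for pod in pods
        if pod["phase"] not in FINISHED_PHASES
    )


# Hypothetical sample data: two long-running pods and two finished jobs.
pods = [
    {"name": "web-1", "phase": "Running", "cpu_request_millicores": 500},
    {"name": "job-1", "phase": "Succeeded", "cpu_request_millicores": 1000},
    {"name": "job-2", "phase": "Failed", "cpu_request_millicores": 250},
    {"name": "web-2", "phase": "Pending", "cpu_request_millicores": 500},
]

# Old behavior: every pod is counted, finished or not.
naive_total = sum(pod["cpu_request_millicores"] for pod in pods)
# New behavior: finished pods are filtered out first.
filtered_total = active_cpu_requests(pods)
```

In this sample, counting every pod gives 2250 millicores, while filtering out the finished jobs gives 1000, which is the figure that reflects what the cluster is actually reserving.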