Improved EC2 instance interruption notifications
We’ve improved the look and the content of the EC2 instance interruption notifications that we receive in Slack.
We have started rolling out AKS and EKS 1.21. This brings both of our supported platforms, AKS and EKS, to Kubernetes v1.21.2.
We have upgraded Istio on all clusters that use it, from version 1.10.0 to 1.11.2. The new version comes with some features meant for operators and no breaking changes that you should be concerned about.
We have muted the critical KubeAPIErrorBudgetBurn alerts.
During the last year we have tested the Vertical Pod Autoscaler (VPA) on several of our workloads and with several customers. The results were positive, so we decided to roll out the VPA on all our clusters.
Last month we rolled out a major Grafana update, going from 7.5 to 8.1. While everything initially looked in order, some customers experienced issues with custom dashboards that worked perfectly in the previous release. The main problem is that data coming from SQL data sources and visualized through the “old” graph panel is sorted differently or rendered completely wrong.
We’ve upgraded Cert-manager to version 1.4.4 on all our Kubernetes clusters. This patch upgrade contains a bug fix for a renewal time issue that affected some of our clusters.
We are in the process of upgrading our Kubernetes-based Vault setups to the latest version, 1.8.2.
We are in the process of upgrading our Kubernetes-based Vault setups to the latest version, 1.8.1.
As part of our regular upgrade cycle, the following Kubernetes cluster components have been updated. We’ve already rolled these out to all clusters.
To optimize the resources used by the infrastructure components running on our Reference Solution Kubernetes platforms as much as possible, we’ve considerably reduced the memory used by the Cluster Autoscaler by optimizing its configuration. This means that the autoscaler will now run more reliably, and that there’ll be a bit more memory available for other workloads running on the K8s clusters.
We’ve upgraded all Teleport clusters to version 6.2.8. Coming from version 4.x, this is a (double) major release that brings many new features:
We have upgraded our Concourse setups to the latest version 7.3.2.
We’re switching to encrypted volumes for all our Prometheus and Alertmanager setups. These were the last of our managed infrastructure components to receive encryption at rest.
To improve the resilience of CoreDNS during upgrades, we have added a Pod Disruption Budget (PDB) for the CoreDNS pods. The CoreDNS deployment already has a proper update strategy and anti-affinity applied; however, the extra PDB prevents a situation in which no CoreDNS pods are available in the cluster during a rolling upgrade.
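As an illustration only (this is not our actual manifest), a PDB of this kind could be generated roughly as in the sketch below. The `policy/v1` API version, the `kube-system` namespace, the `k8s-app: kube-dns` label selector, the object name and the `minAvailable` value of 1 are all assumptions chosen for the example, and `coredns_pdb` is a hypothetical helper name.

```python
import json


def coredns_pdb(min_available=1):
    """Build a PodDisruptionBudget manifest as a plain dict.

    All values are assumptions for illustration: the namespace, the
    label selector and the object name may differ on a given cluster.
    """
    return {
        "apiVersion": "policy/v1",  # assumed; older clusters use policy/v1beta1
        "kind": "PodDisruptionBudget",
        "metadata": {"name": "coredns", "namespace": "kube-system"},
        "spec": {
            # Voluntary evictions are refused if they would leave fewer
            # than this many matching pods available.
            "minAvailable": min_available,
            "selector": {"matchLabels": {"k8s-app": "kube-dns"}},
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(coredns_pdb(), indent=2))
```

The printed JSON could be piped to `kubectl apply -f -`. Using `minAvailable` rather than `maxUnavailable` is a design choice for the sketch: it states directly that at least one CoreDNS pod must stay up while nodes are drained.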
Telepresence is an open source tool that allows cluster users and operators to access cluster services and resources as if they were running in a local network. It also allows developers to debug Kubernetes services locally and run local services as if they were on the cluster network. You can read more on how it works and what it can do in its documentation.
We’re making the use of encryption at rest our default. For this we switched our default PV storageclass on K8s from gp2 to gp2-encrypted.
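For readers who want a picture of what such a storage class might look like, here is a minimal sketch. It is not our actual definition: only the name gp2-encrypted comes from the text above, while the in-tree `kubernetes.io/aws-ebs` provisioner, the parameters and the default-class annotation are assumptions, and `encrypted_gp2_storage_class` is a hypothetical helper name.

```python
import json


def encrypted_gp2_storage_class(name="gp2-encrypted", default=True):
    """Build a StorageClass manifest for encrypted gp2 EBS volumes.

    Everything except the class name is an assumption for illustration.
    """
    annotations = {}
    if default:
        # Standard Kubernetes annotation that marks the default class.
        annotations["storageclass.kubernetes.io/is-default-class"] = "true"
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name, "annotations": annotations},
        "provisioner": "kubernetes.io/aws-ebs",  # assumed in-tree EBS provisioner
        # StorageClass parameters are strings, hence "true" and not True.
        "parameters": {"type": "gp2", "encrypted": "true"},
    }


if __name__ == "__main__":
    print(json.dumps(encrypted_gp2_storage_class(), indent=2))
```

Note that changing the default storage class only affects newly created PersistentVolumeClaims; existing volumes keep the class they were provisioned with.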
We’re making the use of encryption at rest our default and only option for OpenVPN storage. Until now it was enabled by default, but it was still possible to disable it.
We’ve added support for monitoring both self-hosted RabbitMQ and AmazonMQ for RabbitMQ clusters through Prometheus. The monitoring coverage differs slightly between the two services.
We have added initial support for Milvus. It is available as an optional add-on for all customers using our Kubernetes Reference Solution on AWS.