Faced with DockerHub’s rate limits and the need for more integrated automation, we decided to migrate our Docker images from DockerHub to GitHub Container Registry (GHCR). This transition not only avoids DockerHub’s constraints but also leverages GitHub Actions for seamless build automation.
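For consumers of the images, the migration mostly means pointing image references at `ghcr.io` instead of DockerHub. The following is a minimal sketch of such a rewrite, not the tooling we used: the function name `to_ghcr` and the owner `example-org` are hypothetical, and it assumes the image name and tag stay the same after the move.

```python
def to_ghcr(image: str, ghcr_owner: str) -> str:
    """Rewrite a DockerHub image reference to a GHCR reference.

    Assumes (hypothetically) that the image keeps its name and tag and is
    published under a single GHCR owner.
    """
    # Strip an explicit DockerHub registry prefix if one is present.
    for prefix in ("docker.io/", "index.docker.io/"):
        if image.startswith(prefix):
            image = image[len(prefix):]
            break
    # Keep only "name:tag" (or "name@digest"); drop the DockerHub namespace.
    name = image.rsplit("/", 1)[-1]
    # GHCR repository paths are lowercase, so normalise the owner.
    return f"ghcr.io/{ghcr_owner.lower()}/{name}"


if __name__ == "__main__":
    print(to_ghcr("example-org/app:1.2.3", "example-org"))
```

References in Kubernetes manifests, Helm values and CI pipelines all need the same change, so a small helper like this can be handy for bulk updates.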
More …
We are rolling out EKS v1.30. Please make sure to update to our recommended client versions matching this upgrade.
More …
We’ve upgraded several components mid-cycle, most notably Karpenter, to fix several bugs. This resolves the AWS notice regarding a defect in Karpenter’s AMI drift detection logic, which could lead to unexpected and unnecessary node replacements. We have rolled out the update to all our managed environments.
More …
We are excited to announce that we have configured the CloudWatch integration in Grafana by default. This allows you to query and visualize your AWS CloudWatch metrics in Grafana. This is especially useful if you are already using Grafana for monitoring and want to have all your metrics in one place.
More …
We are excited to introduce Flux, a powerful GitOps tool for Kubernetes, to our platform. It is designed to keep your Kubernetes clusters in sync with the configuration in Git and to automate configuration updates when it detects changes. If you think this could be useful for your team, get in touch with us so we can enable it on your cluster(s) and offer you guidance and training on how to leverage it for your use case.
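To illustrate the GitOps idea of keeping a cluster in sync with Git, here is a minimal sketch of a reconciliation plan. It is not how Flux is implemented: the function `plan_sync` and the resource names are hypothetical, and each resource is reduced to a name mapped to a content hash.

```python
def plan_sync(desired: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Compare the state declared in Git with the state in the cluster.

    Both arguments map a resource name to a hash of its manifest
    (a hypothetical simplification for illustration).
    """
    return {
        # Resources that are missing from the cluster or differ from Git.
        "apply": sorted(name for name, digest in desired.items()
                        if actual.get(name) != digest),
        # Resources in the cluster that are no longer declared in Git.
        "delete": sorted(name for name in actual if name not in desired),
    }


if __name__ == "__main__":
    git_state = {"deployment/web": "a1", "service/web": "b2"}
    cluster_state = {"deployment/web": "a0", "configmap/old": "c3"}
    print(plan_sync(git_state, cluster_state))
```

A GitOps controller runs a comparison like this continuously, so manual changes in the cluster are reverted to what Git declares.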
More …
We’ve upgraded all Teleport clusters to 15.3.7. Teleport is a tool we mostly use internally to provide secure and audited access to (EC2) instances, Kubernetes clusters and several dashboards. The nodes will gradually be upgraded to the new version when new instances are launched. You can find more information on this release in the Teleport changelog. In particular, several high-level security fixes were included in 15.3.6.
As part of our regular upgrade cycle, the following Kubernetes cluster components have been updated and rolled out to all our managed clusters. The highlight of this update is Loki v3.0, which brings many new features and performance improvements, including caching of query results and chunks. If you haven’t done so yet, please also check the actions to take from our previous changelog regarding Grafana deprecations!
More …
In the ever-evolving landscape of infrastructure as code (IaC), staying adaptable and proactive is crucial. Our latest initiative involves transitioning from Terraform to OpenTofu, driven primarily by the recent licensing changes introduced in Terraform version 1.6.0 and beyond. Today we are happy to announce we have fully migrated to OpenTofu with version 1.6.2.
More …
While migrating customers to the new GitHub-maintained gha-runner-scale-set controller, we’ve noticed several reliability issues with this version, such as jobs failing to schedule or runners being deleted mid-run. We’ve rolled out a couple of changes to improve reliability in these situations.
We’ve upgraded all Teleport clusters to 15.2.2. Teleport is a tool we mostly use internally to provide secure and audited access to (EC2) instances, Kubernetes clusters and several dashboards. The nodes will gradually be upgraded to the new version when new instances are launched. You can find more information on this release in the Teleport changelog.
Update 2024-04-22: These changes have been rolled out to all clusters.
More …
Update 2024-04-04: These changes have been rolled out to all clusters.
More …
Update 2024-04-04: All clusters have been upgraded to v1.29.
More …
Update 2024-03-21: This change is applied on all clusters.
More …
As of today we support the new deployment method for GitHub Actions runners that is officially supported by GitHub, the gha-runner-scale-set-controller. This new controller is a more efficient and scalable way to deploy self-hosted GitHub runners (controlled by a new gha-runner-scale-set) on Kubernetes. In addition to improved stability and ongoing development, this new controller adds autoscaling of the runner pool based on the number of pending jobs, resulting in a more scalable and cost-effective solution.
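The idea behind scaling on pending jobs can be sketched as a simple calculation. This illustrates the concept only and is not the controller's actual algorithm: the function `desired_runners`, its parameters and its defaults are hypothetical.

```python
def desired_runners(pending_jobs: int, busy_runners: int,
                    min_runners: int = 0, max_runners: int = 10) -> int:
    """Return how many runners the pool should have.

    Every pending job needs a runner and every busy runner must be kept,
    so demand is their sum, clamped to the configured bounds.
    """
    demand = pending_jobs + busy_runners
    return max(min_runners, min(max_runners, demand))


if __name__ == "__main__":
    # Three queued jobs while two runners are busy: scale to five runners.
    print(desired_runners(pending_jobs=3, busy_runners=2))
```

With a minimum of zero, the pool can scale down completely when no jobs are queued, which is where most of the cost saving comes from.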
More …
Update 2024-03-07: Upgrades have been applied on all clusters.
More …
We’ve upgraded all Teleport clusters from version 14.0.1 to 15.0.1. Teleport is a tool we mostly use internally to provide secure and audited access to (EC2) instances, Kubernetes clusters and several dashboards. The nodes will gradually be upgraded to the new version when new instances are launched.
More …
We have added experimental support for Thanos to enable multi-cluster Prometheus monitoring. This lets you store your Prometheus metrics in a central place, where you can query and visualize them and write alerts based on data from multiple environments.
More …
We have stopped building our own custom EKS AMI. As of now we directly rely on the upstream, AWS-provided image for EKS.
More …
Update 2024-01-25: All changes have been rolled out.
More …