New Dex onboarding docs and support for Google Group scope

We are excited to announce that we have completely rewritten our Dex onboarding documentation. This documentation guides you through the steps required to configure your Identity Provider (such as Google, Microsoft or GitHub) so we can integrate it as an authentication mechanism in your platform (e.g. for accessing monitoring dashboards). In addition, we have added support for injecting extra files into the Dex configuration, which for example allows using a Google service account to limit access to specific groups instead of a whole domain.

[ACTION REQUIRED] Upgraded cluster add-ons

The following updates have been rolled out to all non-production clusters. Notable updates include a major release of Prometheus, which brings a new UI. As usual, there are also improvements across various other add-ons, ensuring enhanced performance and security. Finally, we’d like to remind you again of the Actions to take regarding the Grafana AngularJS deprecation!

More …

New beta feature: Kubecost

We are excited to announce that we have added Kubecost as an optional feature to our Platform. Kubecost is a tool that helps you manage your Kubernetes costs by providing visibility into your AWS and Kubernetes resource usage and costs. We are currently shaping our FinOps offering, and Kubecost is part of that. If you are interested in trying out Kubecost, or have questions regarding our FinOps offering, please reach out to us through your Skyscrapers lead.

More …

Cleanup of Teleport

As mentioned in our previous post, we removed Teleport from our environments (apart from the ones that were not ready to move away from it). For customers that had nothing else running on the tools environment we also cleaned up the networking. This will also save some costs for those customers.

Migrated to Tailscale for internal remote access to our managed environments

To streamline our colleagues’ experience, we are excited to announce that we have moved to Tailscale for secure remote access to our managed environments. Tailscale is a Zero Trust network that provides a lightweight, seamless yet secure experience for connecting to all the different networks and services we manage. This replaces Teleport and OpenVPN for internal Skyscrapers use. Next up, we plan to evaluate replacement options for our customers’ VPN offering in the coming months.

More …

Loki optimizations to mitigate recurring performance issues

Over the past months, we’ve gathered customer feedback and monitored Loki’s performance within our Kubernetes clusters. This process has highlighted recurring performance challenges. To address these, we are rolling out optimizations designed to enhance stability and performance. This post outlines the changes we’re making and the reasoning behind them. If you have any feedback or questions, please don’t hesitate to reach out to us.

More …

Our newest version of the Skyscrapers Security Policy is publicly available

We are excited to announce a new version of our Security Policy, which reflects the latest changes in our organization and our continuously improving security practices. There are some significant changes, most notably the introduction of the “Data Classification and Handling” and “Asset Management” policies. Furthermore, in line with these new policies, we have made this new version available via our public documentation website.

Upgraded Teleport to version 15.4.21

We’ve upgraded all Teleport clusters to 15.4.21. Teleport is a tool we mostly use internally to provide secure and audited access to (EC2) instances, Kubernetes clusters and several dashboards. The nodes will gradually be upgraded to the new version as new instances are launched. You can find more information on this release in the Teleport changelog.

More …

New Grafana SRE dashboards

We’ve integrated open-source dashboards that take advantage of the newest features in Grafana into our setup. These dashboards are designed to help you monitor your services and infrastructure more effectively. They are available in your Grafana under the SRE section. Here is a quick overview of the new dashboards:

More …

Maintenance: OpenTofu upgraded to 1.8.3

We’re excited to announce that we are now using OpenTofu version 1.8.3 to deploy our environments. This release includes several bug fixes and new features that we can leverage going forward. No action is needed on your part, as the upgrade is part of our automation processes. Most of our codebases still allow using OpenTofu >= 1.6. If you encounter any issues or have any questions, please don’t hesitate to contact us.

More …

New feature: Dependabot scanning for Terragrunt/OpenTofu

We’re excited to announce that Dependabot has been integrated into our GitHub repositories to ensure our Terragrunt and OpenTofu modules stay up to date. Dependabot will automatically scan our repositories for outdated dependencies and generate pull requests to update them. This proactive approach helps us maintain the security and stability of our modules while keeping them current with the latest features and bug fixes. No action is needed on your part: Dependabot will create the pull requests, and we’ll take care of reviewing, adjusting, and merging them.

[ACTION REQUIRED] Upgraded cluster add-ons

The following updates have been rolled out to non-production clusters, and will be pushed to production in the coming week. Notable updates include the major release of Karpenter v1 and improvements across various add-ons, ensuring enhanced performance and security. This is also a reminder of the Actions to take regarding the Grafana AngularJS deprecation!

More …

Upgrading Concourse CI to version 7.11.2

We’re happy to announce that we’re finally upgrading to the latest version of Concourse CI, v7.11.2, which brings a lot of new features and improvements. As this is quite a substantial upgrade, we will get in touch with each customer individually in the coming weeks to start the upgrade process. The upgrade will also incur Concourse downtime, so we will work with you to find the best time to do this.

More …

Upgraded Teleport to version 15.4.16

We’ve upgraded all Teleport clusters to 15.4.16. Teleport is a tool we mostly use internally to provide secure and audited access to (EC2) instances, Kubernetes clusters and several dashboards. The nodes will gradually be upgraded to the new version as new instances are launched. You can find more information on this release in the Teleport changelog.