Skyscrapers Changelog

Move to CoreDNS dns server and add gp2-encrypted StorageClass

15 Jan 2019 • Kubernetes

We’re updating our Kubernetes staging clusters with CoreDNS, the new dns server that replaces KubeDNS. After an in-depth analysis and tests we’ve verified that the performance and the stability between the two solutions are almost identical. Here you can find more details on why we decided to move to CoreDNS.

More …

Upgrade Vault to 1.0.1

11 Jan 2019 • General

A Vault upgrade for our setups was long overdue. We’ve upgraded our Vault installation tools from version 0.9.3 to 1.0.1, which is the latest Vault version available at the moment. As Vault is set up as HA, the downtime of the upgrade will be minimal, normally between half a second and a couple of seconds, which is the time the fail-over takes. The upgrade procedure to achieve that minimal downtime is the following:

More …

Upgrade to Kubernetes 1.11.6 [updated]

03 Jan 2019 • Kubernetes

Update: Changed Kubernetes update from 1.10.12 to 1.11.6

More …

Upgrade to Kubernetes 1.10.11 [updated]

03 Dec 2018 • Kubernetes

Update 2 (2018-12-03): Since our last update, the people at Kubernetes updated their documentation to add an important fix in the 1.10.11 changelog:

More …

Adding Prometheus monitoring for ECS

27 Nov 2018 • ECS

We’ve deployed on all our ECS managed staging clusters a prometheus monitoring system.

More …

Set resource reservations for kubelet and other system processes

27 Nov 2018 • Kubernetes

Following our efforts to improve the overall stability of our Kubernetes clusters, we’ve now set resource reservations for kubelet and other system processes. This will ensure that these critical processes always have enough CPU and memory available to function properly, regardless of what the actual cluster workloads are.

More …

Updated Prometheus & Grafana monitoring stack - update

21 Nov 2018 • Kubernetes

As announced in our previous update, we have migrated our cluster-monitoring stack by using the new stable/prometheus-operator as base chart. By now these updates have already been rolled out across staging clusters.

More …

Updated Prometheus & Grafana monitoring stack

19 Nov 2018 • Kubernetes

Our cluster monitoring stack is based on the prometheus-operator developed by the people at CoreOS, more concretely we used kube-prometheus as a starting point for a complete setup.

More …

Moving from kube-lego to cert-manager for automatic TLS certificates

13 Nov 2018 • Kubernetes

We’re moving the Letsencrypt service on our Kubernetes from the deprecated kube-lego to cert-manager.

More …

Set resource requests and limits for all infrastructure pods

13 Nov 2018 • Kubernetes

We’ve recently adjusted resource requests and limits for all Pods running in the infrastructure namespace. Previously, some of them didn’t have requests nor limits, and some others had unnecessary high values. We’ve reviewed the CPU and memory usage of those Pods for the last couple of weeks and we’ve adjusted their requests and limits accordingly. This is now rolled out to all staging clusters, and we’ll proceed with the production clusters next week if no issues are spotted.

More …

Grafana Pods dashboard updated memory metrics

05 Nov 2018 • Kubernetes

We’ve updated the Pods dashboard so it displays both the actual container memory usage (container_memory_working_set_bytes) next to the previous metric including caches (container_memory_usage_bytes). You can find this dashboard in your grafana deployment as Pods v2.

More …

Releasing our user-level documentation repository

30 Oct 2018 • General

Today we’re releasing a new user-level knowledge base of our products and services. It’s aimed to help you be more confident and autonomous in managing your applications on our platforms. You can find it in the following GitHub repository: https://github.com/skyscrapers/documentation

More …

K8S upgrade to stretch

23 Oct 2018 • General

We upgraded and tested our test cluster successfully to Debian stretch now that all open issues are resolved.

More …

Teleport upgrade to 2.7.5

28 Sep 2018 • General

Teleport has been upgraded to version 2.7.5 for all users. This upgrade includes various bugfixes and performance improvements, as well as additional functionality such as scp (secure copy) from the web interface.

More …

Kubernetes cluster-autoscaler enabled

28 Sep 2018 • Kubernetes

Today we release the addition of the Kubernetes Cluster AutoScaler to our clusters. Since we’ll be enabling the autoscaler by default, we’ll be initially deploying it on staging while production clusters will follow in a couple of days.

More …

Vault data is now backed up

28 Sep 2018 • General

Our Vault setup is configured to store the data in a DynamoDB table, using Vault DynamoDB storage backend. DynamoDB already replicates all the data in a table across three availability zones, giving Vault high availability and data durability. From today, we’re also enabling point-in-time recovery for the DynamoDB table, which provides continuous backups of the data for the last 35 days. This will give you the possibility to restore your Vault data in case it gets deleted or corrupted by accident, or you just want to go to a previous state.

More …

Kubernetes Infrastructure Tools Upgraded

25 Sep 2018 • Kubernetes

We have updated our internals infrastructure tools to the latest version. These upgrades add bugfixes and several new features.

More …

Reduced number of NAT gateways

25 Sep 2018 • Kubernetes

We’ve reduced the number of NAT gateways per VPC. In the previous setup we created one NAT gateway per VPC where we routed all the non-k8s traffic, and we had three NAT gateways just for the k8s cluster (one for each Availability Zone). In total we ended up having 4 NAT gateways per environment, plus one for the tools stack, so a total of 9.

More …

Upgrade logging to Kibana and Elasticsearch 6.3

20 Sep 2018 • Kubernetes

We use Elasticsearch with Kibana to aggregate logs from Kubernetes and our customers’ applications. Today this stack got upgraded to 6.3, bringing several improvements and bug fixes.

More …

Concourse version 4.2.1 upgrade

19 Sep 2018 • General

We’ve upgraded all the Concourse setups to the latest available version, that’s 4.2.1. As you might have read in the previous post from concourse we were working on the upgrade to Concourse 4.1. During that process we ran into some bugs that we had to mitigate. Early this week Concourse released v4.2 and a day later 4.2.1. We decided to immediately upgrade our test cluster to this version and when everything was stable also upgrade our customer clusters.

More …