We’ve upgraded all k8s clusters with a new etcd backup implementation. The old backup solution relied on daily snapshots taken by a service running on the master nodes.
The new approach uses AWS Data Lifecycle Manager to take daily snapshots of the 6 etcd EBS volumes (2 per instance, 3 master nodes).
This new solution makes the backups more reliable and more efficient.
We’ve successfully tested and documented the restore procedure so that disaster recovery, should it ever be needed, is quick and easy.
The default backup retention period we configured is 14 days.
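
For illustration only, a daily schedule with 14-day retention could be described to Data Lifecycle Manager roughly as in the sketch below. This is not our actual configuration: the tag key and value, the schedule name, the snapshot time, and both function names are hypothetical, and the sketch assumes the etcd volumes are selected by tag.

```python
# Minimal sketch: builds a PolicyDetails structure for AWS Data Lifecycle Manager.
# Assumptions (not from the release note): the tag "etcd-backup=true", the schedule
# name, and the 03:00 UTC snapshot time are placeholders chosen for illustration.

def build_etcd_snapshot_policy(tag_key="etcd-backup", tag_value="true",
                               retention_days=14, snapshot_time_utc="03:00"):
    """Return a PolicyDetails dict for daily EBS volume snapshots."""
    return {
        # Snapshot individual EBS volumes rather than whole instances.
        "ResourceTypes": ["VOLUME"],
        # Only volumes carrying this tag are picked up by the policy.
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [
            {
                "Name": "daily-etcd-snapshots",
                # One snapshot every 24 hours, started at the given UTC time.
                "CreateRule": {
                    "Interval": 24,
                    "IntervalUnit": "HOURS",
                    "Times": [snapshot_time_utc],
                },
                # Age-based retention: snapshots older than this are deleted.
                "RetainRule": {"Interval": retention_days, "IntervalUnit": "DAYS"},
                # Copy the volume tags onto each snapshot for easier lookup.
                "CopyTags": True,
            }
        ],
    }


def expected_snapshot_count(volumes=6, retention_days=14):
    """Approximate number of snapshots kept once the retention window is full."""
    return volumes * retention_days


if __name__ == "__main__":
    policy = build_etcd_snapshot_policy()
    print(policy["Schedules"][0]["RetainRule"])
    print(expected_snapshot_count())
```

A dict like this could be passed as the `PolicyDetails` argument of the boto3 `dlm` client's `create_lifecycle_policy` call, together with an execution role ARN, a description, and a state. With 6 volumes and 14 days of retention, roughly 84 snapshots exist at any time once the window has filled.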