
Etcd Cluster Components

For every Etcd cluster that it provisions, etcd-druid deploys a set of resources. The following sections provide information about, and a code reference for, each such resource.

StatefulSet

StatefulSet is the primary Kubernetes resource that gets provisioned for an etcd cluster.

  • Replicas for the StatefulSet are derived from Etcd.Spec.Replicas in the custom resource.

  • Each pod comprises two containers:

    • etcd-wrapper: This is the main container, which runs the etcd process.

    • etcd-backup-restore: This is a side-car container which does the following:

      • Orchestrates the initialization of etcd. This includes validating any existing etcd data directory and, for a single-member etcd cluster, restoring it if its files are corrupt.
      • Periodically renews the member lease.
      • Optionally takes schedule- and threshold-based delta and full snapshots and pushes them to a configured object store.
      • Orchestrates scheduled etcd DB defragmentation.

      NOTE: This is not a complete list of the functionality offered by etcd-backup-restore.
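
The schedule- and threshold-based snapshot triggering mentioned above can be pictured with a small sketch. It is illustrative only: the class, the function, their parameter names, and the choice to measure the threshold in bytes of pending events are all assumptions made for this example, not the actual etcd-backup-restore implementation or its flags.

```python
from dataclasses import dataclass


@dataclass
class DeltaSnapshotPolicy:
    """Hypothetical policy; field names are illustrative, not real flags."""
    period_seconds: float   # schedule: maximum time between delta snapshots
    threshold_bytes: int    # threshold: maximum size of not-yet-snapshotted events


def should_take_delta_snapshot(policy: DeltaSnapshotPolicy,
                               seconds_since_last: float,
                               pending_event_bytes: int) -> bool:
    """Return True when either the schedule or the threshold is reached."""
    schedule_due = seconds_since_last >= policy.period_seconds
    threshold_hit = pending_event_bytes >= policy.threshold_bytes
    return schedule_due or threshold_hit


if __name__ == "__main__":
    policy = DeltaSnapshotPolicy(period_seconds=300, threshold_bytes=10 * 1024 * 1024)
    # Neither condition met: no snapshot is needed yet.
    print(should_take_delta_snapshot(policy, 60, 1024))
    # Threshold exceeded well before the schedule is due.
    print(should_take_delta_snapshot(policy, 60, 20 * 1024 * 1024))
```

Combining the two conditions with a logical OR means that a busy cluster snapshots early on the threshold, while a quiet one still snapshots on the schedule.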

Code reference: StatefulSet-Component

For detailed information on each container, see the etcd-wrapper and etcd-backup-restore repositories.

ConfigMap

Every etcd member requires a configuration with which it must be started. etcd-druid creates a ConfigMap which gets mounted onto the etcd-backup-restore container. The etcd-backup-restore container modifies the etcd configuration and serves it to the etcd-wrapper container upon request.

Code reference: ConfigMap-Component

PodDisruptionBudget

An etcd cluster requires quorum for all write operations. Clients can additionally configure quorum-based reads to ensure linearizable reads (kube-apiserver's etcd client is configured for linearizable reads and writes). Failure tolerance for an etcd cluster with n replicas is computed as (n-1)/2, rounded down. In a cluster of size 3, for example, only 1 member failure is tolerated.

To ensure that no more etcd pods are evicted than the failure tolerance allows, etcd-druid creates a PodDisruptionBudget.

Note

For a single-node etcd cluster a PodDisruptionBudget is still created, but pdb.spec.minAvailable is set to 0, effectively disabling it.
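
The arithmetic above can be checked with a short sketch. The failure tolerance formula and the single-node value of 0 come from the text. The value returned for multi-node clusters (replicas minus failure tolerance, that is, the quorum size) is an assumption made for illustration, and both function names are hypothetical.

```python
def failure_tolerance(replicas: int) -> int:
    """Members that may fail while quorum is kept: (n-1)/2, rounded down."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return (replicas - 1) // 2


def pdb_min_available(replicas: int) -> int:
    """Assumed minAvailable: 0 for a single node, quorum size otherwise."""
    tolerance = failure_tolerance(replicas)
    if replicas == 1:
        return 0  # single-node cluster: the PDB is effectively disabled
    return replicas - tolerance  # assumption: keep a quorum of pods available


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, failure_tolerance(n), pdb_min_available(n))
```

For a 3-member cluster this gives a failure tolerance of 1 and, under the stated assumption, a minAvailable of 2.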

Code reference: PodDisruptionBudget-Component

ServiceAccount

The etcd-backup-restore container, running as a side-car in every etcd member, requires permissions to access resources such as Lease and StatefulSet. A dedicated ServiceAccount is created per Etcd cluster for this purpose.

Code reference: ServiceAccount-Component

Role & RoleBinding

The etcd-backup-restore container, running as a side-car in every etcd member, requires permissions to access resources such as Lease and StatefulSet. A dedicated Role and RoleBinding are created and linked to the ServiceAccount created per Etcd cluster.

Code reference: Role-Component & RoleBinding-Component

Client & Peer Service

To enable clients to connect to an etcd cluster, a ClusterIP Client Service is created. To enable etcd members to talk to each other (for discovery, leader election, raft consensus, etc.), etcd-druid also creates a Headless Service.
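
A Headless Service that governs a StatefulSet gives every pod a stable DNS name of the form <pod>.<service>.<namespace>.svc, which is what lets members address each other individually. The sketch below builds such peer URLs. The resource names in the usage example, the function name, and the defaults (port 2380, https) are assumptions for illustration and are not taken from etcd-druid.

```python
from typing import List


def peer_urls(statefulset_name: str, peer_service: str, namespace: str,
              replicas: int, port: int = 2380, scheme: str = "https") -> List[str]:
    """Build one peer URL per StatefulSet pod (pods are named <name>-<ordinal>)."""
    urls = []
    for ordinal in range(replicas):
        pod = f"{statefulset_name}-{ordinal}"
        # Stable per-pod DNS name provided through the headless service.
        host = f"{pod}.{peer_service}.{namespace}.svc"
        urls.append(f"{scheme}://{host}:{port}")
    return urls


if __name__ == "__main__":
    # "etcd-main" and "etcd-main-peer" are hypothetical names.
    for url in peer_urls("etcd-main", "etcd-main-peer", "default", 3):
        print(url)
```

Clients, by contrast, do not need per-pod addresses and can use the single virtual IP of the ClusterIP Client Service.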

Code reference: Client-Service-Component & Peer-Service-Component

Member Lease

Every member in an Etcd cluster has a dedicated Lease, whose renewal signifies that the member is alive. It is the responsibility of the etcd-backup-restore side-car container to periodically renew the lease.

Note

Today the lease object is also used to indicate the member ID and the role of the member in an etcd cluster. Possible roles are Leader and Member (which denotes that this is a member but not the leader). This will change in the future with the EtcdMember resource.
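
How a consumer might interpret a member lease can be sketched as follows. The three fields mirror the holderIdentity, renewTime, and leaseDurationSeconds fields of a Kubernetes Lease spec. The "<member-id>:<role>" encoding of the holder identity, the class, and the function names are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Tuple


@dataclass
class MemberLease:
    """Simplified, hypothetical view of a member Lease."""
    holder_identity: str          # assumed format: "<member-id>:<role>"
    renew_time: datetime          # last time the side-car renewed the lease
    lease_duration_seconds: int   # how long a renewal stays valid


def is_member_alive(lease: MemberLease, now: datetime) -> bool:
    """A member counts as alive while its last renewal has not expired."""
    expiry = lease.renew_time + timedelta(seconds=lease.lease_duration_seconds)
    return now <= expiry


def parse_holder(lease: MemberLease) -> Tuple[str, str]:
    """Split the assumed "<member-id>:<role>" holder identity."""
    member_id, _, role = lease.holder_identity.partition(":")
    return member_id, role


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    lease = MemberLease("abc123:Leader", now - timedelta(seconds=10), 40)
    print(is_member_alive(lease, now), parse_holder(lease))
```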

Code reference: Member-Lease-Component

Delta & Full Snapshot Leases

One of the responsibilities of the etcd-backup-restore container is to take periodic or threshold-based snapshots (delta and full) of the etcd DB. Today etcd-backup-restore communicates the end revision of the latest full and delta snapshots to the etcd-druid operator via leases.

etcd-druid creates two Lease resources, one for delta snapshots and another for full snapshots. The operator uses this information to trigger snapshot-compaction jobs. Snapshot leases are also used to derive the health of backups, which is reported in the Status subresource of every Etcd resource.

In the future these leases will be replaced by the EtcdMember resource.
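
One way the operator could use the two end revisions to decide on a compaction job is sketched below. This is a minimal illustration under stated assumptions: the function name, the events_threshold parameter, and the rule "compact once the delta snapshots are at least that many revisions ahead of the last full snapshot" are hypothetical and not confirmed by this document.

```python
def should_trigger_compaction(full_revision: int, delta_revision: int,
                              events_threshold: int) -> bool:
    """Decide on compaction from the revisions published in the two leases."""
    if events_threshold <= 0:
        raise ValueError("events_threshold must be positive")
    # Revisions accumulated in delta snapshots since the last full snapshot.
    # A negative value means the full snapshot is already the newest one.
    accumulated = delta_revision - full_revision
    return accumulated >= events_threshold


if __name__ == "__main__":
    print(should_trigger_compaction(full_revision=100, delta_revision=150,
                                    events_threshold=1000))
    print(should_trigger_compaction(full_revision=100, delta_revision=1100,
                                    events_threshold=1000))
```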

Code reference: Snapshot-Lease-Component