# Node Replacement

**Purpose:** For operators. Shows how to drain, remove, and re-provision failed control plane or worker nodes.
## Prerequisites

- `kubectl` access to the cluster
- SSH access to cluster nodes (via bastion or direct)
- Kubespray inventory in `infrastructure/clusters/<cluster>/inventory/`
- For control plane nodes: at least 3 control plane nodes (replacing one at a time maintains quorum)
## Step 1: Identify the Failed Node

```sh
# Check node status
kubectl get nodes -o wide

# Look for NotReady nodes
kubectl get nodes | grep NotReady

# Check node conditions for details
kubectl describe node <node-name> | grep -A5 Conditions
```
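If a machine-readable check is preferred, the `Ready` condition can also be pulled directly with JSONPath; this is a sketch, and `<node-name>` remains a placeholder as above:

```sh
# Print each node's name alongside its Ready condition status (True/False/Unknown)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'
```

Nodes reporting `Unknown` have typically stopped heartbeating, which usually indicates a kubelet or network failure.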
## Step 2: Cordon and Drain

Cordoning prevents new pods from being scheduled. Draining evicts existing workloads.
```sh
# Cordon the node (no new scheduling)
kubectl cordon <node-name>

# Drain the node (evict workloads, respect PDBs)
kubectl drain <node-name> \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --timeout=300s
```
If the node is completely unreachable, drain with `--force`:
```sh
kubectl drain <node-name> \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --force \
  --timeout=120s
```
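After the drain completes (or is forced), it is worth confirming that nothing beyond DaemonSet-managed pods is still bound to the node. A minimal check, with `<node-name>` as above:

```sh
# List any pods still scheduled to the node; only DaemonSet pods should remain
kubectl get pods -A --field-selector spec.nodeName=<node-name> -o wide
```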
## Step 3: Remove the Node from Kubernetes

```sh
# Delete the node object from the cluster
kubectl delete node <node-name>
```
For control plane nodes, also remove the etcd member:
```sh
# List etcd members (run on a healthy control plane node)
ETCDCTL_API=3 etcdctl member list \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/node-$(hostname).pem \
  --key=/etc/ssl/etcd/ssl/node-$(hostname)-key.pem

# Remove the failed member by ID
ETCDCTL_API=3 etcdctl member remove <member-id> \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/node-$(hostname).pem \
  --key=/etc/ssl/etcd/ssl/node-$(hostname)-key.pem
```
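The TLS flags are repetitive; `etcdctl` also reads them from environment variables, so one option is to export them once per shell session. The certificate paths below assume the Kubespray defaults shown above:

```sh
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
export ETCDCTL_CERT=/etc/ssl/etcd/ssl/node-$(hostname).pem
export ETCDCTL_KEY=/etc/ssl/etcd/ssl/node-$(hostname)-key.pem

# Subsequent commands need no TLS flags
etcdctl member list
```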
## Step 4: Provision a Replacement VM

Create a new VM through Terraform or your infrastructure provider. Update the Kubespray inventory to replace the old node entry with the new one:
```yaml
# infrastructure/clusters/<cluster>/inventory/inventory.yaml
all:
  hosts:
    worker-04:            # Replacement node
      ansible_host: 192.168.12.27
      ip: 192.168.12.27
  children:
    kube_node:
      hosts:
        worker-04: {}
```
Remove the old node entry and commit the inventory change via PR.
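Before running any playbooks, the edited inventory can be sanity-checked with `ansible-inventory`, assuming Ansible is available on the runner:

```sh
# Render the group tree and confirm worker-04 appears under kube_node
ansible-inventory -i inventory.yaml --graph
```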
## Step 5: Run Kubespray to Join the New Node

```sh
cd infrastructure/clusters/<cluster>/inventory/

# Add the new node using Kubespray's scale playbook
ansible-playbook -i inventory.yaml \
  -b --become-user=root \
  scale.yml \
  --limit=worker-04
```
For control plane replacements, use the full `cluster.yml` playbook instead of `scale.yml`.
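If the playbook fails early, first confirm that the runner can actually reach the new node over SSH using Ansible's `ping` module:

```sh
# Confirm the runner can SSH to the new node and execute Python
ansible -i inventory.yaml worker-04 -m ping
```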
## Step 6: Verify

```sh
# Confirm the new node is Ready
kubectl get nodes -o wide

# Check pods are scheduling on the new node
kubectl get pods -A -o wide | grep worker-04

# Verify etcd health (for control plane replacements)
ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/node-$(hostname).pem \
  --key=/etc/ssl/etcd/ssl/node-$(hostname)-key.pem
```
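Rather than polling `kubectl get nodes` by hand, the readiness check can be scripted with `kubectl wait` (the node name here is illustrative):

```sh
# Block until the replacement node reports Ready, up to 10 minutes
kubectl wait --for=condition=Ready node/worker-04 --timeout=10m
```

This is useful in automation wrappers, since it exits non-zero on timeout.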
## Troubleshooting

- **Drain hangs** — A PodDisruptionBudget is blocking eviction. Check `kubectl get pdb -A` and assess whether it is safe to use `--force`.
- **New node fails to join** — Verify SSH connectivity from the Kubespray runner to the new node. Check that the join token has not expired.
- **etcd quorum lost** — If two of three control plane nodes fail simultaneously, etcd loses quorum. Restore from etcd backup (see Backup & Restore).
- **Persistent volumes on failed node** — Longhorn replicates data across nodes. If the failed node held the only replica, data recovery depends on the storage backend.
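For the Longhorn case, volume health can be inspected through Longhorn's CRDs; this sketch assumes Longhorn is installed in its default `longhorn-system` namespace:

```sh
# Show Longhorn volumes with their state and robustness (healthy/degraded/faulted)
kubectl -n longhorn-system get volumes.longhorn.io
```

Volumes reporting `degraded` are rebuilding replicas; `faulted` volumes may require recovery from backup.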