Kubernetes Pod Disruption Budget
Introduction⌗
What is a Pod Disruption Budget?⌗
A Pod Disruption Budget (PDB) is a policy resource that helps keep an application highly available by defining how many of its pods must remain available in a workload. Given a workload with a number of replicas, a PDB defines either the minimum number of pods that must stay available or the maximum number that may be unavailable.
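A PDB is a namespaced object in the policy API group, so you can inspect the budgets in a cluster with kubectl. A minimal sketch (the name and namespace are placeholders):

# List all pod disruption budgets (pdb is the built-in short name)
kubectl get pdb --all-namespaces
# Show details of one budget, including how many disruptions are currently allowed
kubectl describe pdb <pdb-name> -n <namespace>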
What disruptions can change the number of pods on a node?⌗
A disruption is an event in the cluster that can cause a pod to terminate and become unavailable.
There are two types of disruptions that can cause this:
- Voluntary disruptions
- Involuntary disruptions
Voluntary disruptions⌗
These are disruptions that are initiated by the application owner or the cluster administrator. For example:
- Deleting a pod by mistake.
- Deleting a deployment by mistake.
- Draining a node for maintenance or an upgrade.
Involuntary disruptions⌗
These are common events that you anticipate but cannot avoid:
- Kernel panics.
- Hardware failure.
- The node evicting pods because it has run out of resources.
- The cloud provider deleting a VM.
Note that a PDB can only limit voluntary disruptions, that is, evictions that go through the Kubernetes eviction API (for example, during a node drain). It cannot prevent involuntary disruptions, although any pods lost to them still count against the budget.
Fields of a PDB⌗
- .spec.selector: a label selector that determines which pods the budget applies to.
- .spec.minAvailable: the minimum number of selected pods that must remain available after an eviction. It can be an absolute number or a percentage.
- .spec.maxUnavailable: the maximum number of selected pods that can be unavailable after an eviction. It can be an absolute number or a percentage.
A single PDB can set either minAvailable or maxUnavailable, but not both.
Example of a PDB⌗
Minimum available pods⌗
In this manifest, we set the minimum number of available pods to 2.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: test-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: testapp
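To try this out, you could save the manifest to a file and create it with kubectl; a quick sketch, assuming the file is named test-pdb.yaml:

# Create the budget from the manifest above
kubectl apply -f test-pdb.yaml
# Confirm it was created and see its current status
kubectl get pdb test-pdb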
Maximum unavailable pods⌗
In this manifest, we set the maximum number of unavailable pods to 1.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: test-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: testapp
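Both examples above use absolute numbers. As the field descriptions mention, a percentage also works; here is a sketch with a hypothetical 50% budget (the name and label are placeholders):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: test-pdb-percent
spec:
  minAvailable: "50%"   # at least half of the selected pods must stay available
  selector:
    matchLabels:
      app: testapp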
Example to show PDB at work⌗
You have a cluster running 3 nodes. On the nodes, you are running an application with 3 replicas that are spread across the 3 nodes. The 3 pods are covered by a PDB policy. In this example, you also have another pod, called pod-other, which is not covered by the PDB policy but runs on node-1.
| node-1 | node-2 | node-3 |
|---|---|---|
| pod-1: available | pod-2: available | pod-3: available |
| pod-other: available | | |
Pods 1 to 3 are in a deployment with a PDB policy that requires at least 2 of the 3 pods to be available:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pdb-min
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: testapp
Deployment definition
apiVersion: apps/v1
kind: Deployment
metadata:
  name: testapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: testapp
  template:
    metadata:
      labels:
        app: testapp
    spec:
      containers:
        - name: hello-world
          # hello-world exits immediately; in practice a long-running image (e.g. nginx) is needed for the pods to stay available
          image: hello-world
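Assuming both manifests are saved to files, the setup could be created and checked like this (the file names are placeholders):

# Create the deployment and its disruption budget
kubectl apply -f deployment.yaml
kubectl apply -f pdb-min.yaml
# See which node each pod landed on
kubectl get pods -o wide
# Check how many voluntary disruptions the budget currently allows
kubectl get pdb pdb-min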
Assume the cluster admin needs to perform a kernel update on the cluster's nodes. The admin will first try to drain the first node:
kubectl drain node-1
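On a real cluster, the drain command usually needs a couple of extra flags, for example to skip DaemonSet-managed pods; a hedged sketch:

# --ignore-daemonsets lets the drain proceed even though DaemonSet pods cannot be evicted
kubectl drain node-1 --ignore-daemonsets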
This will succeed immediately, putting pod-1 and pod-other into a terminating state.
| node-1 | node-2 | node-3 |
|---|---|---|
| pod-1: terminating | pod-2: available | pod-3: available |
| pod-other: terminating | | |
The deployment will notice that only 2 of its pods are running, and it will create another pod to replace pod-1; let's call it pod-4. Assuming pod-other is also managed by a controller, that controller will likewise create a replacement for the terminating pod-other; let's call it pod-y. Now, the cluster state will look like this:
| node-1 | node-2 | node-3 |
|---|---|---|
| pod-1: terminating | pod-2: available | pod-3: available |
| pod-other: terminating | pod-4: starting | pod-y: starting |
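If you want to observe these transitions as they happen, watching the pods works well; a small sketch:

# Watch pods change state (terminating, pending, running) in real time
kubectl get pods -o wide --watch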
When the pods finish terminating, the cluster looks like this:
| node-1 | node-2 | node-3 |
|---|---|---|
| | pod-2: available | pod-3: available |
| | pod-4: starting | pod-y: starting |
If the admin tries to drain node-2 at this point, the drain will be blocked. This is because the policy requires that at least 2 pods are available, and with pod-4 still starting, evicting either pod-2 or pod-3 would leave fewer than 2 available. Eventually, pod-4 becomes available and the cluster looks like this:
| node-1 | node-2 | node-3 |
|---|---|---|
| | pod-2: available | pod-3: available |
| | pod-4: available | pod-y: available |
The admin will try again to drain node-2. If we assume the pods are evicted in the order pod-2 and then pod-4, the drain will succeed in terminating pod-2 but it will be blocked from terminating pod-4. This is because the policy requires a minimum of 2 available pods, and once pod-2 is gone only pod-3 and pod-4 remain available. The deployment creates a replacement for pod-2, let's call it pod-5, but with node-1 and node-2 cordoned and no spare capacity left on node-3 in this example, it stays pending without a node:
| node-1 | node-2 | node-3 | no-node |
|---|---|---|---|
| | pod-2: terminating | pod-3: available | pod-5: pending |
| | pod-4: available | pod-y: available | |
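While the drain is blocked, you can see why by looking at the budget and the eviction responses; a sketch (the exact wording of the error can vary by Kubernetes version):

# ALLOWED DISRUPTIONS will show 0 for the budget, so further evictions are refused
kubectl get pdb pdb-min
# The drain command keeps retrying; its output typically includes an error like
# "Cannot evict pod as it would violate the pod's disruption budget."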
At this point, the admin needs to add another node to the cluster to continue the upgrade.
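Once a new node is added (or capacity is freed up), the drain of node-2 can complete, and after the kernel updates the drained nodes can be returned to service; a rough sketch of the remaining steps:

# Finish evicting the remaining pods from node-2
kubectl drain node-2 --ignore-daemonsets
# After updating the kernel on each node, mark it schedulable again
kubectl uncordon node-1
kubectl uncordon node-2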
Conclusion⌗
A PDB enables your application to remain highly available by ensuring that a minimum number of its pods keeps running during voluntary disruptions such as cluster maintenance.