Managing storage is a distinct problem from managing compute instances. The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed.
What are Persistent Volumes (PVs)?
Persistent Volumes (PVs) are a mechanism to provide persistent storage for applications running in the cluster. A PV represents a piece of storage in the cluster that has been provisioned by the cluster administrator or dynamically provisioned using a StorageClass.
A Persistent Volume is a cluster-level resource that decouples storage from individual pods or applications. It has a lifecycle independent of any specific pod and can be dynamically provisioned or statically created.
PVs have configurable properties such as capacity, access modes (e.g., ReadWriteOnce, ReadOnlyMany, ReadWriteMany), and storage class (the name of the StorageClass the PV belongs to, which is also what drives dynamic provisioning). They can be dynamically provisioned by storage providers, or manually created and managed by the cluster administrator.
PVs can be backed by various storage types, including local storage, network-attached storage (NAS), cloud storage, or other external storage systems. They provide a persistent and reliable storage solution for applications, allowing data to persist even if the associated pod is deleted or rescheduled.
What is a Persistent Volume Claim (PVC)?
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory) whereas Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany, ReadWriteMany, or ReadWriteOncePod).
When an application requires persistent storage, it creates a PVC with the desired characteristics. Kubernetes then matches the PVC with an appropriate PV based on the PVC’s specifications, such as storage capacity, access modes, and storage class.
A PVC is essentially a request to mount a PV meeting certain requirements on a pod. PVCs usually do not name a specific PV (although a claim can pin one with spec.volumeName); instead, they specify which StorageClass the claim requires. Administrators can define StorageClasses that indicate properties of storage devices, such as performance, service levels, and back-end policies.
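As a minimal sketch of how a claim selects a class, the manifests below define a StorageClass and a PVC that asks for it by name. The class name fast-storage, the claim name example-claim, and the provisioner string are placeholders invented for illustration; a real cluster would use the provisioner name of whichever storage driver it has installed.

```yaml
# Hypothetical StorageClass: the name and provisioner are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
provisioner: example.com/hypothetical-provisioner  # replace with a real driver name
volumeBindingMode: WaitForFirstConsumer  # delay binding until a pod uses the claim
---
# Hypothetical claim that requests storage from the class above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  storageClassName: fast-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

The claim never mentions a PV; the class name is the only link between what the user asks for and how the storage is provided.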
Persistent Volume Claims: Static vs. Dynamic Provisioning
To use a PV, a pod declares a volume that references a PVC and mounts that volume into one of its containers. These declarations allow users to mount PVs in pods without knowing the details of the underlying storage equipment.
There are two options for mounting PVs to a pod:
Static configuration: involves administrators manually creating PVs and defining a StorageClass that matches the criteria of these PVs. When a pod uses a PVC that specifies the StorageClass, it gains access to one of these static PVs.
Dynamic configuration: occurs when there is no static PV that matches the PVC. In this case, the Kubernetes cluster provisions a new PV based on the StorageClass definitions.
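The static case can be sketched as a PV and a PVC that share a class name. The names static-pv-example and static-claim-example, the class name manual, and the host path are assumptions made up for this illustration. No provisioner stands behind the class name here, so the claim can only bind to a PV that an administrator has already created.

```yaml
# Hypothetical pre-created PV (static provisioning).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-pv-example
spec:
  storageClassName: manual        # assumed class name, shared with the claim below
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/example-data"     # assumed directory on the node
---
# Hypothetical claim that matches the PV above by class, size, and access mode.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-claim-example
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

For the dynamic case, the claim would instead name a StorageClass that has a provisioner behind it, and no PV manifest would be written by hand.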
Benefits of having PV and PVC in K8s
Improved High Availability: By using PVs and PVCs, applications can achieve high availability and fault tolerance. PVs can be backed by replicated storage systems, ensuring data redundancy and availability.
Abstraction and Portability: PVs and PVCs abstract the underlying storage details from applications. This allows applications to request and use storage resources in a standardized and portable manner.
Ease of Scaling and Migration: With PVs and PVCs, applications can easily scale by dynamically provisioning additional PVs or increasing the capacity of existing PVs.
Data Persistence: PVs and PVCs ensure that data persists even if a pod or application is terminated or rescheduled. The decoupling of storage from pods allows the data to be retained in the PV, ensuring data durability and availability.
Resource Management: PVs and PVCs provide better resource management by separating storage from applications. Administrators can define storage classes and allocate appropriate storage resources to different PVCs based on capacity, access modes, and other criteria.
Dynamic Provisioning: PVCs support dynamic provisioning, where the cluster automatically creates PVs to fulfill the storage requirements specified in the PVC. This enables on-demand storage allocation and eliminates the need for manual PV creation.
Flexibility and Customization: PVCs allow applications to request storage resources with a specific capacity and access modes (e.g., ReadWriteOnce, ReadOnlyMany, ReadWriteMany). This flexibility enables applications to choose the appropriate storage characteristics that align with their needs, ensuring optimal performance, security, and data access patterns.
Lifecycle of PV and PVC
In a Kubernetes cluster, a PV exists as a storage resource in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs follows this lifecycle:
Provisioning: the creation of the PV, either directly (static) or dynamically using a StorageClass.
Binding: assigning the PV to the PVC.
Using: Pods use the volume through the PVC.
Reclaiming: once the claim is deleted, the PV is reclaimed according to its reclaim policy, either by keeping it for the next use or by deleting it directly from the cloud storage. The two common policies are:
Retain: keeps the PV and its data, and allows for manual reclamation of the resource.
Delete: removes the PV from Kubernetes, along with the associated storage asset in the external infrastructure.
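For dynamically provisioned volumes, the reclaim policy comes from the StorageClass, and it defaults to Delete when the class does not set one. A minimal sketch of a class that keeps its volumes after the claim is deleted, using a made-up class name and a placeholder provisioner:

```yaml
# Hypothetical StorageClass whose volumes are retained after release.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: keep-on-release                             # assumed name
provisioner: example.com/hypothetical-provisioner   # placeholder driver name
reclaimPolicy: Retain   # PVs created from this class are kept for manual cleanup
```

For a manually created PV, the same choice is made with the persistentVolumeReclaimPolicy field in the PV spec.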
A volume will be in one of the following states:
Available - this state shows that the PV is free and ready to be bound to a PVC.
Bound - this state shows that the PV has been assigned to a PVC.
Released - the claim has been deleted, but the cluster has not yet reclaimed the resource.
Failed - this state shows that the automatic reclamation of the PV has failed.
PVs Use Cases
Some of the use cases of PVs are:
Database management: Databases often require persistent storage to maintain data integrity and ensure high availability. By using a Persistent Volume, administrators can ensure that the data stored in the database remains persistent even if the corresponding pod goes offline or gets rescheduled.
File Sharing and Collaboration Applications: Applications like content management systems or file servers often require shared storage where multiple pods can read and write data simultaneously. Persistent Volumes provide the necessary functionality to meet these requirements, provided the backing storage supports the ReadWriteMany access mode.
Machine Learning Applications: When large datasets need to be stored and accessed by multiple pods simultaneously, PVs offer a scalable and efficient solution. Similarly, in IoT deployments, where sensor data needs to be collected and stored persistently, Persistent Volumes can ensure data reliability and availability.
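For the shared-storage use cases above, the claim would ask for the ReadWriteMany access mode. A minimal sketch, assuming a StorageClass named shared-files (hypothetical) that is backed by storage capable of multi-node read-write access; the claim name and size are also made up for illustration:

```yaml
# Hypothetical claim for storage that several pods read and write at once.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data-claim          # assumed name
spec:
  storageClassName: shared-files   # assumed class; must support ReadWriteMany
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```

Every pod that lists this claim under its volumes sees the same files, which is what a file server or a shared training dataset needs.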
Adding Persistent Volume
In this section, we will add a persistent volume to our todo app deployment.
Step 1: We will use the todo-app. Refer to Deploy a sample todo app.
Step 2: Create a Persistent Volume backed by a directory on your node. This is a hostPath volume, which is suitable for single-node testing only.
vi pv.yml
# pv.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-todo-app
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/tmp/data"
Step 3: Now, apply the persistent volume.
kubectl apply -f pv.yml
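Step 4: Create a Persistent Volume Claim that Kubernetes can bind to the Persistent Volume. The claim does not name the PV; it binds because the PV satisfies the requested size and access mode.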
vi pvc.yml
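# pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-todo-app
spec:
  # Empty class name: bind only to PVs that have no StorageClass (such as
  # pv-todo-app) instead of triggering the cluster's default provisioner.
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi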
Step 5: Now, apply the Persistent Volume Claim by using the below command.
kubectl apply -f pvc.yml
Step 6: After applying pv.yml and pvc.yml, update your deployment.yml file to include the Persistent Volume Claim.
vi deployment.yml
# deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: todo-app-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: todo-app
  template:
    metadata:
      labels:
        app: todo-app
    spec:
      containers:
        - name: todo-app
          image: mudit097/django-todo
          ports:
            - containerPort: 8000
          volumeMounts:
            - name: todo-app-data
              mountPath: /app
      volumes:
        - name: todo-app-data
          persistentVolumeClaim:
            claimName: pvc-todo-app
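One thing to keep in mind: a volume mounted at a path hides whatever the image already contains at that path. Whether that matters here depends on where the mudit097/django-todo image keeps its code, which this post does not state. If it turned out to be a problem, a variant like the sketch below would mount the volume at a dedicated subdirectory instead. The path /app/data is an assumption for illustration only, and the remaining steps in this post still use /app.

```yaml
# Fragment of the pod template only; /app/data is an assumed path.
containers:
  - name: todo-app
    image: mudit097/django-todo
    volumeMounts:
      - name: todo-app-data
        mountPath: /app/data   # hypothetical data directory, not taken from the image
```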
Step 7: Apply the updated deployment using the command:
kubectl apply -f deployment.yml
Step 8: Verify that the Persistent Volume has been added to your Deployment. Use the commands below:
# Getting details of Persistent Volume
kubectl get pv
# Getting details of Persistent Volume Claim
kubectl get pvc
# Getting the pvc details of the application
kubectl describe pvc pvc-todo-app
Hooray!! Persistent Volume has been added to our deployment application.🎉
Now, let's move with accessing the data in that volume.👇
Accessing data in the Persistent Volume
Step 1: Get the Pod details by using the command below:
# Getting the pods details
kubectl get pods
Step 2: Connect to a Pod in your Deployment using the command:
# Get a shell inside one of the pods
kubectl exec -it <pod-name> -- /bin/bash
Step 3: Verify that you can access the data stored in the Persistent Volume from within the Pod by creating a file inside the /app directory using the commands below:
# Now change dir to /app
cd /app/
# Create a file test.txt and put some content into it
echo "Hello" > /app/test.txt
# Verify the file is present in the dir or not.
ls
Step 4: Now, exit the pod and delete it. The Deployment will create a replacement pod, and we will then check whether the file we created is still present in the new pod.
# Checking total pods
kubectl get pods
# Deleting the existing pod
kubectl delete pod <pod-name>
Step 5: Because the Deployment provides auto healing, a new pod is created automatically after the old pod is deleted.
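Step 6: Now, on the worker node, verify that the file you created is still there. These commands assume the node uses Docker as its container runtime; otherwise, use kubectl exec as in Step 2. Because the PV is a hostPath volume, the file is found only if the new pod runs on the same node as the old one.
# Get the Docker container name of the newly created pod
docker ps
# Now get into the newly created container
docker exec -it <container-name> /bin/bash
# Then change dir to /app
cd /app/
# List all the files; the file we created before is present
ls
# Now read the existing file; its content is the same as before
cat test.txt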
Hooray!!! We have accessed the data in the persistent volume.🎉
Conclusion
In Kubernetes (k8s), PV (Persistent Volume) and PVC (Persistent Volume Claim) are integral components for managing and persisting data in containerized applications.
PV (Persistent Volume): It represents a physical storage resource in the cluster, such as a disk, network storage, or other storage solutions. PVs can be pre-provisioned by administrators or dynamically provisioned by storage classes.
PVC (Persistent Volume Claim): It is a request for storage by a user or a pod. PVCs abstract away the underlying storage details, allowing developers to request storage resources without worrying about the specific implementation.
Together, PVs and PVCs provide a scalable and flexible approach to managing storage in Kubernetes. PVs decouple storage configuration from pod specifications, promoting a more modular and scalable architecture. PVCs allow applications to consume storage resources dynamically based on their needs, facilitating efficient resource utilization.
So, in this blog we have seen how to:
Add persistent Volume to our deployment todo application, and
Access data in the Persistent Volume.
Hope you find it helpful🤞 So I encourage you to try this on your own and let me know in the comment section👇 about your learning experience.✨
👆The information presented above is based on my interpretation. Suggestions are always welcome.😊
~Smriti Sharma✌