Using Longhorn v1.3 CSI snapshots for backup and recovery with CloudCasa
Less than a year ago, the Cloud Native Computing Foundation (CNCF) Technical Oversight Committee (TOC) voted to accept Longhorn as a CNCF incubating project, marking its exit from Sandbox project status. One of the criteria for doing so is achieving sufficient momentum and maturity. Some evidence of that is innovation and adoption outside of the project’s original proponents: an expanding and thriving community. Longhorn has come a long way, and SUSE One gold partner Catalogic has been participating in the Longhorn community to provide advanced, workload-centric backup and recovery options for Longhorn.
We’ve invited Catalogic for a guest blog so you can learn more about Longhorn, CloudCasa™ by Catalogic and some of the innovations happening in the Longhorn community. ~ Bret
SUSE guest blog authored by:
Sathya Sankaran, COO of Catalogic and GM of CloudCasa
With the release of Longhorn v1.3.0, CloudCasa by Catalogic is happy to announce that it fully supports the backup and recovery of Longhorn persistent volumes (PVs) on Kubernetes clusters. While previous versions of Longhorn supported volume snapshots and the CSI interface, Longhorn v1.3 introduced full support for the CSI snapshot interface so it can now be used to trigger volume snapshots in a cluster.
CloudCasa makes use of this welcome new functionality for backup and recovery of clusters with Longhorn PVs, using either local snapshots or snapshots plus copies to remote storage. Backups can easily be managed across many clusters, and restores can be performed to the same cluster, across clusters, or even across cloud accounts, regions, and cloud providers.
What is Longhorn?
Longhorn is a lightweight, reliable and easy-to-use persistent block storage system for Kubernetes. It was originally developed by Rancher Labs (now SUSE) and is now developed as an incubating project of the CNCF.
Longhorn can seamlessly convert a large block of storage into thousands of volumes distributed as PVs – effectively delivering storage as a microservice. Replicas, snapshots and backups are some of the core functionalities that have long been ably supported by Longhorn.
Longhorn v1.3 supports two types of data protection:
- Snapshots: Snapshots are stored locally, as a part of each replica of a volume. They are stored on the disk of the nodes within the Kubernetes cluster.
- Backups: Backups are objects stored in the backup store (BackupStore), which is an NFS or S3-compatible object store external to the Kubernetes cluster.
What is the Container Storage Interface (CSI)?
The Container Storage Interface (CSI) is a specification for exposing block and file storage systems to enable easy interoperability between container orchestrators like Kubernetes and storage providers. It supports many core storage functionalities such as provisioning, mounting, snapshots, and cloning.
Longhorn has supported CSI snapshots since 2020. However, prior to Longhorn v1.3, the Longhorn CSI driver only supported volume backups to a target outside of a cluster as part of its Kubernetes VolumeSnapshot implementation, despite the Longhorn UI allowing both volume backup and snapshot options to be executed. That is, CSI Snapshotter requests actually invoked the backup workflow for the volume. As a result, a user had to manually configure an out-of-cluster BackupStore before a CSI snapshot could be taken at all.
What changed in Longhorn v1.3 CSI snapshots?
As described in this Longhorn GitHub issue, the new behavior in Longhorn v1.3 allows in-cluster snapshots to be created through the CSI API. Longhorn v1.3 introduced a type parameter that allows you to request either a backup or a snapshot when a CSI snapshot is triggered, as shown below.
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1beta1
metadata:
  name: longhorn-backup-2
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: bak
---
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1beta1
metadata:
  name: longhorn-snapshot
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: snap
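To see how such a class is consumed, here is a minimal sketch of a VolumeSnapshot that requests an in-cluster snapshot through the longhorn-snapshot class. The snapshot name and the PVC name are hypothetical placeholders, and the namespace is simply the one used in the walkthrough later in this post; adjust all three to your own environment.

```yaml
# Minimal sketch: request an in-cluster Longhorn snapshot via the CSI API.
# "demo-snapshot" and "demo-pvc" are hypothetical names for illustration.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: demo-snapshot
  namespace: longhornworkload
spec:
  # Refers to the VolumeSnapshotClass with "type: snap"
  volumeSnapshotClassName: longhorn-snapshot
  source:
    # The PVC must already exist in the same namespace
    persistentVolumeClaimName: demo-pvc
```

Pointing volumeSnapshotClassName at the class with type: bak instead would request a backup to the BackupStore rather than a local snapshot.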
This change in CSI snapshot behavior is important for the following reasons:
- Consistency: As discussed above, the Longhorn UI already supported both backup and snapshot options. Most of the popular persistent storage systems behave similarly to the new implementation.
- Ease of Use: Previously, users had to configure an out-of-cluster BackupStore, even for taking a snapshot. Now, there is no such need, since snapshots are stored in-cluster.
- Compatibility: The most common side effect of the old behavior was that it broke third-party backup solutions that use snapshots. In CloudCasa, for example, PV backups often failed with a time-out, since snapshots weren’t expected to run for hours, and the jobs would often end up as partially successful.
Now purpose-built backup solutions can manage the snapshots and backup copies, setting retention policies for compliance and immutability to protect against ransomware, tampering or accidental deletion.
Community participation with Longhorn team
At the KubeCon NA 2021 conference in Los Angeles last October, we engaged with the Longhorn team about the need to support in-cluster CSI snapshots. We were happy to provide some external testing, which was well received by the Longhorn engineering team. Using CloudCasa, we verified this functionality in the master branch as well as in recent Longhorn release candidates. We at Catalogic congratulate the Longhorn maintainers on the release of v1.3, and we thank the community for the work that went into it.
What is CloudCasa?
CloudCasa by Catalogic is a cyber-resilient backup service to protect Kubernetes workloads. CloudCasa integrates natively with all flavors of Kubernetes as well as Kubernetes management platforms like SUSE Rancher, and managed Kubernetes services such as Azure Kubernetes Service (AKS) and Amazon Elastic Kubernetes Service (EKS). CloudCasa relies on CSI-compliant storage platforms like Longhorn to take and manage snapshots to back up and restore Kubernetes Persistent Volumes (PVs) from recovery points.
Premium service plans provide PV backups along with cluster and cloud metadata backups, to ensure your data is safe and protected with unlimited retention times, and immutable recovery points. The saving of resource data and metadata enables advanced migration and recovery use cases, allowing organizations to easily restore data across clusters, regions, cloud accounts and cloud providers. This is important for disaster recovery scenarios, for cluster migration, and for providing replicas for Dev/Test environments.
The free service plan for CloudCasa has no limits on the number of snapshots managed, worker nodes, or clusters supported, and it provides up to 30 days of local PV snapshot retention and cluster resource data backups on secure, encrypted S3 storage. The premium plans are priced based on the amount of data you protect, not on the number of clusters you have or the number of worker nodes running.
How does CloudCasa back up Longhorn PVs?
Let’s start with a Rancher cluster already configured with Longhorn v1.3. In a previous blog, we covered the process for installing the CloudCasa agent from the SUSE Rancher Apps & Marketplace. The Helm chart for CloudCasa orchestrates installation of the CloudCasa backup agent containers on Rancher managed clusters and connects to the CloudCasa data protection service to register the clusters.
The screenshot below shows a registered cluster, “Longhorn1.3Cluster,” which has a namespace “longhornworkload” with a PV provisioned by Longhorn v1.3.
Next, we add a backup job through the CloudCasa UI, where you can choose to back up either the full cluster or specific namespaces and/or resources tagged with specific labels. In the screenshots below, we added a backup job and selected Full Cluster and all PVs attached to it. To demonstrate the new CSI snapshot process, we selected “Snapshot only” for the job.
The activity details of the backup job can be viewed in real time, and the PV details of the job can be viewed once it has completed.
Now that we have a successful backup, we can delete the namespace and restore it back to the same cluster to show a selective restore of a Longhorn v1.3 PV snapshot that is stored in-cluster. The restore job completes in almost the same time it took for us to back up.
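CloudCasa drives this restore for you, but it can help to see the generic Kubernetes mechanism for provisioning a volume from a CSI snapshot. The following is only a sketch of that mechanism, not a description of what CloudCasa does internally. The PVC and snapshot names are hypothetical, the storage class name longhorn is assumed to be the one configured in your cluster, and the requested size is an example that must be at least as large as the snapshotted volume.

```yaml
# Minimal sketch: provision a new PVC from an existing VolumeSnapshot.
# All names and the size below are assumptions for illustration.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc-restored
  namespace: longhornworkload
spec:
  storageClassName: longhorn      # assumed Longhorn storage class name
  dataSource:
    name: demo-snapshot           # hypothetical existing VolumeSnapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi                # must be >= the size of the source volume
```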
In Summary and Your Next Steps
With CloudCasa, you can now easily use Longhorn v1.3 snapshots as your data protection method. You can implement backups of Longhorn data to any S3 storage, either self-managed or managed by CloudCasa. These snapshots can also be restored to alternate clusters or cloud providers, as well as mapped to different storage classes via the advanced cluster restore capabilities of CloudCasa. But that process didn’t change with this new version of Longhorn, so we will leave that for another day and another blog.
Until then, feel free to create a free account at https://cloudcasa.io/signup and start taking and managing snapshots to protect your clusters. We are confident that you’ll be done before your next coffee run.
Learn more about how CloudCasa and SUSE can help you solve Kubernetes management and data management challenges by visiting our website for more information, getting in touch with the CloudCasa team, or contacting your SUSE Rancher representative. We look forward to talking to you!
Sathya is the COO of Catalogic Software and the founder and General Manager of the CloudCasa business within Catalogic Software, where he provides operational and strategic oversight across R&D, marketing, sales and partner alliances. Sathya was an early enthusiast of the potential for containers and cloud technologies to transform how we innovate and deliver solutions to businesses. He is responsible for Catalogic’s strategic pivot to focus on addressing Day 2 challenges in Kubernetes and cloud native ecosystems, including data protection, cyber-resilience and cloud mobility. As the COO of Catalogic Software, Sathya leads engineering, sales and alliance teams at Catalogic.