Failed etcd snapshot restoration leaves the cluster stuck in a "paused" state

This document (000021399) is provided subject to the disclaimer at the end of this document.

Environment

Rancher Server 2.7.6 and above

Situation

In some cases, a downstream cluster gets into a broken state that requires a Disaster Recovery (DR) process to bring it back to an active state.
Occasionally the DR process does not finish properly and hangs indefinitely, which leaves the cluster in what is called a "paused" state.
To confirm this symptom, check the clusters.cluster.x-k8s.io object in the fleet-default namespace from the local (upstream) cluster:
kubectl get clusters.cluster.x-k8s.io <CLUSTER_NAME> -n fleet-default -o yaml
In the YAML output, the .spec.paused field is set to true.
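
To read only the relevant field rather than scanning the full YAML, the check can be wrapped in a small shell helper. This is a minimal sketch, not part of the original procedure: the function name cluster_paused is hypothetical, and it assumes kubectl is pointed at the local (upstream) cluster.

```shell
# Hypothetical helper: print the value of .spec.paused for the given cluster
# object in the fleet-default namespace. Empty output means the field is unset.
cluster_paused() {
  kubectl get clusters.cluster.x-k8s.io "$1" -n fleet-default \
    -o jsonpath='{.spec.paused}'
}
```

Calling cluster_paused <CLUSTER_NAME> on an affected cluster should print true.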

 

Resolution

To unblock the cluster, perform the following steps:
- Edit the clusters.cluster.x-k8s.io object in the fleet-default namespace from the local (upstream) cluster:
kubectl edit clusters.cluster.x-k8s.io <CLUSTER_NAME> -n fleet-default -o yaml
- Set the .spec.paused field to false.
- Save the file and exit.
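
As a non-interactive alternative to the editor session above, the same field can be changed with kubectl patch. This is a sketch under the assumption that a merge patch on .spec.paused has the same effect as the manual edit; the original procedure only describes kubectl edit, and the function name unpause_cluster is hypothetical.

```shell
# Hypothetical helper: set .spec.paused to false on the given cluster object
# in the fleet-default namespace. Run against the local (upstream) cluster.
unpause_cluster() {
  kubectl patch clusters.cluster.x-k8s.io "$1" -n fleet-default \
    --type merge -p '{"spec":{"paused":false}}'
}
```

Call it as unpause_cluster <CLUSTER_NAME>, then re-check the .spec.paused field to confirm the change was applied.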

These steps instruct Rancher to unpause the cluster so that the stuck restore process can continue.
After making the edit, the recommended approach is to perform the DR process again.
To do so, refer to the Rancher Manager backup and restore documentation and continue the DR process according to the distribution in use (RKE/RKE2/K3s).

Cause

- An unforeseen incident (network failure, OS failure, etc.) left the cluster in a broken state.
- An outage made all control plane nodes completely unavailable.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021399
  • Creation Date: 12-Mar-2024
  • Modified Date: 25-Jun-2024
    • SUSE Rancher
