SUSE Support

RKE2 helm-install job fails with "INSTALLATION FAILED: cannot re-use a name that is still in use"

This document (000021581) is provided subject to the disclaimer at the end of this document.

Environment

  • Rancher v2.7+
  • RKE2 v1.26+

Situation

During an upgrade of an RKE2 cluster, the helm-install Jobs that upgrade bundled components such as rke2-ingress-nginx or rke2-metrics-server may fail.

The logs of the affected helm-install Job pod show the following error message:

Error: INSTALLATION FAILED: cannot re-use a name that is still in use

This situation can occur after a previous update or removal of the component has failed.

This KB describes how to resolve the issue.

Resolution

The following commands, run in order, should resolve the issue:
- `helm ls -A -a` to identify which RKE2-deployed Helm chart is not in a deployed state (`-A` covers all namespaces; `-a` also shows releases in states such as uninstalling or pending, which `helm ls` otherwise filters out)
- `helm -n kube-system history rke2-ingress-nginx` to view the release history for the affected chart (in this example, the rke2-ingress-nginx chart in the kube-system namespace)

NOTE: for affected charts, the most recent revision has a status other than deployed. In the example output below, revision 5 is stuck in the uninstalling status, with a description indicating that deletion is in progress.

REVISION	UPDATED                 	STATUS      	CHART                         	APP VERSION	DESCRIPTION                              
5       	Thu Oct  3 15:32:44 2024	uninstalling	rke2-ingress-nginx-4.10.401	1.10.4      	Deletion in progress (or silently failed)
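
The two checks above can be scripted. The following is a minimal sketch, assuming the tab-separated table layout that `helm ls` and `helm history` print by default; the function names `non_deployed_releases` and `latest_revision` are hypothetical helpers, not part of Helm.

```shell
# Hypothetical helpers that read helm's default table output on stdin.

# Print "NAMESPACE/NAME STATUS" for every release that is not "deployed".
# Expected columns: NAME, NAMESPACE, REVISION, UPDATED, STATUS, CHART, APP VERSION.
non_deployed_releases() {
  awk -F'\t' '
    NR == 1 { next }                      # skip the header row
    NF >= 5 {
      name = $1; ns = $2; st = $5
      gsub(/^ +| +$/, "", name)           # trim column padding
      gsub(/^ +| +$/, "", ns)
      gsub(/^ +| +$/, "", st)
      if (st != "deployed") print ns "/" name " " st
    }'
}

# Print "REVISION STATUS" for the highest revision in `helm history` output.
# Expected columns: REVISION, UPDATED, STATUS, CHART, APP VERSION, DESCRIPTION.
latest_revision() {
  awk -F'\t' '
    NR > 1 && NF >= 3 {
      rev = $1; st = $3
      gsub(/^ +| +$/, "", rev)
      gsub(/^ +| +$/, "", st)
      if (rev + 0 > max) { max = rev + 0; status = st }
    }
    END { if (max > 0) print max " " status }'
}

# Usage against a live cluster (requires helm and cluster access):
#   helm ls -A -a | non_deployed_releases
#   helm -n kube-system history rke2-ingress-nginx | latest_revision
```

For the example output above, `latest_revision` would report revision 5 with status uninstalling, which is the revision whose secret is handled in the next steps.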

- `kubectl get secrets -n kube-system | grep rke2-ingress-nginx` to list the Helm release secrets for the chart

NOTE: each revision X has its own secret, named sh.helm.release.v1.rke2-ingress-nginx.vX.
Following the example above, the secret for the stuck revision is sh.helm.release.v1.rke2-ingress-nginx.v5.
This is the secret to delete.
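
The naming rule in the note can be written as a small shell sketch. Both function names below are hypothetical helpers introduced for illustration; only the `sh.helm.release.v1.<release>.v<revision>` pattern comes from the note itself.

```shell
# Hypothetical helpers (not part of helm or kubectl).

# Build the release secret name for a given release name and revision number.
release_secret_name() {
  printf 'sh.helm.release.v1.%s.v%s\n' "$1" "$2"
}

# Extract the revision number from a release secret name by stripping
# everything up to and including the last ".v".
revision_from_secret_name() {
  printf '%s\n' "${1##*.v}"
}

release_secret_name rke2-ingress-nginx 5
# prints: sh.helm.release.v1.rke2-ingress-nginx.v5
```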

- `kubectl delete secrets -n kube-system sh.helm.release.v1.rke2-ingress-nginx.v5` to delete the affected Helm release secret
- `kubectl delete pods -n kube-system helm-install-rke2-ingress-nginx-xxxxx` to delete the failed helm-install Job pod

The last command deletes the existing helm-install Job pod that is in an error state (CrashLoopBackOff). Once the pod is deleted, a new Job pod is scheduled and, with the stale release secret removed, should run correctly.
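
The two deletions can be prepared with a dry-run style helper that only prints the commands for review. This is a sketch under stated assumptions: `print_cleanup_commands` is a hypothetical function, the backup line is an optional precaution that is not part of the procedure above, and the pod is selected through the `job-name` label on the assumption that the Job is named `helm-install-<release>` (as the pod name in the example suggests).

```shell
# Hypothetical helper: print (do not run) the cleanup commands for a stuck release.
# Arguments: <release> <namespace> <stuck revision>
print_cleanup_commands() {
  secret="sh.helm.release.v1.$1.v$3"
  # Optional precaution, not part of the original procedure: save a copy first.
  echo "kubectl get secret -n $2 $secret -o yaml > $secret.yaml"
  # Delete the release secret of the stuck revision.
  echo "kubectl delete secret -n $2 $secret"
  # Assumption: the Job is named helm-install-<release>, so its pods carry
  # the label job-name=helm-install-<release>.
  echo "kubectl delete pods -n $2 -l job-name=helm-install-$1"
}

print_cleanup_commands rke2-ingress-nginx kube-system 5
```

Printing rather than executing keeps the destructive steps under manual control; each printed line can be reviewed and then run individually.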

Cause

Helm tracks every component it deploys in an RKE2 cluster (e.g. rke2-ingress-nginx or rke2-metrics-server) as a release with numbered revisions in a given namespace.
Whenever one of these components is installed or upgraded, Helm creates a new secret in that namespace to record the new revision and its status.
If the most recent revision is left in a non-deployed state, for example after a failed update or removal, the release name is still considered in use and the helm-install Job fails with the error above. Deleting the secret for that revision unblocks the situation and allows the component upgrade to deploy successfully.
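
To see the per-revision secrets and their recorded state side by side, the secrets can be listed by label instead of by name. This is a minimal sketch, assuming Helm's default secret storage, which labels each release secret with `owner=helm`, `name=<release>`, `version=<revision>` and `status=<status>`; `release_secret_selector` is a hypothetical helper.

```shell
# Hypothetical helper: build the label selector that matches all release
# secrets of one Helm release (assumes Helm's default secret storage labels).
release_secret_selector() {
  printf 'owner=helm,name=%s\n' "$1"
}

# Usage against a live cluster (requires kubectl and cluster access).
# -L adds the "version" and "status" labels as extra output columns:
#   kubectl get secrets -n kube-system \
#     -l "$(release_secret_selector rke2-ingress-nginx)" -L version,status
```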

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021581
  • Creation Date: 09-Oct-2024
  • Modified Date: 05-Nov-2024
    • SUSE Rancher

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com
