Unable to scale up Rancher node pool nodes in downstream cluster due to mismatch in cluster name at VMware vCenter
This document (000021558) is provided subject to the disclaimer at the end of this document.
Environment
VMware node drivers
Situation
- After upgrading the downstream cluster, the following error is shown for the cluster when you click the "Edit Config" option in Cluster Management:
pool1: The provided value for hostsystem was not found in the list of expected values. This can happen with clusters provisioned outside of Rancher or when options for the provider have changed
pool2: The provided value for hostsystem was not found in the list of expected values. This can happen with clusters provisioned outside of Rancher or when options for the provider have changed
pool3: The provided value for hostsystem was not found in the list of expected values. This can happen with clusters provisioned outside of Rancher or when options for the provider have changed
<img alt="image1" src="https://suse.file.force.com/sfc/servlet.shepherd/version/renditionDownload?rendition=ORIGINAL_Png&versionId=068Tr00000L4T4D&operationContext=CHATTER&contentId=05TTr00000SaiRY" />
- When you try to scale up a node pool, Rancher creates a machine resource and then immediately deletes it.
- On the vSphere side, however, no VM is created and no API call for VM creation is recorded.
- Running the "kubectl get vmwarevspheremachine -n fleet-default" command lists all machines of the VMware clusters managed by Rancher, including the machine that was just created as part of the scale-up.
- Running the "kubectl describe" command against the newly created machine then shows why the VM could not be provisioned on vSphere.
- In this case, the VM was not created on vSphere because of the following error message:
Failed creating server [fleet-default/Prod-infra-xxxxxxxx-xhxxx] of kind (VmwarevsphereMachine) for machine Prod-infra-xxxxxxxx-xbxxx in infrastructure provider: CreateError: Running pre-create checks...
(Prod-infra-xxxxxxxx-xhxxx) Connecting to vSphere for pre-create checks...
(Prod-infra-xxxxxxxx-xhxxx) Using datacenter /Kubernetes
(Prod-infra-xxxxxxxx-xhxxx) Using Network /Kubernetes/network/Datacenter-01 LB Kubernetes
Error with pre-create check: "host '/Kubernetes/host/K8s Prod/xxvm22.xxxxx.xxxxxxx.xx' not found"
- Checking vSphere confirmed that the node ttvm26.xxxxx.xxxxxxx.xx existed.
- However, the ESXi cluster name in the error message was "K8s Prod", whereas a vSphere administrator had since renamed the cluster in vSphere. This rename triggered the issue on the Rancher downstream cluster.
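The diagnostic steps above can be sketched as a few small shell helpers. This is only an illustrative sketch: the function names are hypothetical and not part of Rancher or kubectl, the "fleet-default" namespace is the one used in this article, and the text parsing assumes the error has the same shape as the message quoted above.

```shell
# Hypothetical helper names; only the kubectl commands come from this article.

# List the VmwarevsphereMachine resources in the namespace used in this article.
list_vsphere_machines() {
  kubectl get vmwarevspheremachine -n fleet-default
}

# Describe one machine; $1 is a machine name taken from the listing above.
describe_vsphere_machine() {
  kubectl describe vmwarevspheremachine "$1" -n fleet-default
}

# Read text on stdin and print the inventory path from a
# "host '<path>' not found" pre-create error, if one is present.
extract_host_path() {
  sed -n "s/.*host '\([^']*\)' not found.*/\1/p"
}

# Print the cluster segment of a path shaped like
# /<datacenter>/host/<cluster>/<host> (the shape seen in the error above).
cluster_from_host_path() {
  printf '%s\n' "$1" | awk -F/ '{print $4}'
}
```

For example, piping the output of describe_vsphere_machine for the failing machine into extract_host_path, and passing the result to cluster_from_host_path, prints the cluster name Rancher still expects. That name can then be compared with the cluster name currently shown in vSphere.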
Resolution
- The only available option is to rename the ESXi cluster in vSphere back to "K8s Prod", as shown in the error message. Rancher does not provide an option to change or select the updated ESXi cluster resource for an existing downstream cluster.
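After the rename, it may help to confirm that the cluster path Rancher expects exists again in the vSphere inventory. The sketch below assumes the govc CLI is installed and configured for the vCenter in question. govc is not mentioned in this article and is an assumption here, as are the helper function names; the same check can be done by eye in the vSphere client.

```shell
# Hypothetical helper names; govc usage is an assumption, not part of this article.

# Derive the expected cluster path by dropping the trailing host name from
# the inventory path reported in the pre-create error.
expected_cluster_path() {
  printf '%s\n' "${1%/*}"
}

# List cluster inventory paths (assumes govc is configured via its GOVC_* variables).
list_cluster_paths() {
  govc find / -type c
}

# Read inventory paths on stdin; succeed only if one matches "$1" exactly.
cluster_path_present() {
  grep -Fxq -- "$1"
}

# Example use (not executed here):
#   want=$(expected_cluster_path "/Kubernetes/host/K8s Prod/xxvm22.xxxxx.xxxxxxx.xx")
#   list_cluster_paths | cluster_path_present "$want" && echo "cluster name matches"
```

An exact, fixed-string match is used deliberately, because cluster names such as "K8s Prod" contain spaces and a partial match could hide a remaining mismatch.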
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000021558
- Creation Date: 11-Sep-2024
- Modified Date: 20-Dec-2024
- SUSE Rancher
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com