Azure AD - An error occurred logging in Server error while authenticating
This document (000021599) is provided subject to the disclaimer at the end of this document.
Environment
- Rancher versions: 2.9.1, 2.9.2, 2.9.3
- Auth provider: Azure AD
Situation
After upgrading to Rancher 2.9.1, 2.9.2, or 2.9.3 with Azure AD configured as the main auth provider, users become unable to log in after a certain amount of time and receive the following message:
An error occurred logging in Server error while authenticating
This happens because the local Rancher cluster has a secret called `azuread-access-token` in the `cattle-global-data` namespace, and user login information is appended to it whenever a user logs in. Over time, the secret grows until it reaches the Kubernetes maximum secret size of 1 MiB (1048576 bytes).
Note: Once the secret reaches this limit, Rancher can no longer update it, and that is when users start being unable to log in to Rancher. To verify its size, you can run either of the following commands (the first assumes the secret holds a single data key):
kubectl get secret azuread-access-token -n cattle-global-data -o jsonpath="{.data.*}" | base64 -d | wc -c
or
kubectl describe secret azuread-access-token -n cattle-global-data | grep bytes
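The size check above can be turned into a small helper that compares the reported size against the limit. This is a minimal sketch, not part of the official guidance: the function names and the 80% warning threshold are assumptions chosen for illustration, and it relies on `kubectl describe secret` printing one `<key>:  <N> bytes` line per data key.

```shell
#!/bin/sh
# Kubernetes Secret size limit quoted in this article, in bytes.
LIMIT_BYTES=1048576
# Hypothetical warning threshold (percent of the limit); not from the article.
WARN_PERCENT=80

# Read `kubectl describe secret` output on stdin and print the sum of all
# "<key>:  <N> bytes" lines (prints 0 when no such line is found).
sum_secret_bytes() {
  awk '$NF == "bytes" { total += $(NF - 1) } END { print total + 0 }'
}

# Print a one-line status for a size given in bytes.
classify_secret_size() {
  size_bytes=$1
  percent=$(( $size_bytes * 100 / $LIMIT_BYTES ))
  if [ "$size_bytes" -ge "$LIMIT_BYTES" ]; then
    echo "at or over limit: ${size_bytes} bytes (${percent}% of ${LIMIT_BYTES})"
  elif [ "$percent" -ge "$WARN_PERCENT" ]; then
    echo "warning: ${size_bytes} bytes (${percent}% of ${LIMIT_BYTES})"
  else
    echo "ok: ${size_bytes} bytes (${percent}% of ${LIMIT_BYTES})"
  fi
}

# Example usage against the local Rancher cluster:
#   size=$(kubectl describe secret azuread-access-token -n cattle-global-data | sum_secret_bytes)
#   classify_secret_size "$size"
```

Keeping the parsing and the classification in separate functions means the threshold logic can be tried with plain numbers before pointing it at a live cluster.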
Resolution
As a workaround, delete the `azuread-access-token` secret in the `cattle-global-data` namespace. Once deleted, verify that the secret is indeed gone. The secret is recreated when a user logs back in to Rancher, and its size should then be smaller.
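The workaround can be sketched as a short shell function that deletes the secret and then checks that it no longer exists. This is an illustrative sketch only: the function name and the `KUBECTL` override variable are assumptions introduced here, while the secret name and namespace are the ones given in this article.

```shell
#!/bin/sh
# Secret name and namespace as given in this article.
SECRET_NAME=azuread-access-token
SECRET_NAMESPACE=cattle-global-data
# Hypothetical override hook so a wrapper command can be substituted for kubectl.
KUBECTL=${KUBECTL:-kubectl}

# Delete the secret, then confirm that it is gone.
# If a user logs in between the two calls, the secret may already have been
# recreated, in which case this reports that it still exists.
remove_token_secret() {
  "$KUBECTL" delete secret "$SECRET_NAME" -n "$SECRET_NAMESPACE" || return 1
  if "$KUBECTL" get secret "$SECRET_NAME" -n "$SECRET_NAMESPACE" >/dev/null 2>&1; then
    echo "secret $SECRET_NAME still exists in $SECRET_NAMESPACE"
    return 1
  fi
  echo "secret $SECRET_NAME deleted from $SECRET_NAMESPACE"
}

# Example usage against the local Rancher cluster:
#   remove_token_secret
```

Run it with a kubeconfig that points at the local Rancher cluster, since that is where the secret lives.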
The official patch changes this behavior so that a new client, which does not use the cache, is created for every token authentication. This patch will be included in Rancher 2.10 and backported to 2.9, specifically 2.9.4 and later:
- https://github.com/rancher/rancher/issues/47672 (For 2.10)
- https://github.com/rancher/rancher/issues/47688 (2.9 backport)
Cause
The `azuread-access-token` secret fills up with user login information until it eventually hits the Kubernetes maximum secret size. The Azure client login was using the access token cache, which led to additional tokens being cached.
Status
Additional Information
In the local Rancher cluster, the Rancher pod logs may contain errors such as the following:
[ERROR] API error response 500 for POST /v3-public/azureADProviders/azuread?action=login. Cause: getting OID from AuthCode: error updating secret azuread-access-token: Secret "azuread-access-token" is invalid: data: Too long: must have at most 1048576 bytes
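To check whether this error is present, the pod logs can be filtered for its signature. The sketch below counts matching lines read from standard input. The function name is hypothetical, and the `cattle-system` namespace and `app=rancher` label in the usage comment are assumptions about a typical Rancher installation that are not stated in this article.

```shell
#!/bin/sh
# Count log lines on stdin that carry the error signature quoted in this article.
# grep -c prints 0 and exits non-zero when nothing matches, hence the "|| true".
count_token_secret_errors() {
  grep -c 'error updating secret azuread-access-token.*Too long' || true
}

# Example usage (namespace and label selector are assumptions; adjust as needed):
#   kubectl logs -n cattle-system -l app=rancher --tail=-1 | count_token_secret_errors
```

A non-zero count suggests the secret has hit the size limit described in this article.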
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000021599
- Creation Date: 24-Oct-2024
- Modified Date: 07-Nov-2024
- SUSE Rancher
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com