Manual Rotation of Certificates in Rancher Kubernetes Clusters
Introduction
This post covers certificate rotation for RKE clusters. Rancher can also be deployed on RKE2 or K3s clusters, and the Rancher UI can provision RKE2, K3s, AKS, EKS, and GKE clusters in addition to RKE.
Kubernetes clusters use multiple certificates both to encrypt traffic to the Kubernetes components and to authenticate those requests. These certificates are auto-generated for clusters launched by Rancher and for clusters launched by the Rancher Kubernetes Engine (RKE) CLI.
In Rancher, the auto-generated certificates for Rancher-launched Kubernetes clusters have a validity period of one year, meaning these certificates will expire one year after the cluster is provisioned. The same applies to Kubernetes clusters provisioned by v0.1.x of the Rancher Kubernetes Engine (RKE) CLI.
If you created a Rancher-launched or RKE-provisioned Kubernetes cluster about one year ago, you need to rotate the certificates. If no action is taken, then when the certificates expire, the cluster will go into an error state and the Kubernetes API for the cluster will become unavailable. Rancher recommends that you rotate the certificates before they expire to avoid an unexpected service interruption. The rotation is a one-time operation, and the newly generated certificates will be valid for the next 10 years.
The instructions below detail how to rotate the certificates in both Rancher-launched and RKE-provisioned clusters, whether the certificates are still valid or have already expired.
Rotating Kubernetes certificates may result in your cluster being temporarily unavailable as components are restarted. For production environments, it’s recommended to perform this action during a maintenance window.
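To judge how urgent a rotation is, it can help to read the expiry date directly from a certificate on a node. The following is a minimal sketch rather than an official Rancher tool. It assumes openssl is installed on the node, and the helper names cert_not_after and cert_expires_within are made up for illustration. The path in the usage comment assumes the API server certificate sits under /etc/kubernetes/ssl, which may differ on your nodes.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Rancher or RKE) for inspecting certificate expiry.

# Print the notAfter (expiry) date of a PEM certificate.
cert_not_after() {
  openssl x509 -noout -enddate -in "$1" | sed 's/^notAfter=//'
}

# Return 0 if the certificate expires within the given number of days
# (or has already expired), 1 if it stays valid for longer,
# and 2 if the file is unreadable or is not a parseable certificate.
cert_expires_within() {
  cert_file=$1
  days=$2
  [ -r "$cert_file" ] || return 2
  openssl x509 -noout -in "$cert_file" >/dev/null 2>&1 || return 2
  if openssl x509 -noout -checkend "$((days * 86400))" -in "$cert_file" >/dev/null; then
    return 1   # still valid after the window
  fi
  return 0     # expires inside the window
}

# Example usage (the path is an assumption; adjust it to your node):
#   cert_not_after /etc/kubernetes/ssl/kube-apiserver.pem
#   cert_expires_within /etc/kubernetes/ssl/kube-apiserver.pem 30 && echo "rotate soon"
```

The check relies on the -checkend option of openssl x509, which takes a window in seconds, so the helper converts days to seconds before calling it.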
RKE Clusters Launched by Rancher
Rancher provides UI support for certificate rotation (available since Rancher v2.2). If you are unable to upgrade your Rancher v2.0.x or v2.1.x instances to v2.2.x, then you can upgrade them to v2.0.15 and v2.1.10 respectively. These versions contain certificate rotation support via the API, and detailed steps for this can be found in the documentation.
Working Cluster / Valid Certs
To rotate the certificates on a Rancher-launched cluster for which certificates are still valid, follow these steps:
- As a preliminary step, update your cluster so it goes through the Rancher Kubernetes Engine (RKE) provisioning process. This refreshes the cluster state and configurations. To do so, you can either upgrade your cluster to a newer Kubernetes version or simply change one of the existing parameters on the cluster to trigger the cluster reconciliation process via RKE.
  - To upgrade the Kubernetes version, browse to the cluster in the Rancher UI, click the vertical ellipses, and click Edit. Select the newer Kubernetes Version under Cluster Options and click Save.
  - To trigger reconciliation by changing a parameter with minimal impact, browse to the cluster in the Rancher UI, click the vertical ellipses, and click Edit. Click Edit as YAML, change addon_job_timeout to 50, and click Save.
- Rotate the certificates:
  - Rancher v2.2.4+: If you are running Rancher v2.2.4 or higher, you can rotate certificates from the UI. To do so, browse to the cluster in the Rancher UI, click the vertical ellipses, click Rotate Certificates, select Rotate all service certificates, and click Save.
  - Rancher v2.0.15 or v2.1.10: If you are running Rancher v2.0.15 or v2.1.10, perform the certificate rotation from the API, per the documentation.
After following these steps, the certificates will be rotated and will have a validity of 10 years.
Non-working Cluster / Expired Certs
If your Rancher-launched Kubernetes cluster is already in an error state because the certificates have expired, follow these steps to rotate the certificates:
- Upgrade Rancher to v2.2.4 or greater.
- Open a shell session to the etcd and control plane nodes for the cluster and check if the directory /etc/kubernetes/.tmp contains the file kube-apiserver-requestheader-ca.pem. If this file is absent, perform the following manual copy:
  cp /etc/kubernetes/.tmp/kube-ca.pem /etc/kubernetes/.tmp/kube-apiserver-requestheader-ca.pem
  cp /etc/kubernetes/.tmp/kube-ca-key.pem /etc/kubernetes/.tmp/kube-apiserver-requestheader-ca-key.pem
  cp /etc/kubernetes/.tmp/kube-apiserver.pem /etc/kubernetes/.tmp/kube-apiserver-proxy-client.pem
  cp /etc/kubernetes/.tmp/kube-apiserver-key.pem /etc/kubernetes/.tmp/kube-apiserver-proxy-client-key.pem
- To rotate certificates, browse to the cluster in the Rancher UI, click the vertical ellipses, click Rotate Certificates, select Rotate all service certificates, and click Save.
- If the UI shows no activity on the cluster while the rotation is happening, and if the log still reports Expired cert, perform the steps described in Rancher Issue #20822.
- After the rotation is finished, browse to the Nodes view for the cluster within the Rancher UI and check the state of the Worker nodes. If the state is not Active, do the following:
  - Copy the following certificates from a Kubernetes control plane node to each worker node, under the same location:
    /etc/kubernetes/ssl/kube-node.pem
    /etc/kubernetes/ssl/kube-proxy.pem
  - Restart the kubelet and kube-proxy containers on each worker:
    docker restart kubelet
    docker restart kube-proxy
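With many workers, the copy-and-restart step for nodes that are not Active can be scripted. The sketch below is one possible way to do it, under some assumptions: it runs from a machine with SSH access to both the control plane node and the worker, the SSH user may write to /etc/kubernetes/ssl and run docker, and the names sync_worker_certs, run, and DRY_RUN are hypothetical helpers introduced here. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: copy node certificates from a control plane node to a
# worker, then restart the kubelet and kube-proxy containers on that worker.

CERT_DIR=/etc/kubernetes/ssl
CERTS="kube-node.pem kube-proxy.pem"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: sync_worker_certs <control-plane-host> <worker-host>
sync_worker_certs() {
  control_host=$1
  worker_host=$2
  for cert in $CERTS; do
    # scp -3 routes the copy between the two remote hosts through this machine.
    run scp -3 "$control_host:$CERT_DIR/$cert" "$worker_host:$CERT_DIR/$cert" || return 1
  done
  run ssh "$worker_host" docker restart kubelet || return 1
  run ssh "$worker_host" docker restart kube-proxy || return 1
}

# Example (host names are placeholders):
#   DRY_RUN=1 sync_worker_certs root@cp-1 root@worker-1
```

Running it once with DRY_RUN=1 is a cheap way to review the exact commands before touching a production node.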
Clusters Launched by the RKE CLI
If you are running Rancher in High Availability (HA) mode and used a version of RKE less than v0.2.0 to provision the cluster where the Rancher server has been installed via Helm, the certificates on that management cluster have to be rotated using the RKE CLI.
Prerequisites
Before conducting the certificate rotation, please verify the presence of the kube-apiserver-requestheader-ca.pem file.
To do so, open a shell session to the etcd and control plane nodes for the cluster and check if the directory /etc/kubernetes/.tmp contains the file kube-apiserver-requestheader-ca.pem. If this file is absent, perform the following manual copy:
cp /etc/kubernetes/.tmp/kube-ca.pem /etc/kubernetes/.tmp/kube-apiserver-requestheader-ca.pem
cp /etc/kubernetes/.tmp/kube-ca-key.pem /etc/kubernetes/.tmp/kube-apiserver-requestheader-ca-key.pem
cp /etc/kubernetes/.tmp/kube-apiserver.pem /etc/kubernetes/.tmp/kube-apiserver-proxy-client.pem
cp /etc/kubernetes/.tmp/kube-apiserver-key.pem /etc/kubernetes/.tmp/kube-apiserver-proxy-client-key.pem
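If you need to repeat this check on several nodes, the test and the four copies can be wrapped in one function. This is only a sketch: the name ensure_requestheader_certs is hypothetical, and the directory argument exists mainly so the logic can be tried against a scratch directory before running it on a real node.

```shell
#!/bin/sh
# Hypothetical helper wrapping the manual copy above; run it on each etcd and
# control plane node. The directory defaults to /etc/kubernetes/.tmp.

ensure_requestheader_certs() {
  dir=${1:-/etc/kubernetes/.tmp}
  if [ -f "$dir/kube-apiserver-requestheader-ca.pem" ]; then
    echo "present: nothing to do"
    return 0
  fi
  # Same four copies as the manual steps; the chain stops at the first failure.
  cp "$dir/kube-ca.pem"            "$dir/kube-apiserver-requestheader-ca.pem" &&
  cp "$dir/kube-ca-key.pem"        "$dir/kube-apiserver-requestheader-ca-key.pem" &&
  cp "$dir/kube-apiserver.pem"     "$dir/kube-apiserver-proxy-client.pem" &&
  cp "$dir/kube-apiserver-key.pem" "$dir/kube-apiserver-proxy-client-key.pem" &&
  echo "copied"
}

# Example: ensure_requestheader_certs
```

The function is safe to run more than once, because it skips the copies when the request header CA file is already there.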
Working Cluster / Valid Certs
To rotate the certificates on an RKE v0.1.x provisioned cluster for which certificates are still valid, follow these steps:
- First ensure you have performed the steps under the Prerequisites section above.
- Upgrade the RKE CLI to the latest version. The RKE releases and downloads can be found on GitHub.
- Run rke up --config cluster.yml to refresh your cluster. Note: Please ensure that both your cluster.yml configuration file and the kube_config_cluster.yml file are present in the working directory when invoking rke.
- Rotate the certificates using the following command:
  rke cert rotate --config cluster.yml
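The refresh and rotate steps above can be chained in a small script that refuses to start when either file is missing from the working directory. This is a sketch under assumptions: the function names rke_files_present and refresh_and_rotate are hypothetical, and the rke binary on your PATH is assumed to be the upgraded one.

```shell
#!/bin/sh
# Hypothetical wrapper around the two RKE commands above.

# Return 0 only if both files RKE needs are in the given directory.
rke_files_present() {
  dir=${1:-.}
  missing=0
  for f in cluster.yml kube_config_cluster.yml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Refresh the cluster, then rotate; stops at the first failing step.
refresh_and_rotate() {
  rke_files_present . || return 1
  rke up --config cluster.yml || return 1
  rke cert rotate --config cluster.yml
}

# Example: cd into the directory holding cluster.yml, then run
#   refresh_and_rotate
```

Stopping after a failed rke up is a deliberate choice here, so the rotation never runs against a cluster whose refresh did not complete.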
Non-working Cluster / Expired Certs
If your RKE-provisioned cluster is already in an error state because the certificates have expired, follow these steps:
- First ensure you have performed the steps under the Prerequisites section above.
- Upgrade the RKE CLI to the latest version. The RKE releases and downloads can be found on GitHub.
- Rotate the certificates using the following command:
  rke cert rotate --config cluster.yml
  Note: Please ensure that both your cluster.yml configuration file and the kube_config_cluster.yml file are present in the working directory when invoking rke.
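After a rotation with the RKE CLI, one way to confirm that the cluster has recovered is to list any nodes that are not Ready. The sketch below assumes kubectl is installed on the machine where you ran rke and that kube_config_cluster.yml is in the working directory. The function name not_ready_nodes is hypothetical. It only filters text on standard input, so it can be tried on sample output first.

```shell
#!/bin/sh
# Hypothetical filter: read "kubectl get nodes --no-headers" output on stdin
# and print the names of nodes whose STATUS column is not Ready.
# "Ready,SchedulingDisabled" (a cordoned node) is treated as Ready.
not_ready_nodes() {
  awk 'NF >= 2 && $2 != "Ready" && $2 !~ /^Ready,/ { print $1 }'
}

# Example (assumes kubectl and the kubeconfig written by rke):
#   kubectl --kubeconfig kube_config_cluster.yml get nodes --no-headers | not_ready_nodes
```

An empty result means every listed node reports Ready.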
ChangeLog
- 2019-06-25: Updated to reflect additional prerequisites for clusters launched by the RKE CLI