Performing `ceph orch restart mgr` results in an endless restart loop
This document (000020530) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Enterprise Storage 7
Situation
After running the following command, the SES mgr daemons restart continually, in what appears to be an endless loop:

# ceph orch restart mgr

Running the following command shows output similar to the example below:

# ceph config-key dump

... "mgr/cephadm/host.node1": "{\"... \"scheduled_daemon_actions\": {\"mgr.node1.puuiwd\": \"restart\"}}" ...
(and so on for other mgr instances on other nodes)
Resolution
This was reported on the ceph-users mailing list, with the subject '"ceph orch restart mgr" creates manager daemon restart loop'.
Adam King's suggestion was to move the mgr instance to another host, then re-apply the configuration to the original host to get it redeployed, as sketched below.
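A sketch of that approach using `ceph orch apply`. The host names node1, node2 and node3 are placeholders; adjust the placement to match the existing mgr service specification:

# ceph orch apply mgr --placement="node2,node3"

Once the mgr daemon has been removed from node1 and a standby mgr is active on another host, re-apply the original placement so it is redeployed:

# ceph orch apply mgr --placement="node1,node2,node3"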
Another workaround, confirmed in testing, is to run `ceph orch daemon rm` on each mgr instance one after another; they will automatically be redeployed, but with different random IDs. The scheduled restart action never goes away, but it no longer matters because the daemon IDs have changed.
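For example, using the daemon name shown in the output above: list the current mgr daemons first, then remove them one at a time, waiting for cephadm to redeploy each one before removing the next, and never remove the last remaining mgr:

# ceph orch ps --daemon-type mgr
# ceph orch daemon rm mgr.node1.puuiwd --force

The --force flag may be required because the daemon is managed by the mgr service specification; cephadm will redeploy a replacement with a new random ID.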
This issue is resolved with SES 7.1 (Ceph Pacific 16.2.7-650). Upgrade the cluster to SES 7.1 to resolve it. See:
https://documentation.suse.com/ses/7.1/single-html/ses-deployment/#book-storage-deployment
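After the upgrade, the Ceph release running on the cluster can be verified; all daemons should report version 16.2.7 or later:

# ceph versions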
Cause
The restart request is stored persistently as a "scheduled_daemon_actions" entry in the cephadm configuration, as shown in the `ceph config-key dump` output above. The entry is not cleared after the mgr daemons have been restarted, so cephadm keeps re-applying the scheduled restart each time a mgr daemon comes back up, producing the restart loop.
Status
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020530
- Creation Date: 23-Dec-2021
- Modified Date: 30-Mar-2022
- SUSE Enterprise Storage
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com