Enable or re-enable Cephx authentication on a SUSE Enterprise Storage Cluster
This document (7018435) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Enterprise Storage 4
Situation
Resolution
1. Verify the current authentication information known to the cluster using "ceph auth list" and confirm the output shows the proper key information and capabilities (caps). Below are example excerpts of the default key entries for the metadata server(s) (mds), the Object Storage Daemons (osds) and the client.* entries that should normally be present:
mds.ses_server
key: AQBag6xXFRgTEhAA1LEgmlq52lMNfcOf7uL4vg==
caps: [mds] allow
caps: [mon] allow profile mds
caps: [osd] allow rwx...
osd.0
key: AQDVXSdXDJBKDRAA1Ij5Zyh+fR2TnaGZPg/CYQ==
caps: [mon] allow profile osd
caps: [osd] allow *
...
client.bootstrap-mds
key: AQAmUidXTe05EhAAE91HR//3LanzUFZypUkU8w==
caps: [mon] allow profile bootstrap-mds
client.bootstrap-osd
key: AQAlUidX0vBUNBAAI/39hYlFgz4Usj7sWYJlyA==
caps: [mon] allow profile bootstrap-osd
client.bootstrap-rgw
key: AQAmUidXUNCtBRAALMYWGmB499amzwh7WaIjLg==
caps: [mon] allow profile bootstrap-rgw
2. If required, update the capabilities (caps) with a command like the one below, replacing the key name and values as appropriate:
:~> ceph auth caps client.admin mon 'allow *' mds 'allow *' osd 'allow *'
NOTE: When updating caps it is important to always specify any already existing capabilities again, because the command overwrites the existing caps rather than updating them. For example, running the above command with only "mon 'allow *'" specified would remove the existing caps for mds and osd.
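Because "ceph auth caps" replaces the entire capability set, it can help to build the full command from one explicit list so that nothing is silently dropped. The bash helper below is a hypothetical sketch (neither "update_caps" nor the CEPH override exist in Ceph); every daemon whose caps must survive has to be listed again:

```shell
# Hypothetical sketch: run "ceph auth caps" with an explicit, complete list of
# capabilities. CEPH can be overridden (e.g. CEPH=echo) for a dry run.
CEPH="${CEPH:-ceph}"

update_caps() {
  # usage: update_caps <entity> <daemon>=<caps> [<daemon>=<caps> ...]
  local entity="$1"; shift
  local pair
  local -a args=()
  for pair in "$@"; do
    args+=("${pair%%=*}" "${pair#*=}")  # split "osd=allow *" into "osd" "allow *"
  done
  $CEPH auth caps "$entity" "${args[@]}"
}
```

For example, running `CEPH=echo update_caps client.admin "mon=allow *" "mds=allow *" "osd=allow *"` prints the arguments that would be passed to ceph without changing anything.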
3. Verify the changes with:
:~> ceph auth get client.admin
exported keyring for client.admin
[client.admin]
key = AQAu9VtYX4brIhAAUHnOG6vx6rujHWC7hQjZXQ==
caps mds = "allow *"
caps mon = "allow *"
caps osd = "allow *"
4. If the output shows the expected information, edit the "ceph.conf" file in the local directory from which ceph-deploy was originally executed on the admin node when the cluster was installed / configured, and uncomment / re-add the authentication entries:
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
5. If present, comment out / remove:
auth cluster required = none
auth service required = none
auth client required = none
auth supported = none
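After these edits, the authentication settings (typically kept in the [global] section of ceph.conf) should look similar to the following excerpt:

```ini
[global]
# cephx re-enabled; the "= none" entries have been removed
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
```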
NOTE: If there is a ceph.conf in the local directory from which the command in step 6 below is run, ceph-deploy will push this file to the other nodes.
6. Push the new change to all nodes (adjust node entries for the current environment):
:~> sudo ceph-deploy --overwrite-conf admin ceph_node-1 ceph_node-2 ceph_node-3 ...
7. Confirm the changes by verifying the "/etc/ceph/ceph.conf" files on some (or all) of the nodes, for example from the admin node:
:~> ssh ceph_node-1 cat /etc/ceph/ceph.conf && ssh ceph_node-2 cat /etc/ceph/ceph.conf ...
NOTE: The above assumes that the ssh key (for the user being used) was copied over to the cluster nodes; no authentication prompt should then appear and only the output of the commands should be displayed.
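When checking many nodes, the verification can also be scripted. The helper below is a hypothetical sketch (the function name is an example, not part of Ceph or SUSE tooling); it succeeds only if all three cephx settings are present in a given ceph.conf:

```shell
# Hypothetical helper: succeed only if the given ceph.conf enables cephx for
# all three auth settings. Both the "auth_cluster_required" and the
# "auth cluster required" spellings are accepted.
check_cephx_enabled() {
  conf="$1"
  for key in cluster service client; do
    grep -Eq "^[[:space:]]*auth[ _]${key}[ _]required[[:space:]]*=[[:space:]]*cephx" "$conf" || return 1
  done
}
```

It could then be combined with the ssh loop above, e.g. `ssh ceph_node-1 cat /etc/ceph/ceph.conf > /tmp/node.conf && check_cephx_enabled /tmp/node.conf && echo OK` (node name is an example).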
8. To prevent re-balancing while re-starting the ceph services execute from one of the nodes:
:~> sudo ceph osd set noout
9. Restart the ceph cluster one node at a time by restarting the ceph related services on each node with:
:~> sudo systemctl restart ceph.target
10. Verify the services started, for example the OSD services, with:
:~> sudo systemctl status --all 'ceph-osd@*' | grep 'Active:'
NOTE: Once all services are running proceed and restart the services for the next node.
NOTE: If the "--all" option is not used with the command from step 10 above, inactive services will not be listed.
11. If, for example, the OSD services did not start (or not all of them started) on a node, check the status of each OSD service that is not running (replace XX with the relevant OSD number):
:~> systemctl status ceph-osd@XX.service
...
systemd[1]: ceph-osd@XX.service: Failed with result 'start-limit'.
...
12. If the above "start-limit" reason is listed for the failed services, run the following for each failed OSD service:
:~> systemctl reset-failed ceph-osd@XX.service
:~> systemctl start ceph-osd@XX.service
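Steps 11 and 12 can also be scripted so that every ceph-osd unit that ended up in the failed state is reset and started again. The bash sketch below is illustrative only (the function and the SYSTEMCTL override are not part of Ceph or SUSE tooling); overriding SYSTEMCTL with a wrapper that only prints allows a dry run:

```shell
# Hypothetical sketch: reset and restart every failed ceph-osd unit.
# SYSTEMCTL defaults to the real systemctl but can be overridden for a dry run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

recover_failed_osds() {
  for unit in $($SYSTEMCTL --failed --plain --no-legend list-units 'ceph-osd@*' | awk '{print $1}'); do
    $SYSTEMCTL reset-failed "$unit"   # clear the start-limit state
    $SYSTEMCTL start "$unit"          # try to start the OSD again
  done
}
```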
13. The service should now start. If it still fails, first restart only the MON services on all nodes; if after this step some OSD services on a specific node still fail, the only remaining option is likely to reboot the affected node.
14. Once the services on all the nodes have been restarted, unset noout:
:~> sudo ceph osd unset noout
15. Finally verify cluster health with:
:~> sudo ceph -s
Cause
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7018435
- Creation Date: 04-Jan-2017
- Modified Date: 03-Mar-2020
- SUSE Enterprise Storage
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com