Cluster status reports MDSs behind on trimming
This document (000019740) is provided subject to the disclaimer at the end of this document.
Environment
Situation
HEALTH_WARN <x> MDSs behind on trimming
HEALTH_WARN <x> clients failing to respond to cache pressure
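As a rough way to confirm which of the two warnings above is present, the health output can be filtered for the message text. The following is a minimal sketch, not a SUSE-provided tool: the function name check_mds_warnings and its output labels are hypothetical, and it matches only on the message strings quoted in this section.

```shell
#!/bin/sh
# Hypothetical helper: reads cluster health text on stdin and reports which of
# the two warnings described in this article appear in it.
# Returns 0 if at least one warning was found, 1 otherwise.
check_mds_warnings() {
    health_text=$(cat)
    found=1
    if printf '%s\n' "$health_text" | grep -q 'behind on trimming'; then
        echo "trimming: MDS behind on trimming"
        found=0
    fi
    if printf '%s\n' "$health_text" | grep -q 'failing to respond to cache pressure'; then
        echo "recall: clients failing to respond to cache pressure"
        found=0
    fi
    return "$found"
}

# Usage on a cluster node (not executed here):
#   ceph health detail | check_mds_warnings
```

The function only reads text from standard input, so it can also be run against saved health output.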
Resolution
ceph config set mds mds_cache_trim_decay_rate x.x (should initially be decreased)
ceph config set mds mds_recall_max_caps xxxx (should initially be increased)
ceph config set mds mds_recall_max_decay_rate x.xx (should initially be decreased)
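Before changing any of the values above, it can help to record the current setting so that it can be restored later. The sketch below assumes that "ceph config get mds <option>" prints the current value. The wrapper name set_mds_option and the CEPH_BIN override are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# CEPH_BIN is a hypothetical override so the function can be dry-run with a stub;
# by default the real ceph CLI is used.
CEPH_BIN=${CEPH_BIN:-ceph}

# set_mds_option KEY VALUE
# Prints the current value of KEY for the mds section, then applies VALUE.
set_mds_option() {
    key=$1
    new_value=$2
    # Read the current value first so it is visible in the terminal or a log.
    old_value=$("$CEPH_BIN" config get mds "$key") || return 1
    echo "$key: $old_value -> $new_value"
    "$CEPH_BIN" config set mds "$key" "$new_value"
}

# Usage (x.x and xxxx are placeholders, as in the commands above):
#   set_mds_option mds_cache_trim_decay_rate x.x
#   set_mds_option mds_recall_max_caps xxxx
#   set_mds_option mds_recall_max_decay_rate x.xx
```

Changing one option at a time and watching the cluster health between changes makes it easier to tell which adjustment had an effect.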
Cause
Additional Information
Note that the adjusted settings, when set as described in the Resolution section, are not permanent and will revert to their defaults once an MDS is restarted. The appropriate value for "mds_cache_memory_limit" depends on the total amount of memory available on the server. If feasible, double the current setting.
If the customized settings clear the "MDS behind on trimming" warnings and no adverse effects are observed (the main concerns are high CPU load on the MDS and slower metadata operations on the client side), consider making the adjusted mds_cache_trim.* settings permanent.
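The doubling of "mds_cache_memory_limit" can be sketched as follows. This assumes that "ceph config get mds mds_cache_memory_limit" prints the limit as a plain number of bytes. The function name doubled_cache_limit and the CEPH_BIN override are hypothetical. The function only prints the doubled value and does not apply it, so the result can be compared with the memory available on the MDS host first.

```shell
#!/bin/sh
# CEPH_BIN is a hypothetical override so the function can be dry-run with a stub.
CEPH_BIN=${CEPH_BIN:-ceph}

# Prints twice the current mds_cache_memory_limit (in bytes) without applying it.
# Assumes the value returned by the CLI is a plain integer byte count.
doubled_cache_limit() {
    current=$("$CEPH_BIN" config get mds mds_cache_memory_limit) || return 1
    echo $((current * 2))
}

# Usage (not executed here); apply only if the host has enough free memory:
#   ceph config set mds mds_cache_memory_limit "$(doubled_cache_limit)"
```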
Also see TID 000019591: When running "du" command on a cephfs mount, ceph -s reports 1 MDSs report oversized cache.
To get more details on client caps usage, the following command can be useful:
ceph daemonperf mds.<ins_mds_server_name> (needs to be executed on the MDS host)
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000019740
- Creation Date: 21-Oct-2020
- Modified Date: 24-Nov-2021
- SUSE Enterprise Storage
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com