mds daemon fails to start with "FAILED assert(g_conf->mds_wipe_sessions)" message
This document (000020284) is provided subject to the disclaimer at the end of this document.
Environment
Situation
Below is the crash:
0> 2021-06-05 21:21:09.112447 7f3a4baf6700 -1 /home/abuild/rpmbuild/BUILD/ceph-12.2.13-706-gff66d09906/src/mds/journal.cc: In function 'void EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread 7f3a4baf6700 time 2021-06-05 21:21:09.109003
/home/abuild/rpmbuild/BUILD/ceph-12.2.13-706-gff66d09906/src/mds/journal.cc: 1602: FAILED assert(g_conf->mds_wipe_sessions)
 ceph version 12.2.13-706-gff66d09906 (ff66d09906a7c2d8f4dbf1d17cbdfce9c10483ca) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x5654ce7c3e7e]
 2: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x4506) [0x5654ce76cf46]
 3: (EUpdate::replay(MDSRank*)+0x26) [0x5654ce76e906]
 4: (MDLog::_replay_thread()+0x602) [0x5654ce725412]
 5: (MDLog::ReplayThread::entry()+0xd) [0x5654ce4af7bd]
 6: (()+0x96b4) [0x7f3a58a5d6b4]
 7: (clone()+0x6d) [0x7f3a57a8a2dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Resolution
- Ensure that no clients are trying to connect to CephFS.
- Add "mds_wipe_sessions = true" to ceph.conf in the [global] or [mds] section.
- Start the mds daemon.
- If it starts successfully, remove the "mds_wipe_sessions = true" setting and restart the mds daemon to confirm that it no longer crashes without this option. If it still crashes, set the option back again.
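As an illustration of the second step, the ceph.conf change might look like the fragment below. It assumes the [mds] section is used rather than [global]; any other settings already present in that section stay as they are.

```ini
[mds]
# Temporary workaround; remove again once the mds daemon starts cleanly.
mds_wipe_sessions = true
```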
Cause
> 2021-06-05 00:00:03.309852 7f51730f0700 -1 log_channel(cluster) log [ERR] : error replaying open sessions(0) sessionmap v 223454462 table 0
This message shows that the sessionmap currently holds 0 sessions (it is empty). The error occurs because, when a session journal event is replayed, the current sessionmap version is expected to be no lower than the version of the event being replayed (223454462), but it is unexpectedly 0.
With the parameter 'mds_wipe_sessions' set to true, instead of crashing, the mds wipes the sessions from the sessionmap (which is already empty anyway) and bumps the sessionmap version to the event version.
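The behaviour described above can be sketched as a small toy model. This is not the Ceph source code: the `SessionMap` class and the `replay_session_event` function are hypothetical names made up for illustration, and the check is simplified to exactly the rule stated in this section (the table version must not be lower than the event version).

```python
class SessionMap:
    """Toy stand-in for the mds session table (hypothetical, not Ceph's class)."""

    def __init__(self, version=0, sessions=None):
        self.version = version
        self.sessions = dict(sessions or {})

    def wipe(self):
        # Drop every known session.
        self.sessions.clear()


def replay_session_event(sessionmap, event_version, mds_wipe_sessions=False):
    """Replay one journal event that expects the table at event_version or newer."""
    if sessionmap.version >= event_version:
        return "ok"
    # The table is older than the journal event: without the option the
    # assert fails, which is the crash shown in the Situation section.
    if not mds_wipe_sessions:
        raise AssertionError("FAILED assert(g_conf->mds_wipe_sessions)")
    # With the option set: wipe the sessions and bump the table version.
    sessionmap.wipe()
    sessionmap.version = event_version
    return "wiped"


if __name__ == "__main__":
    table = SessionMap(version=0)
    try:
        replay_session_event(table, 223454462)
    except AssertionError as exc:
        print("without option:", exc)
    result = replay_session_event(table, 223454462, mds_wipe_sessions=True)
    print("with option:", result, "table version", table.version)
```

Once the table version has been bumped, the same event replays without the option, which matches the advice in the Resolution section to remove the setting again after a successful start.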
Status
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020284
- Creation Date: 11-Jun-2021
- Modified Date: 11-Jun-2021
- SUSE Enterprise Storage
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com