Non-Kerberos NFS mounts experience delays or permissions problems if Kerberos is configured for other reasons
This document (000020779) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 15
Situation
1. A Linux NFS client is attempting to mount a v4 share. The mount attempt may take 1 to 4 minutes to complete, or it might hang. If the attempt is part of boot, it may delay the boot significantly, and the slow mount attempt may abort rather than complete.
There may be many reasons for mount delays or hangs, but in this particular situation the NFS client machine has Kerberos configured for certain services, but not specifically for NFS use.
2. A Linux NFS client has mounted an NFS share successfully, but afterwards "permission denied" errors come up even when filesystem permissions should be adequate. For example, an "ls" command might show some permission denied errors, but still show a partial directory list. It might show some lines of output with a lot of question marks (????) present. This behavior could signify a lot of different problems, but one potential cause is confusion about whether Kerberos should be used.
Resolution
Kerberos usage (in one form or another) is becoming more common on modern systems. As a side effect, it is becoming more common for the NFS client layer to test using Kerberos, or to believe it should be used, even when that is not the administrator's intention. If the NFS client decides to use Kerberos, but full support for NFS hosts or NFS users is not really present in the Kerberos databases, malfunctions can occur.
1. The first thing to try is to explicitly set "sec=sys" as one of the NFS mount options. This might avoid the confusion about whether Kerberos should be used. The option can be placed in /etc/fstab or in any autofs maps that are in use. For a command line mount, it is specified as one of the -o options.
2. Very often, however, a bigger change than step 1 is needed. NFS can be forcibly blocked from querying Kerberos by masking the relevant services with these commands:
systemctl mask rpc-gssd.service
systemctl mask rpc-svcgssd.service
With these masked, subsequent reboots will not start those services, and the NFS components will not be able to try to use Kerberos.
In many cases, a reboot is needed at this point for the change to take full effect. To attempt the change without a reboot, stop the services in addition to masking them:
systemctl stop rpc-gssd
systemctl stop rpc-svcgssd
After they are both masked and stopped, try the mount again. If it still fails, a reboot is most likely necessary.
For reference, these services provide Kerberos access for RPC-based services that know how to use them. Typically, NFS is the only RPC service which does so.
3. It can also help to track down and correct any Kerberos problems. For example, in one case it was found that none of the KDCs known to the Linux NFS client machine were reachable over the network. This caused timeout delays while KDC connections were retried several times. For more details, see the "Cause" section.
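For step 1 above, the following is a minimal sketch of what pinning a mount to "sec=sys" might look like. The server name, export path, and mount point are hypothetical placeholders, and the helper functions fstab_entry and mount_command exist only for this illustration. The script prints the lines rather than changing the system.

```shell
#!/bin/sh
# Hypothetical values for illustration only; substitute your own.
NFS_SERVER="nfs.example.com"
NFS_EXPORT="/export/data"
MOUNT_POINT="/mnt/data"

# Print an /etc/fstab line that sets sec=sys for an NFSv4 mount.
fstab_entry() {
    printf '%s:%s %s nfs4 sec=sys 0 0\n' "$1" "$2" "$3"
}

# Print the equivalent one-off mount command (to be run as root).
mount_command() {
    printf 'mount -t nfs4 -o sec=sys %s:%s %s\n' "$1" "$2" "$3"
}

fstab_entry "$NFS_SERVER" "$NFS_EXPORT" "$MOUNT_POINT"
mount_command "$NFS_SERVER" "$NFS_EXPORT" "$MOUNT_POINT"
```

Once the share is mounted, the security flavor actually in effect appears as a "sec=" option on the share's line in /proc/mounts, which is one way to confirm that the setting was honored.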
Cause
Specifics:
In cases where Kerberos has not been set up for any reason, there will be no keytab file (/etc/krb5.keytab). The absence of that file will correctly cause rpc-gssd and rpc-svcgssd services to abort (fail to start). When Kerberos is not set up, this is the desired result. This prevents NFS from making attempts to use Kerberos.
If rpc-gssd or rpc-svcgssd have aborted because there is no /etc/krb5.keytab file, "systemctl status <service-name>" against those services will show a message such as:
Condition: start condition failed at Wed 2022-09-21 10:46:25 MDT; 14min ago
ConditionPathExists=/etc/krb5.keytab was not met
The above message is normal.
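The checks described above can be gathered into one short script. This is only a sketch: the function names keytab_state and gss_unit_state are made up for this example, and the script assumes systemctl is available on the machine. It reports whether the keytab exists and what systemd says about the two units. A masked unit is reported by "systemctl is-enabled" as "masked".

```shell
#!/bin/sh
# Report whether a keytab file exists. The path is a parameter so that
# a non-default location can be checked; it defaults to /etc/krb5.keytab.
keytab_state() {
    if [ -e "${1:-/etc/krb5.keytab}" ]; then
        echo "present"
    else
        echo "absent"
    fi
}

# Print the enablement state and run state of both GSS units.
# "unknown" is printed when systemctl gives no answer on stdout.
gss_unit_state() {
    for unit in rpc-gssd.service rpc-svcgssd.service; do
        enabled=$(systemctl is-enabled "$unit" 2>/dev/null)
        active=$(systemctl is-active "$unit" 2>/dev/null)
        echo "$unit: enabled=${enabled:-unknown} active=${active:-unknown}"
    done
}

echo "keytab: $(keytab_state)"
gss_unit_state
```

A result of "keytab: absent" together with inactive units matches the normal, non-Kerberos situation described above.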
However, if /etc/krb5.keytab does exist, these services will successfully start and NFS can make queries about NFS credentials. Once such queries are made, if positive results are found, the client will continue to try to use Kerberos, even though that may not be the administrator's intention. In contrast, if a negative answer comes back quickly (saying NFS credentials do not exist), NFS should quickly fall back to auth-sys (the "sec=sys" behavior).
There is a third possible outcome, however. If the queries are not answered at all, this means that Kerberos is malfunctioning, or its configuration is bad, or KDCs cannot be reached, etc. In those cases, a series of many independent 30-second timeouts and retries will occur, adding up to very significant delays. Possibly, a problem of indefinite duration ("hang") could occur.
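To see whether unreachable KDCs might be behind such timeouts, a rough check along the following lines may help. It is a sketch under several assumptions: KDCs are listed as "kdc = host[:port]" lines in /etc/krb5.conf (KDCs located through DNS or listed by IPv6 address will not be found), bash with /dev/tcp support and the coreutils "timeout" command are installed, and a TCP connection to port 88 is taken as a stand-in for KDC reachability. The function names are hypothetical.

```shell
#!/bin/sh
# List KDC host names found in "kdc = host[:port]" lines of a krb5.conf
# style file. Defaults to /etc/krb5.conf.
list_kdcs() {
    awk -F= '/^[ \t]*kdc[ \t]*=/ {
        gsub(/[ \t\r]/, "", $2)   # strip whitespace around the value
        sub(/:.*/, "", $2)        # drop an optional :port suffix
        print $2
    }' "${1:-/etc/krb5.conf}"
}

# Try a TCP connection to port 88 on one host, giving up after 5 seconds.
check_kdc() {
    if timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/88"' _ "$1" 2>/dev/null; then
        echo "$1: reachable"
    else
        echo "$1: NOT reachable"
    fi
}

# Probe every KDC listed in the given file (or in /etc/krb5.conf).
check_all_kdcs() {
    for kdc in $(list_kdcs "$1"); do
        check_kdc "$kdc"
    done
}
```

Calling check_all_kdcs with no argument probes the KDCs from /etc/krb5.conf. If every host is reported as not reachable, that is consistent with the repeated-timeout behavior described above, although a firewall that blocks only TCP could produce the same result.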
Additional Information
Be aware that the actual daemon names contain dots (rpc.gssd and rpc.svcgssd), whereas systemd unit names use a dot to separate the unit name from its type suffix (such as .service). That is why the systemd unit names replace the dots in the daemon names with hyphens. This results in service names such as rpc-gssd.service and rpc-svcgssd.service.
The service rpc-gssd is used by both NFS Servers and Clients.
The service rpc-svcgssd is used by NFS Servers.
The nfs-client package (built from nfs-utils source package) provides rpc.gssd and rpc.svcgssd. These services could technically allow other RPC-based services to use Kerberos if those other services were written to take advantage of them. However, this author does not know of any RPC service (other than NFS) which does so.
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020779
- Creation Date: 21-Sep-2022
- Modified Date: 07-Jun-2024
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com