SLES 12 (or higher) NFS client has slow performance against NetApp NFS server

This document (7023238) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15
SUSE Linux Enterprise Server 12

Situation

A SLES 12 or 15 machine acts as an NFS (Network File System) client to a NetApp NFS server. The SLES NFS client may have multiple mounts that point to the NetApp device, and it is generating a substantial I/O load on those mounts.
 
From time to time, work being done on the NFS mounts bogs down and throughput is very low.  Once this happens, it may continue for an extended period of time.
 
This particular problem was not seen with SLES 11, only while using SLES 12 or 15.

Resolution

See the "Cause" section for the reasons behind this solution.
 
Two settings are useful for limiting the number of outstanding RPC requests on a single NFS connection while still allowing high performance. Either one might be enough to relieve the symptoms, but understanding both helps in implementing a solid strategy.

Neither of these tunables takes effect for file systems that are already mounted. You must first umount all of the client's NFS mounts that point to a given NFS server. Only after ALL of them are unmounted can you mount them again and obtain the new behavior. Because this can be troublesome (a file system will not umount while it is considered "busy"), it is often easier, faster, and safer for applications to reboot the client after making these configuration changes.
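Because every mount that points to a given server has to be unmounted before the new settings apply, it can help to list the affected mount points first. The following is a minimal sketch, assuming the text passed in uses the /proc/mounts format; the function name and the sample host "netapp1" are hypothetical and used only for illustration.

```python
from collections import defaultdict


def nfs_mounts_by_server(mounts_text):
    """Group NFS mount points by server, given text in /proc/mounts format."""
    by_server = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, file system type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # For NFS the device field has the form "server:/export"
        server = fields[0].partition(":/")[0]
        by_server[server].append(fields[1])
    return dict(by_server)


if __name__ == "__main__":
    # Hypothetical sample data; on a real client, read /proc/mounts instead.
    sample = (
        "netapp1:/vol/a /mnt/a nfs4 rw,vers=4.1 0 0\n"
        "netapp1:/vol/b /mnt/b nfs rw,vers=3 0 0\n"
        "/dev/sda1 / ext4 rw 0 0\n"
    )
    for server, points in nfs_mounts_by_server(sample).items():
        print(server, "->", ", ".join(points))
```

On a live client, the same function could be called with the contents of /proc/mounts. Every mount point listed for the NetApp server would need to be unmounted before any of them is mounted again.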
 
1.  In /etc/sysctl.conf
 
sunrpc.tcp_max_slot_table_entries = 128
 
After modifying that file, the new value is not loaded into the running kernel until "sysctl -p" is executed. Even then, the value does not take effect for existing NFS mounts unless you reboot, or at least umount all NFS mounts and then mount them again. (umount them all first; do not umount and re-mount them one by one.)
 
This should be set when NetApp NFS Servers are involved. The value defaults to 65536, but NetApp devices may throttle a connection that carries more than 128 outstanding requests (see the "Cause" section). This setting controls how large the slot table can grow; the table does not start at the size specified here. In newer kernels (present in SLES 12 or higher), the slot table is auto-tuned and grows as needed, up to this maximum.
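To confirm that the persistent setting is actually present, the configuration file can be checked programmatically. The following is a minimal sketch, assuming simple "key = value" lines as shown above; it ignores other syntax that sysctl.conf may accept, and the function name is hypothetical.

```python
def configured_sysctl(conf_text, key):
    """Return the last value assigned to key in sysctl.conf-style text, or None."""
    value = None
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines
        if not line or line[0] in "#;":
            continue
        name, sep, val = line.partition("=")
        if sep and name.strip() == key:
            # Later assignments override earlier ones
            value = val.strip()
    return value


if __name__ == "__main__":
    # Sample text only; on a real client, read /etc/sysctl.conf instead.
    sample = "# NFS tuning\nsunrpc.tcp_max_slot_table_entries = 128\n"
    print(configured_sysctl(sample, "sunrpc.tcp_max_slot_table_entries"))
```

This only reports what the file says. It does not show whether "sysctl -p" has been run or whether existing mounts have picked up the value.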
 
2.  When multiple mounts on one NFS client point to the same NFS server, they typically share one TCP connection (and share other resources as well).  This means that one slot table may be servicing multiple NFS mounts.  It is possible to force mounts to use multiple connections (and therefore have access to multiple separate slot tables) with the client mount parameter:
 
nconnect=n
 
Where "n" is a number of connections to establish between this client and the target NFS Server.  The number can be from 1 to 16.
 
The first mount performed (for the combination of a particular NFS Server target and NFS Protocol version) will establish the "nconnect" value in use between those 2 devices.  Later mounts cannot change the number of connections between that client and server.  All such mounts would need to be umounted in order for a new nconnect setting to be specified and take effect.  The default (if no nconnect value is specified) is 1.
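As an illustration of how the option is passed, the following minimal sketch validates the documented range of 1 to 16 and builds the argument list for a mount command. The helper names, the server "netapp1", and the paths are hypothetical; the sketch only constructs the command and does not run it.

```python
def nconnect_option(n):
    """Return the mount option string for n connections (valid range 1 to 16)."""
    if not 1 <= n <= 16:
        raise ValueError("nconnect must be between 1 and 16")
    return f"nconnect={n}"


def mount_argv(source, mountpoint, n, extra_options=()):
    """Build the argument list for mounting source at mountpoint with nconnect=n."""
    options = ",".join([*extra_options, nconnect_option(n)])
    return ["mount", "-t", "nfs", "-o", options, source, mountpoint]


if __name__ == "__main__":
    # Hypothetical server and paths, for illustration only.
    print(" ".join(mount_argv("netapp1:/vol/data", "/mnt/data", 4)))
```

Because the first mount to a server fixes the connection count, the same nconnect value would normally be placed on every mount entry for that server so that the result does not depend on mount order.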

Using the above 2 methods together:
 
When multiple mounts are present, pointing to the same NetApp server, it is wise to consider using the above 2 methods together.  If the per-connection slot table is being limited to 128 in order to "protect" the NetApp device, then systems using multiple NFS mounts (which point to the same NetApp server) may need to use multiple TCP connections.  While under a low load, 1 (or a few) mounts might be able to perform well with one connection and 128 slots.  But one (or several) busy mounts may begin to suffer from this limitation, especially in cases where other bottlenecks are also affecting performance.  Therefore, a limit of 128 outstanding requests per connection is best combined with the use of multiple connections.
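The arithmetic behind combining the two settings can be sketched as follows. This is only an upper-bound calculation, not a performance model: the target number of outstanding requests is an estimate the administrator would have to supply, and the function names are hypothetical.

```python
import math


def max_outstanding_requests(nconnect=1, slots_per_connection=128):
    """Upper bound on outstanding RPC requests from one client to one server."""
    return nconnect * slots_per_connection


def connections_needed(target_outstanding, slots_per_connection=128,
                       max_connections=16):
    """Smallest nconnect whose combined slot tables cover the target.

    The result is capped at max_connections, the documented nconnect limit.
    """
    needed = max(1, math.ceil(target_outstanding / slots_per_connection))
    return min(needed, max_connections)


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, "connection(s):", max_outstanding_requests(n), "slots")
```

With the 128-slot limit, one connection allows at most 128 outstanding requests across all mounts that share it, while nconnect=4 raises the ceiling to 512 without exceeding 128 on any single connection.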

Cause

NetApp NFS Servers usually want to limit the number of outstanding RPC requests that can accumulate on one TCP connection.  If it goes beyond 128, the NetApp device may consider a potential denial-of-service attack to be occurring, and may throttle that TCP connection by setting its TCP receive window to 0.  This prevents further data from being accepted until the window is opened up again.
 
On Linux NFS Clients, the number of simultaneous outstanding RPC requests that can be issued on one TCP connection is controlled by the TCP slot table size, which is also known as "sunrpc.tcp_slot_table_entries".   In older Linux kernels, such as those in SLES 11, this table size defaulted to 16, and any change had to be made manually.  However, in newer kernels (such as in SLES 12 or 15), the size of the table is auto-tuned and grows as needed (up to the value of "sunrpc.tcp_max_slot_table_entries").  It is very common for the table size to grow beyond 128, especially if multiple mounts at a client point to the same NFS Server.  Therefore, it has become common for NetApp servers to throttle Linux client connections.
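To see which limits the running kernel is currently using, the two sysctl values named above can be read directly. The following is a minimal sketch, assuming the values are exposed as files under /proc/sys/sunrpc (which is normally the case once the sunrpc module is loaded); the function name is hypothetical.

```python
from pathlib import Path


def read_slot_table_limits(base="/proc/sys/sunrpc"):
    """Return the live slot table sysctl values, or None where a file is missing."""
    names = ("tcp_slot_table_entries", "tcp_max_slot_table_entries")
    limits = {}
    for name in names:
        path = Path(base) / name
        # The file is absent if the sunrpc module has not been loaded
        limits[name] = int(path.read_text().strip()) if path.exists() else None
    return limits


if __name__ == "__main__":
    for name, value in read_slot_table_limits().items():
        print(f"sunrpc.{name} = {value}")
```

A maximum that still shows the default rather than 128 suggests that the configuration change has not been loaded into the running kernel.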

Additional Information

Over the history of SLES releases, there have been different options to control connection sharing.  nconnect= is only one of them, but it is the one which will carry forward.  Some are only available in certain distributions.  Some are meant for certain versions of the NFS protocol.  Some are being phased out.  For a full discussion of the options available, see:
https://www.suse.com/support/kb/doc/?id=000019933

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7023238
  • Creation Date: 01-Aug-2018
  • Modified Date: 14-Nov-2024
    • SUSE Linux Enterprise Server
