SUSE Support

Server is in hung state, caused by Trellix anti-virus software

This document (000021639) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15 SP5
SUSE Linux Enterprise Server for SAP 15 SP5
McAfee / Trellix Endpoint Security for Linux Threat Prevention 10.7.16.27

Situation

The server hangs. All access, including console access, is unavailable, and every active session on the server is hung. After a reboot, the server works normally for a short time and then hangs again.

Resolution

Trellix (the AV vendor) provided the following recommendations:

Add all Linux-specific default exclusions. The default exclusions are:

  arc, ctl, dbf, dbl, dtx, frm, jar, log, myd, myi, rdo, vmdk, war

Add these default exclusions to all profiles (standard, highrisk and lowrisk).

Register the following processes as low-risk processes and set them not to be scanned:

  /usr/sap/hostctrl/exe/sapstartsrv
  /usr/sap/ZCP/D05/exe/sapstartsrv

Enable deferred scan with a CPU limit of 50 to avoid hangs caused by the fanotify response.
To enable deferred scan:   


To set the CPU limit:


Additional information for Trellix customers:


A related TID for cross-referencing:

Cause

Crash analysis:

CPUS: 28
LOAD AVERAGE: 81.97, 53.42, 24.47

No CPU is actively doing work: all 28 CPUs are halted in the swapper task. (swapper is the idle task. It should not be confused with kswapd, which handles swapping.)

crash> bt -a | grep 'exception RIP' | sort | uniq -c | sort -nr
     28     [exception RIP: native_safe_halt+11]

crash> bt -a | grep 'COMMAND' | awk '{print $NF}' | awk -F\/ '{print $1}' | sort | uniq -c | sort -nr
     28 "swapper
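
Idle CPUs combined with a load average far above the CPU count are consistent with how Linux computes load: tasks in uninterruptible sleep are counted along with runnable ones. As a rough illustration, the 1-minute load per CPU can be worked out as below. The helper name load_per_cpu is hypothetical and not part of the original analysis.

```shell
# Hypothetical helper: divide a load average by the number of CPUs.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.1f\n", load / cpus }'
}

# Values taken from the crash output above (LOAD AVERAGE and CPUS).
load_per_cpu 81.97 28
```

On a live system, the same inputs are available from /proc/loadavg and nproc.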

Many of the processes are stuck in uninterruptible sleep (UN state):

crash> ps -S
  RU: 28
  UN: 85
  IN: 1375
  ID: 398

All 85 UN processes are stuck behind a fanotify/fsnotify event, waiting for a userspace process to respond and release them:

crash> for UN bt | grep '#2 ' | awk '{if ($NF ~ /\[.*\]/) print $3" "$NF; else print $3;}' | sort | uniq -c | sort -nr
     85 fanotify_handle_event

crash> for UN bt | grep '#3 ' | awk '{if ($NF ~ /\[.*\]/) print $3" "$NF; else print $3;}' | sort | uniq -c | sort -nr
     85 fsnotify

crash> for UN bt | grep '#4 ' | awk '{if ($NF ~ /\[.*\]/) print $3" "$NF; else print $3;}' | sort | uniq -c | sort -nr
     85 __fsnotify_parent
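
If the system is still partly responsive, a similar summary can sometimes be gathered without a crash dump. The following is a minimal sketch, not part of the original analysis: it assumes procps-ng ps, and the function name count_blocked_by_wchan is hypothetical. It counts tasks in uninterruptible sleep (state D) per kernel wait channel. Whether the wait channel resolves to a symbol such as fanotify_handle_event depends on the kernel and its configuration.

```shell
# Count uninterruptible (D state) tasks per kernel wait channel.
# Expects "PID STAT WCHAN COMMAND" lines on stdin.
count_blocked_by_wchan() {
    awk '$2 ~ /^D/ { count[$3]++ }
         END { for (w in count) print count[w], w }' | sort -nr
}

# Live usage (assumes procps-ng ps):
#   ps -eo pid,stat,wchan:32,comm --no-headers | count_blocked_by_wchan
```

A large count against a single fanotify-related wait channel would match the pattern seen in the crash dump above.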

The fsnotify notification list that is blocking these processes is owned by an OAS process. Walking the list returns 86 entries:

crash> for UN files | grep -Ei "command|fanotify" | grep -B1 "notify"
PID: 10137    TASK: ffff8af4bcdbc000  CPU: 27   COMMAND: "OAS Res Br<-Mgr"

crash> struct -x file.private_data ffff8af386c27400
  private_data = 0xffff8af2ca599a00,

crash> struct -x fsnotify_group.notification_list 0xffff8af2ca599a00
  notification_list = {
    next = 0xffff8af2ca6bcb88,
    prev = 0xffff8af2ce29f948
  },

crash> list 0xffff8af2ce29f948 | wc -l
86
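
On a running system, the processes that hold a fanotify group can be identified from /proc, because fanotify file descriptors appear as links to anon_inode:[fanotify]. The following is a minimal sketch under that assumption. The function name list_fanotify_owners is hypothetical, and root privileges are normally needed to read the file descriptors of other users' processes.

```shell
# Print "PID COMMAND" for every process that has a fanotify file descriptor open.
# The optional argument overrides the proc mount point (default: /proc).
list_fanotify_owners() {
    proc_root=${1:-/proc}
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # Skip descriptors that vanished or cannot be read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        if [ "$target" = "anon_inode:[fanotify]" ]; then
            pid=${fd#"$proc_root"/}   # strip the proc prefix
            pid=${pid%%/*}            # keep only the PID component
            comm=$(cat "$proc_root/$pid/comm" 2>/dev/null)
            printf '%s %s\n' "$pid" "$comm"
        fi
    done | sort -u
}

# Live usage (as root):
#   list_fanotify_owners
```

In the case described here, such a listing would be expected to include the OAS process shown in the crash output.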

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021639
  • Creation Date: 09-Dec-2024
  • Modified Date: 10-Dec-2024
    • SUSE Linux Enterprise Server
    • SUSE Linux Enterprise Server for SAP Applications
