Salt SSH clients are reported as "not checking in" in SUSE Manager Server webUI

This document (000021289) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Manager 4.3 Server

Situation

Multiple systems in a DMZ are registered against the SUSE Manager Server and connected using the Salt SSH contact method. The clients are reported as "not checking in" within the SUSE Manager Server webUI.

However, all of those clients can be operated correctly: every action executed against them from the SUSE Manager Server finishes successfully. After the taskomatic service is restarted, the clients are checked properly for 1-2 days and are then reported as "not checking in" again.

Resolution

Killing the "frozen" Salt SSH process (the hung "scp" process described under Cause) returns the system to normal, and the clients are again checked properly within SUSE Manager.

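As an illustration of how a hung process could be located before it is killed, the following Python sketch filters a process listing for long-running preflight copies. It is a minimal sketch, not a SUSE-provided tool: the helper names, the one-hour age threshold, the second PID, and the elapsed times in the sample listing are assumptions made for this example, and it relies on the `etimes` (elapsed seconds) column of procps `ps`.

```python
import subprocess

def list_processes():
    """Return `ps` output with PID, owner, elapsed seconds, and command line."""
    result = subprocess.run(
        ["ps", "-eo", "pid,user,etimes,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def find_stuck_scp(ps_output, min_age_seconds=3600, user="salt"):
    """Return (pid, age_seconds, command) for old preflight scp processes."""
    stuck = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)  # pid, user, etimes, args
        if len(fields) < 4 or not (fields[0].isdigit() and fields[2].isdigit()):
            continue  # header line or unexpected format
        pid, owner, age, command = int(fields[0]), fields[1], int(fields[2]), fields[3]
        is_preflight_copy = command.startswith("scp ") and "preflight.sh" in command
        if owner == user and is_preflight_copy and age >= min_age_seconds:
            stuck.append((pid, age, command))
    return stuck

# Illustrative listing only: the elapsed times and the second PID are made up.
SAMPLE_PS = """\
  PID USER     ELAPSED COMMAND
26828 salt      172800 scp -o ConnectTimeout=360 /usr/share/susemanager/salt-ssh/preflight.sh u99143.ptb.de:/tmp/preflight.sh
26901 salt          12 scp -o ConnectTimeout=360 /usr/share/susemanager/salt-ssh/preflight.sh u99143.ptb.de:/tmp/preflight.sh
"""

if __name__ == "__main__":
    for pid, age, command in find_stuck_scp(SAMPLE_PS):
        print(pid, age, command)
```

On a live server, `find_stuck_scp(list_processes())` would list the candidates. Each PID should be reviewed before it is terminated with `kill`, because a copy that has only just started is not necessarily frozen.
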
Cause

The behavior is caused by a bug in the Salt SSH contact method. It calls "scp" with "-o ConnectTimeout=360", but that option only limits the time allowed for establishing the connection. If the command freezes after the connection has been established, it never times out. SUSE Manager uses a maximum of 4 SSH connections in parallel, so a frozen connection also blocks the other clients/minions. Such a "frozen" connection can be seen in the "basic-health-check.txt" file within the supportconfig output:
 
 salt     26828 26827  0.0  0.0  11024  4164 Ss+  00:00:00 scp -o KbdInteractiveAuthentication=no -o PasswordAuthentication=no -o GSSAPIAuthentication=no -o ConnectTimeout=360 -o Port=22 -o IdentityFile=/srv/susemanager/salt/salt_ssh/mgr_ssh_id -o User=root /usr/share/susemanager/salt-ssh/preflight.sh u99143.ptb.de:/tmp/preflight.sh
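
The difference between a connection timeout and a limit on the whole command can be illustrated with a short Python sketch. It is not the fix implemented in Salt or SUSE Manager; `run_with_deadline` is a hypothetical helper that bounds the total runtime of a child process, which "ConnectTimeout" alone does not do.

```python
import subprocess
import sys

def run_with_deadline(argv, deadline_seconds):
    """Run argv; return its exit code, or None if it exceeded the deadline."""
    try:
        # subprocess.run kills the child when the timeout expires, then raises.
        return subprocess.run(argv, timeout=deadline_seconds).returncode
    except subprocess.TimeoutExpired:
        return None

if __name__ == "__main__":
    # Stand-in for a command that hangs after it has started successfully.
    hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_deadline(hanging, deadline_seconds=1))  # prints None
```

A command that finishes in time returns its normal exit code, while one that hangs is terminated once the deadline passes, so it cannot occupy a worker slot indefinitely.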

Additional Information

The following tuning options were tested on the affected SUSE Manager Server and had no impact on the reported behavior:

 salt_ssh_connect_timeout = 360
 java.salt_presence_ping_timeout = 40
 java.taskomatic_channel_repodata_workers = 5
 taskomatic.java.maxmemory = 8192
 taskomatic.ssh_push_workers = 50

The following warning messages appear in the taskomatic logs:

 2023-07-20 09:13:00,141 [DefaultQuartzScheduler_Worker-12] WARN  com.redhat.rhn.taskomatic.task.SSHPush - Maximum number of workers already put ... skipping.
 2023-07-20 09:14:00,121 [DefaultQuartzScheduler_Worker-17] WARN  com.redhat.rhn.taskomatic.task.SSHPush - Maximum number of workers already put ... skipping.
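
To judge how often the SSHPush job is being skipped, the warnings above can be counted per hour. The following Python sketch is one possible way to do that; the function name is hypothetical, and reading the lines from a file such as /var/log/rhn/rhn_taskomatic_daemon.log is an assumption about where the taskomatic log is stored on the server.

```python
import re
from collections import Counter

# Captures "YYYY-MM-DD HH" from the timestamp of each SSHPush skip warning.
SKIP_WARNING = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2},\d{3} .*"
    r"SSHPush - Maximum number of workers already put"
)

def skipped_runs_per_hour(lines):
    """Count SSHPush 'maximum number of workers' warnings per hour."""
    counts = Counter()
    for line in lines:
        match = SKIP_WARNING.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The two warnings quoted in this document.
SAMPLE_LOG = [
    "2023-07-20 09:13:00,141 [DefaultQuartzScheduler_Worker-12] WARN  "
    "com.redhat.rhn.taskomatic.task.SSHPush - Maximum number of workers already put ... skipping.",
    "2023-07-20 09:14:00,121 [DefaultQuartzScheduler_Worker-17] WARN  "
    "com.redhat.rhn.taskomatic.task.SSHPush - Maximum number of workers already put ... skipping.",
]

if __name__ == "__main__":
    for hour, count in sorted(skipped_runs_per_hour(SAMPLE_LOG).items()):
        print(hour, count)  # prints: 2023-07-20 09 2
```

A warning that repeats every minute over a long period would be consistent with all worker slots being held by frozen connections.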

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021289
  • Creation Date: 11-Dec-2023
  • Modified Date: 04-Jan-2024
    • SUSE Manager Server
    • SUSE Manager