Address space monitoring and HANA DB performance
This document (000020746) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server for SAP 15 and All Service Packs
SUSE Linux Enterprise Server 12 and All Service Packs
SUSE Linux Enterprise Server for SAP 12 and All Service Packs
Situation
The system was monitored using atop with the -R option.
Queries take an unexpectedly long time in some cases, while performance is otherwise as expected.
One or more proc files were read to monitor the system, including but not limited to: /proc/<pid>/smaps, /proc/<pid>/smaps_rollup, /proc/<pid>/numa_maps, /proc/<pid>/pagemap.
Running strace -Ttt -e trace=memory shows memory-related syscalls with a runtime of more than 1 second:
148539 15:08:17.077800 <... get_mempolicy resumed>[0x6 /* MPOL_??? */], NULL, 0, 0x7bbfa7a13ec0, MPOL_F_NODE|MPOL_F_ADDR) = 0 <36.665900>
147728 15:08:17.077878 <... get_mempolicy resumed>[MPOL_PREFERRED], NULL, 0, 0x7f1b081199a0, MPOL_F_NODE|MPOL_F_ADDR) = 0 <33.571746>
117292 15:08:17.077904 <... get_mempolicy resumed>[MPOL_PREFERRED], NULL, 0, 0x7c241df05360, MPOL_F_NODE|MPOL_F_ADDR) = 0 <33.241993>
116802 15:08:17.077925 <... get_mempolicy resumed>[MPOL_PREFERRED], NULL, 0, 0x7d4caee8a9e0, MPOL_F_NODE|MPOL_F_ADDR) = 0 <35.166693>
143937 15:08:17.077944 <... get_mempolicy resumed>[MPOL_PREFERRED], NULL, 0, 0x7c226d046520, MPOL_F_NODE|MPOL_F_ADDR) = 0 <28.796564>
117837 15:08:17.077966 <... get_mempolicy resumed>[MPOL_INTERLEAVE], NULL, 0, 0x7ee209ceb800, MPOL_F_NODE|MPOL_F_ADDR) = 0 <28.983857>
88575 15:08:17.077989 <... get_mempolicy resumed>[MPOL_INTERLEAVE], NULL, 0, 0x7c8e8f486000, MPOL_F_NODE|MPOL_F_ADDR) = 0 <26.095425>
114038 15:08:17.078015 <... get_mempolicy resumed>[0x6 /* MPOL_??? */], NULL, 0, 0x7bbf93add800, MPOL_F_NODE|MPOL_F_ADDR) = 0 <24.056954>
20426 15:08:17.078042 <... get_mempolicy resumed>[0x7 /* MPOL_??? */], NULL, 0, 0x7d0a2c5181e0, MPOL_F_NODE|MPOL_F_ADDR) = 0 <32.503773>
149616 15:08:17.078066 <... get_mempolicy resumed>[0x7 /* MPOL_??? */], NULL, 0, 0x7c082d905400, MPOL_F_NODE|MPOL_F_ADDR) = 0 <13.463001>
148542 15:08:17.078090 <... get_mempolicy resumed>[0x6 /* MPOL_??? */], NULL, 0, 0x7c03beff2400, MPOL_F_NODE|MPOL_F_ADDR) = 0 <8.519618>
132662 15:08:17.078114 <... mmap resumed>) = 0x7f44c888e000 <40.743437>
The user of the respective /proc file is blocked while reading it, for example:
PID: 278515 TASK: ffff920d630f0000 CPU: 252 COMMAND: "atop"
 #5 [ffffac424b5cbbe8] smaps_account at ffffffffb017daee
 #6 [ffffac424b5cbc18] smaps_pte_range at ffffffffb017ee30
 #7 [ffffac424b5cbc60] __walk_page_range at ffffffffb0088cb1
 #8 [ffffac424b5cbd40] walk_page_vma at ffffffffb00894c3
 #9 [ffffac424b5cbd80] show_smaps_rollup at ffffffffb017df6d
#10 [ffffac424b5cbe68] seq_read at ffffffffb011f458
#11 [ffffac424b5cbec8] vfs_read at ffffffffb00f5069
#12 [ffffac424b5cbef8] ksys_read at ffffffffb00f53f5
#13 [ffffac424b5cbf38] do_syscall_64 at ffffffffafe0542b
#14 [ffffac424b5cbf50] entry_SYSCALL_64_after_hwframe at ffffffffb08000a9
    RIP: 00007f20f0b35a5e RSP: 00007ffda052e6d8 RFLAGS: 00000246
    RAX: ffffffffffffffda RBX: 0000000000da12a0 RCX: 00007f20f0b35a5e
    RDX: 0000000000000400 RSI: 0000000000da1480 RDI: 0000000000000007
    RBP: 00007f20f0e1f800 R8: 0000000000da1480 R9: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000a
    R13: 0000000000000d68 R14: 00007f20f0e1ec00 R15: 0000000000000d68
    ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
HANA-related processes or threads (e.g. hdbindexserver or job workers) are blocked on a read-write semaphore. This can be checked via /proc/<pid of hana process>/task/<thread_id>/stack or in the output of echo t > /proc/sysrq-trigger.
Examples:
Worker blocked on address space modification operation
PID: 313590 TASK: ffff94c0f41ac000 CPU: 18 COMMAND: "JobWrk49426"
 #0 [ffffac424d95bd58] __schedule at ffffffffb0753a1f
 #1 [ffffac424d95bde8] schedule at ffffffffb0753eaf
 #2 [ffffac424d95bdf8] rwsem_down_write_slowpath at ffffffffaff00f65
 #3 [ffffac424d95beb8] down_write_killable at ffffffffb0756923
 #4 [ffffac424d95bec8] down_write_killable at ffffffffb0756923
 #5 [ffffac424d95bed8] __vm_munmap at ffffffffb00810ff
 #6 [ffffac424d95bf20] __x64_sys_munmap at ffffffffb00811a7
 #7 [ffffac424d95bf38] do_syscall_64 at ffffffffafe0542b
 #8 [ffffac424d95bf50] entry_SYSCALL_64_after_hwframe at ffffffffb08000a9
Worker blocked on the page fault
PID: 326143 TASK: ffff920dd6f34000 CPU: 414 COMMAND: "JobWrk50240"
 #0 [ffffac424643fd78] __schedule at ffffffffb0753a1f
 #1 [ffffac424643fe08] schedule at ffffffffb0753eaf
 #2 [ffffac424643fe18] rwsem_down_read_slowpath at ffffffffb0756ab2
 #3 [ffffac424643fec0] __do_page_fault at ffffffffafe873d9
 #4 [ffffac424643ff20] do_page_fault at ffffffffafe87570
 #5 [ffffac424643ff50] page_fault at ffffffffb080135e
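The per-thread stack check mentioned in this section can be scripted. The following is a minimal Python sketch and not part of the original article. It assumes the kernel exposes /proc/<pid>/task/<thread_id>/stack and that the script runs with sufficient privileges (typically root) to read it. The function names rwsem_frame and blocked_threads and the proc_root parameter are hypothetical and chosen for illustration only. The two frame names it searches for are taken from the backtraces in this section.

```python
import glob
import os
import sys

# Kernel functions that appear in the backtraces in this section when a
# thread is waiting on a read-write semaphore.
RWSEM_FRAMES = ("rwsem_down_read_slowpath", "rwsem_down_write_slowpath")


def rwsem_frame(stack_text):
    """Return the first rwsem slow-path frame found in a stack dump, else None."""
    for line in stack_text.splitlines():
        for name in RWSEM_FRAMES:
            if name in line:
                return name
    return None


def blocked_threads(pid, proc_root="/proc"):
    """Yield (thread_id, frame) for each thread of pid waiting on an rwsem.

    proc_root is a hypothetical parameter so the function can be pointed at
    a copy of the proc tree for testing.
    """
    pattern = os.path.join(proc_root, str(pid), "task", "*", "stack")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as fh:
                text = fh.read()
        except OSError:
            continue  # thread exited meanwhile, or insufficient privileges
        frame = rwsem_frame(text)
        if frame:
            thread_id = os.path.basename(os.path.dirname(path))
            yield thread_id, frame


# Usage: python3 this_script.py <pid of hana process>
if __name__ == "__main__" and len(sys.argv) == 2:
    for thread_id, frame in blocked_threads(sys.argv[1]):
        print(f"thread {thread_id} waiting in {frame}")
```

A single run is only a snapshot. Threads that show up repeatedly across several runs while a monitoring tool is reading one of the proc files listed above would match the pattern described in this section.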
Resolution
Cause
Status
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020746
- Creation Date: 02-Sep-2022
- Modified Date: 02-Sep-2022
- SUSE Linux Enterprise Server
- SUSE Linux Enterprise Server for SAP Applications
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com