What is a kernel thread?
The title suggests that there is a single concept called a kernel thread. The reality is more complex than a split into kernel threads and everything else. The practical need to tell them apart arose when the systemd service manager needed to terminate all processes except for kernel threads.
Ideally, the kernel threads would stay in the background doing their jobs and no users would have to pay extra attention to them. Why should you care if you are not writing your own service manager?
- Sometimes you might be tempted to kill them (which is challenging), and then you need to know how to distinguish them from non-kernel threads.
- In other cases you might wonder what is consuming all your CPU cycles.
- Additionally, it’s also a mental exercise to contemplate the boundaries in the Linux kernel (which evolved organically).
Recognizing various flavors of kernel thread
A scholarly definition says that a kernel thread spends all its runtime in kernel mode. This provides some grounding, but it is not very practical. Next, we will enumerate several different methods for detecting kernel threads.
Fortunately, Linux marks some threads with the flag PF_KTHREAD (0x00200000). Userspace is not a typical consumer of this information, so retrieving it is slightly cumbersome, for instance (and() is a gawk extension):
awk '{print and($9, 0x200000)}' /proc/$pid/stat # 0 -- no flag, >0 -- PF_KTHREAD
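The one-liner above splits the line on whitespace, so a task name (the second field, in parentheses) that contains spaces shifts the field numbers. A minimal sketch of a more defensive check, assuming a POSIX shell with sed and awk; the helper names stat_flags, has_pf_kthread and is_kthread_by_flag are made up for illustration:

```shell
# Hypothetical helper: print the flags field of a /proc/<pid>/stat line read from stdin.
stat_flags() {
    # Drop everything up to the last ") " so that spaces or parentheses
    # inside the task name cannot shift the field numbers.
    # After the cut, the fields are: state ppid pgrp session tty_nr tpgid flags ...
    sed 's/^.*) //' | awk '{print $7}'
}

# Hypothetical helper: succeed if the given numeric flags value has PF_KTHREAD set.
has_pf_kthread() {
    [ $(( $1 & 0x00200000 )) -ne 0 ]
}

# Usage: is_kthread_by_flag PID
# Returns 0 for PF_KTHREAD, 1 otherwise, 2 if the stat file cannot be read.
is_kthread_by_flag() {
    flags=$(stat_flags < "/proc/$1/stat") || return 2
    [ -n "$flags" ] || return 2
    has_pf_kthread "$flags"
}
```

For example, `is_kthread_by_flag 2 && echo kthread` should report kthreadd as a kernel thread on a typical system.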
Or we can exploit the fact that kernel threads do not have their own (user) memory address space and check the size of their code in memory:
awk '{print $4}' /proc/$pid/statm # 0 -- no (user) code, >0 -- has (user) code
Another method relies on the fact that Linux has a conventional internal API for kthread creation, and all such kernel threads are forked from the task with PID 2 (kthreadd). This is the category into which the so-called kworkers fall; they are a subspecies of kernel threads responsible for workqueues. Extending this, any process with PID 2 in its forking ancestry is a kind of kernel thread.
awk '{print $4}' /proc/$pid/stat # check parent PID
# 2 -- kthread (1st level), >2 -- check PID recursively, 0 -- kthreadd or init
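The recursive check can be written as a small loop. This is only a sketch under a few assumptions: PROC_ROOT is a hypothetical variable (defaulting to /proc) that exists solely so the function can be tried against a fake directory tree, and the helper names are made up. Keep in mind that a process whose parent has exited is reparented (typically to PID 1), so the walk only sees the ancestry that still exists.

```shell
# PROC_ROOT is a hypothetical override for experiments; it defaults to the real /proc.
: "${PROC_ROOT:=/proc}"

# Print the parent PID of the given PID (4th field of stat).
# The cut at the last ") " keeps task names with spaces from shifting fields.
parent_pid() {
    sed 's/^.*) //' "$PROC_ROOT/$1/stat" 2>/dev/null | awk '{print $2}'
}

# Succeed if PID 2 (kthreadd) appears among the ancestors of the given PID.
# PID 2 itself, PID 1 and vanished processes yield a non-zero status.
has_kthreadd_ancestor() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 2 ]; do
        pid=$(parent_pid "$pid")
        [ "$pid" = 2 ] && return 0
    done
    return 1
}
```

Usage: `has_kthreadd_ancestor "$pid" && echo "descends from kthreadd"`.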
What about PID 2 itself? It is a child of PID 0. We could construct another definition: a kernel thread is any task whose top ancestor is PID 0, except for tasks originating from PID 1 (as init spawns all of user space).
Let’s nurture the concept of ancestry. Another thread trait mirrors ancestry: threads inherit cgroup membership upon fork, so all descendants of a given task remain in the same cgroup. (Unless explicitly migrated, of course.) Kernel threads would then be exactly those threads residing in the root cgroup. (Let’s assume PID 1 migrates itself explicitly soon after boot. This holds in the case of systemd.)
cat /proc/$pid/cgroup
# 0::/ -- kernel thread
# 0::/system.slice/sshd.service -- user space process (example)
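To use the cgroup check in a script rather than by eye, the output can be matched against the root path. A minimal sketch, assuming the unified (v2) hierarchy, whose entry starts with "0::"; the function name in_root_cgroup is made up, and any cgroup v1 lines are simply ignored:

```shell
# Succeed if the cgroup v2 entry ("0::<path>") read from stdin is the root cgroup "/".
# Lines belonging to v1 hierarchies (non-zero hierarchy ID) are skipped.
in_root_cgroup() {
    awk -F: '$1 == "0" { found = (NF == 3 && $3 == "/") } END { exit !found }'
}
```

Usage: `in_root_cgroup < /proc/$pid/cgroup && echo "root cgroup"`.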
Kernel thread with userspace origin
Have we covered every case in the Linux world with the methods above? No. Linux can create a task that appears user-like under these methods but still satisfies the initial generic definition of never running in user space. (Contemporary instances are PF_IO_WORKER threads facilitating in-kernel io_uring submission queue polling.)
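Such threads can be looked for with the same flag-testing approach. The sketch below rests on assumptions that should be verified against your kernel: that PF_IO_WORKER has the value 0x00000010 (as in include/linux/sched.h of recent kernels) and that these threads show up as tasks under /proc/<pid>/task/ of the process that uses io_uring. PROC_ROOT and the function names are hypothetical.

```shell
# Assumption: PF_IO_WORKER == 0x00000010; check include/linux/sched.h for your kernel.
has_pf_io_worker() {
    [ $(( $1 & 0x00000010 )) -ne 0 ]
}

# Print the /proc task directory of every thread whose flags include PF_IO_WORKER.
# PROC_ROOT is a hypothetical override for experiments; it defaults to /proc.
list_io_workers() {
    for f in "${PROC_ROOT:-/proc}"/[0-9]*/task/[0-9]*/stat; do
        # Cut at the last ") " so that task names with spaces do not shift fields;
        # flags is then the 7th remaining field.
        flags=$(sed 's/^.*) //' "$f" 2>/dev/null | awk '{print $7}')
        [ -n "$flags" ] && has_pf_io_worker "$flags" && echo "${f%/stat}"
    done
    return 0
}
```

Running `list_io_workers` on a system with no io_uring users is expected to print nothing.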
Sending signals
Our original interest was process termination (with signals). Signals are delivered when the target task transitions from kernel to user mode. This implies that kernel threads (per the scholarly definition) only terminate of their own volition since they lack the point for traditional signal delivery.
How does this square with Linux reality? For kernel threads that never cross the kernel-user mode boundary, signals are rather just messages (well, signals) whose delivery is based on polling, and each thread decides individually whether and how to respond to such a message (a concise example is the lockd kthread in fs/lockd/svc.c). A full explanation of signal delivery and handling in the various receivers would require more screen space than this single post.
The ancestry-based definitions of kernel threads allow for a kernel thread that execs a program and gains a user (address) space while keeping its ancestry. Such processes go by the (infamous) name of user mode helpers. The helpers can be killed (asynchronously, at a kernel-user transition) like regular processes. This can easily lead to trouble because of the kernel ancestry: user space cannot wait for them, whereas the kernel must (and must handle it properly).
This short article sums up various practical characteristics of Linux tasks that help to categorize them into kernel and non-kernel threads (not to be confused with M:N thread scheduling). It illustrates that the distinction is not sharp and provides some quick, and therefore imprecise, checks that should be helpful when dealing with kernel threads on Linux.