Roman Gushchin, a member of Facebook’s Linux kernel engineering team, has identified a “serious flaw” in the way the current slab memory controller in the Linux kernel works. He says (via The New Stack) that the existing design leads to low slab utilization because each slab page is used exclusively by one memory cgroup (control group).
For those unfamiliar, slab allocation in the Linux kernel is a memory management mechanism that allocates memory for kernel objects. It works by building slab caches, one per kind of object. A slab cache is a linked list of slabs, with each slab holding an array of objects. A cgroup, or control group, is a Linux kernel feature that organizes processes hierarchically so that their resource usage can be limited and accounted for.
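To make the data structure concrete, here is a minimal toy model of a slab cache in Python. It is only an illustrative sketch, not the kernel's implementation: the class names `Slab` and `SlabCache`, the `utilization` helper, and the tiny slab size are all assumptions made for this example.

```python
from typing import List, Optional


class Slab:
    """One slab: a fixed-size array of object slots (None marks a free slot)."""

    def __init__(self, capacity: int):
        self.slots: List[Optional[str]] = [None] * capacity

    def used(self) -> int:
        # Count the slots that currently hold an object.
        return sum(slot is not None for slot in self.slots)

    def try_alloc(self, obj: str) -> bool:
        # Place the object in the first free slot, if there is one.
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = obj
                return True
        return False


class SlabCache:
    """A list of slabs that all hold objects of the same kind."""

    def __init__(self, objects_per_slab: int):
        self.objects_per_slab = objects_per_slab
        self.slabs: List[Slab] = []

    def alloc(self, obj: str) -> None:
        # Reuse free space in an existing slab before creating a new one.
        for slab in self.slabs:
            if slab.try_alloc(obj):
                return
        slab = Slab(self.objects_per_slab)
        slab.try_alloc(obj)
        self.slabs.append(slab)

    def utilization(self) -> float:
        # Fraction of all slots in the cache that are in use.
        total = len(self.slabs) * self.objects_per_slab
        if total == 0:
            return 0.0
        return sum(s.used() for s in self.slabs) / total


cache = SlabCache(objects_per_slab=4)
for i in range(5):
    cache.alloc(f"obj{i}")
print(len(cache.slabs), cache.utilization())
```

With four slots per slab, the fifth object forces a second slab that is mostly empty. That unused space is what "slab utilization" measures.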
What is the issue?
According to Gushchin, “If there are only few allocations of certain size made by a cgroup, or if some active objects (e.g. dentries) are left after the cgroup is deleted, or the cgroup contains a single-threaded application which is barely allocating any kernel objects, but does it every time on a new CPU: in all these cases the resulting slab utilization is very low.”
He adds that if kmem accounting (kernel memory accounting) is disabled, the kernel can use the free space on slab pages for other allocations.
Gushchin says that the kmem controller was initially introduced as an optional feature that had to be explicitly turned on for each memory cgroup. The feature is now turned on by default, so the poor slab utilization affects far more systems.
The new memory controller proposed by Gushchin
The new slab controller proposed by Gushchin improves memory utilization by sharing slab pages between memory cgroups. He adds that in the new design, accounting is performed per object instead of per page.
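The effect of sharing pages can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the patch set: the function names, the allocation counts, the object size, and the figure of 32 objects per page are all hypothetical, chosen only to show why private pages waste space when each cgroup allocates few objects.

```python
import math
from typing import Dict, List


def pages_private(allocs: List[int], objects_per_page: int) -> int:
    """Pages needed when every cgroup gets its own slab pages."""
    return sum(math.ceil(n / objects_per_page) for n in allocs if n > 0)


def pages_shared(allocs: List[int], objects_per_page: int) -> int:
    """Pages needed when all cgroups share the same slab pages."""
    return math.ceil(sum(allocs) / objects_per_page)


def utilization(allocs: List[int], pages: int, objects_per_page: int) -> float:
    """Fraction of object slots on the allocated pages that are in use."""
    return sum(allocs) / (pages * objects_per_page)


def charge_per_object(allocs: List[int], object_size: int) -> Dict[int, int]:
    """Per-object accounting: each cgroup is charged only for its own objects."""
    return {cgroup: n * object_size for cgroup, n in enumerate(allocs)}


# Hypothetical workload: five cgroups, each allocating only a few objects.
allocs = [3, 1, 2, 5, 1]
objects_per_page = 32  # assumed number of objects that fit on one slab page
object_size = 128      # assumed object size in bytes

private = pages_private(allocs, objects_per_page)
shared = pages_shared(allocs, objects_per_page)
print(private, utilization(allocs, private, objects_per_page))
print(shared, utilization(allocs, shared, objects_per_page))
print(charge_per_object(allocs, object_size))
```

In this toy case the private layout needs five pages, while the shared layout fits all twelve objects on one page, and each cgroup is still charged only for the bytes its own objects occupy.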
He tested the proposed slab memory controller on several workloads and reported the following memory savings:
- Web frontend: 650-700 MB, ~42% of slab memory
- Database cache: 750-800 MB, ~35% of slab memory
- DNS server: 700 MB, ~36% of slab memory
You can read Gushchin’s proposal for the new memory controller in this lkml.org thread.
The proposed controller currently has “request for comments” (RFC) status; if all goes well, it could be merged into the mainline Linux kernel in 2020.