[Devel] [PATCH RHEL9 COMMIT] FD: mm: Memory cgroup page cache limit
Konstantin Khorenko
khorenko at virtuozzo.com
Thu Feb 9 15:29:32 MSK 2023
The commit is pushed to "branch-rh9-5.14.0-162.6.1.vz9.18.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh9-5.14.0-162.6.1.vz9.18.8
------>
commit 28ad371276436977993558273c1e6d6b2fba1a15
Author: Konstantin Khorenko <khorenko at virtuozzo.com>
Date: Thu Feb 9 15:27:58 2023 +0300
FD: mm: Memory cgroup page cache limit
https://jira.sw.ru/browse/PSBM-78244
Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
Feature: mm: Memory cgroup page cache limit
---
.../mm-Memory-cgroup-page-cache-limit.rst | 70 ++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/Documentation/Virtuozzo/FeatureDescriptions/mm-Memory-cgroup-page-cache-limit.rst b/Documentation/Virtuozzo/FeatureDescriptions/mm-Memory-cgroup-page-cache-limit.rst
new file mode 100644
index 000000000000..092ffab52f53
--- /dev/null
+++ b/Documentation/Virtuozzo/FeatureDescriptions/mm-Memory-cgroup-page-cache-limit.rst
@@ -0,0 +1,70 @@
+==================================
+mm: Memory cgroup page cache limit
+==================================
+
+The feature extends the memory cgroup so that it can limit its page cache
+usage.
+
+The feature exposes two memory cgroup files, one to set the limit and one
+to check usage::
+
+ memory::memory.cache.limit_in_bytes
+ memory::memory.cache.usage_in_bytes
+
+Background:
+===========
+
+Imagine a system service whose anonymous memory you don't want to limit.
+In our case it's a vStorage cgroup which hosts CSes and MDSes:
+
+ * they can consume memory in some range
+ * we don't want to set a limit for max possible consumption - too high
+ * we don't know the number of CSes on the node - admin can add CSes
+ dynamically
+ * we don't want to dynamically increase/decrease the limit
+
+If the cgroup is "unlimited", it produces permanent memory pressure on
+the Node because it generates a lot of pagecache, and other cgroups on
+the Node are affected (even with proportional fair reclaim taken into
+account).
+
+=> the solution is to limit pagecache only, so this is implemented.
+
+Implementation details:
+=======================
+
+ * Always reclaiming memory above memory.cache.limit_in_bytes in direct
+ reclaim mode adds too much of a cost for vStorage. Thus the code
+ allows memory.cache.limit_in_bytes to be exceeded, but launches the
+ reclaim in a background task.
+
+ * Per-cpu stock precharges are used for the ->cache counter to reduce
+ contention on this counter.
+
+Differences in vz7/vz9 implementation:
+--------------------------------------
+
+ * vz9 does not use the page vz extensions; instead it uses a memcg_data
+ bit to mark a page as cache. The benefit is that the implementation
+ and porting became simpler. If new flags are required, the newly
+ introduced folio can be used.
+
+Testing:
+========
+
+Simple test::
+
+ # dd if=/dev/random of=testfile.bin bs=1M count=1000
+ # mkdir /sys/fs/cgroup/memory/pagecache_limiter
+ # tee /sys/fs/cgroup/memory/pagecache_limiter/memory.cache.limit_in_bytes <<< $((2**24))
+ # bash
+ # echo $$ > /sys/fs/cgroup/memory/pagecache_limiter/tasks
+ # cat /sys/fs/cgroup/memory/pagecache_limiter/memory.cache.usage_in_bytes
+ # time wc -l testfile.bin
+ # cat /sys/fs/cgroup/memory/pagecache_limiter/memory.cache.usage_in_bytes
+ # echo 3 > /proc/sys/vm/drop_caches
+ # cat /sys/fs/cgroup/memory/pagecache_limiter/memory.cache.usage_in_bytes
+
+
+https://jira.sw.ru/browse/PSBM-77547 - initial problem
+https://jira.sw.ru/browse/PSBM-78244 - feature jira ID
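
The manual test steps above could be wrapped in a few shell helpers. This is only a sketch and not part of the patch: the function names, the CG_ROOT variable and the "slack" parameter are hypothetical. It assumes the cgroup v1 memory controller is mounted at /sys/fs/cgroup/memory, as in the test. The slack exists because, per the implementation details, usage may temporarily exceed the limit while background reclaim catches up.

```shell
#!/bin/bash
# Hypothetical helpers for the page cache limit test; not part of the patch.
# CG_ROOT is an assumption: the cgroup v1 memory controller mount point.
CG_ROOT="${CG_ROOT:-/sys/fs/cgroup/memory}"

# Create the cgroup if it is missing and set its page cache limit (bytes).
cache_limit_set() {
    local cg="$1" bytes="$2"
    mkdir -p "$CG_ROOT/$cg" || return 1
    echo "$bytes" > "$CG_ROOT/$cg/memory.cache.limit_in_bytes"
}

# Print the current page cache usage of the cgroup (bytes).
cache_usage() {
    cat "$CG_ROOT/$1/memory.cache.usage_in_bytes"
}

# Succeed if usage <= limit + slack. The slack (bytes, default 0) tolerates
# the temporary overflow that background reclaim permits.
cache_within_limit() {
    local cg="$1" slack="${2:-0}" usage limit
    usage=$(cache_usage "$cg") || return 2
    limit=$(cat "$CG_ROOT/$cg/memory.cache.limit_in_bytes") || return 2
    [ "$usage" -le $((limit + slack)) ]
}
```

After sourcing the helpers, "cache_limit_set pagecache_limiter 16777216" followed by "cache_within_limit pagecache_limiter 4194304" would check that usage stays within 4 MiB of the 16 MiB limit used in the test; the 4 MiB slack is an arbitrary illustrative value.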