[Devel] [PATCH RHEL COMMIT] mm, memcg: add oom counter to memory.stat memcgroup file

Konstantin Khorenko khorenko at virtuozzo.com
Thu Sep 30 17:44:04 MSK 2021


The commit is pushed to "branch-rh9-5.14.vz9.1.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after ark-5.14
------>
commit 33dc8bceb845b44cea93a08b8f2a9a681ceff616
Author: Andrey Ryabinin <ryabinin.a.a at gmail.com>
Date:   Thu Sep 30 17:44:04 2021 +0300

    mm, memcg: add oom counter to memory.stat memcgroup file
    
    Add an oom counter to the memory.stat file. oom shows the number of oom
    kills triggered by the cgroup's own memory limit. total_oom shows the sum
    of oom kills triggered by the memory limits of the cgroup and its sub-groups.
    
    memory.stat in the root cgroup counts global oom kills.
    
    E.g.:
     # mkdir /sys/fs/cgroup/memory/test/
     # echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
     # echo 100M > /sys/fs/cgroup/memory/test/memory.memsw.limit_in_bytes
     # echo $$ > /sys/fs/cgroup/memory/test/tasks
     # ./vm-scalability/usemem -O 200M
     # grep oom /sys/fs/cgroup/memory/test/memory.stat
       oom 1
       total_oom 1
     # echo -1 > /sys/fs/cgroup/memory/test/memory.memsw.limit_in_bytes
     # echo -1 > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
     # ./vm-scalability/usemem -O 1000G
     # grep oom /sys/fs/cgroup/memory/memory.stat
        oom 1
        total_oom 2
    
    https://jira.sw.ru/browse/PSBM-108287
    Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
    
    khorenko@ notes:
    1) non-root memcg:
       * oom - number of OOMs caused by this particular memcg exceeding its limit
       * total_oom - number of OOMs caused by this particular memcg or any of
         its nested memory cgroups exceeding their limits.
    
       Note: because the current memcg is not root, this is a "local" OOM,
       so processes were killed in the same memcg that caused it.
    
    2) root memcg:
       * oom - the number of global OOMs that occurred
       * total_oom - the number of global OOMs plus the number of OOMs caused by
         limits exceeded in all nested memory cgroups
    
    Note: the root memory cgroup cannot be limited, so no OOMs can be caused by its limit.
    
    +++
    mm, memcg: Fix "add oom counter to memory.stat memcgroup file"
    
    Fix rebase of commit 3f10e0c1a0df12a2a503d0d9a3ec7b4f3ac3a467
            Author: Andrey Ryabinin <aryabinin at virtuozzo.com>
            Date: Mon Oct 5 13:18:40 2020 +0300
    
            mm, memcg: add oom counter to memory.stat memcgroup file
    
    https://jira.sw.ru/browse/PSBM-123537
    Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
    
    +++
    mm/memcontrol: fix oom counting
    
    Fix total_oom in /sys/fs/cgroup/memory/memory.stat always being zero because
    accumulate_ooms() incremented the pointer instead of the value.
    
    Drop cond_resched(): calling it after every two atomic_long_read() calls
    looks like overkill.
    
    Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
    
    (cherry-picked from vz8 commit 96d0009464f0 ("mm, memcg: add oom counter to
    memory.stat memcgroup file"))
    
    Signed-off-by: Nikita Yushchenko <nikita.yushchenko at virtuozzo.com>
---
 mm/memcontrol.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 65dec706430d..e92919d8ce06 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4122,12 +4122,28 @@ static const unsigned int memcg1_events[] = {
 	PSWPOUT,
 };
 
+static void accumulate_ooms(struct mem_cgroup *memcg, unsigned long *oom,
+			unsigned long *kill)
+{
+	struct mem_cgroup *mi;
+	unsigned long total_oom_kill = 0, total_oom = 0;
+
+	for_each_mem_cgroup_tree(mi, memcg) {
+		total_oom += atomic_long_read(&mi->memory_events[MEMCG_OOM]);
+		total_oom_kill += atomic_long_read(&mi->memory_events[MEMCG_OOM_KILL]);
+	}
+
+	*oom = total_oom;
+	*kill = total_oom_kill;
+}
+
 static int memcg_stat_show(struct seq_file *m, void *v)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 	unsigned long memory, memsw;
 	struct mem_cgroup *mi;
 	unsigned int i;
+	unsigned long total_oom = 0, total_oom_kill = 0;
 
 	BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats));
 
@@ -4148,6 +4164,20 @@ static int memcg_stat_show(struct seq_file *m, void *v)
 		seq_printf(m, "%s %lu\n", vm_event_name(memcg1_events[i]),
 			   memcg_events_local(memcg, memcg1_events[i]));
 
+
+	accumulate_ooms(memcg, &total_oom, &total_oom_kill);
+
+	/*
+	 * For root_mem_cgroup we want to account global ooms as well.
+	 * The diff between all MEMCG_OOM_KILL and MEMCG_OOM events
+	 * should give us the global ooms count.
+	 */
+	if (memcg == root_mem_cgroup)
+		seq_printf(m, "oom %lu\n", total_oom_kill - total_oom);
+	else
+		seq_printf(m, "oom %lu\n",
+			atomic_long_read(&memcg->memory_events[MEMCG_OOM]));
+
 	for (i = 0; i < NR_LRU_LISTS; i++)
 		seq_printf(m, "%s %lu\n", lru_list_name(i),
 			   memcg_page_state_local(memcg, NR_LRU_BASE + i) *
@@ -4181,6 +4211,11 @@ static int memcg_stat_show(struct seq_file *m, void *v)
 			   vm_event_name(memcg1_events[i]),
 			   (u64)memcg_events(memcg, memcg1_events[i]));
 
+	if (memcg == root_mem_cgroup)
+		seq_printf(m, "total_oom %lu\n", total_oom_kill);
+	else
+		seq_printf(m, "total_oom %lu\n", total_oom);
+
 	for (i = 0; i < NR_LRU_LISTS; i++)
 		seq_printf(m, "total_%s %llu\n", lru_list_name(i),
 			   (u64)memcg_page_state(memcg, NR_LRU_BASE + i) *

