[Devel] [PATCH vz9 11/16] mm, memcg: add oom counter to memory.stat memcgroup file

Nikita Yushchenko nikita.yushchenko at virtuozzo.com
Wed Sep 29 10:00:12 MSK 2021


From: Andrey Ryabinin <aryabinin at virtuozzo.com>

Add an oom counter to the memory.stat file. oom shows the number of oom
kills triggered by the cgroup's own memory limit. total_oom shows the sum
of oom kills triggered by the memory limits of the cgroup and its sub-groups.

memory.stat in the root cgroup counts global oom kills.

E.g:
 # mkdir /sys/fs/cgroup/memory/test/
 # echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
 # echo 100M > /sys/fs/cgroup/memory/test/memory.memsw.limit_in_bytes
 # echo $$ > /sys/fs/cgroup/memory/test/tasks
 # ./vm-scalability/usemem -O 200M
 # grep oom /sys/fs/cgroup/memory/test/memory.stat
   oom 1
   total_oom 1
 # echo -1 > /sys/fs/cgroup/memory/test/memory.memsw.limit_in_bytes
 # echo -1 > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
 # ./vm-scalability/usemem -O 1000G
 # grep oom /sys/fs/cgroup/memory/memory.stat
   oom 1
   total_oom 2
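
For tools that want these counters without grep, a minimal user-space
sketch follows. memstat_lookup() is a hypothetical helper, not part of this
patch; it assumes memory.stat keeps its "<key> <value>" per-line layout and
that the file contents have already been read into a string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): scan memory.stat-style
 * text for a line "<key> <value>" and store the value.
 * Returns 0 on success, -1 if the key is not found.
 */
static int memstat_lookup(const char *text, const char *key,
			  unsigned long *value)
{
	const char *line;
	size_t klen;

	assert(text && key && value);
	klen = strlen(key);

	for (line = text; line && *line; ) {
		/* Match the whole key, so "oom" does not match "total_oom". */
		if (!strncmp(line, key, klen) && line[klen] == ' ' &&
		    sscanf(line + klen, "%lu", value) == 1)
			return 0;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

Matching at the start of a line and requiring a space after the key keeps
"oom" distinct from both "total_oom" and any longer key that begins with
"oom".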

https://jira.sw.ru/browse/PSBM-108287
Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>

khorenko@ notes:
1) non-root memcg:
   * oom - number of OOMs caused by limit exceeded by this particular memcg
   * total_oom - number of OOMs caused by limit exceeded by this particular
     memcg and all nested memory cgroups.

   Note: because the current memcg is not root, this is a "local" OOM:
   the processes were killed in the same memcg that caused it.

2) root memcg:
   * oom - the number of global OOMs that happened
   * total_oom - the number of global OOMs + number of OOMs caused by limit
     exceeded in all nested memory cgroups

Note: the root memory cgroup cannot be limited => no OOMs can be caused by its limit.
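
The semantics above can be modelled in user space. The sketch below is a toy
model only: struct toy_memcg, toy_in_subtree() and toy_oom_stats() are
hypothetical names, and the model assumes that every limit-triggered OOM
also produces exactly one kill event, so that kills not matched by a limit
OOM are global ones.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memory cgroup; not the kernel's struct mem_cgroup. */
struct toy_memcg {
	int parent;              /* index of the parent, -1 for the root */
	unsigned long oom;       /* OOMs caused by this group's own limit */
	unsigned long oom_kill;  /* kill events recorded for this group */
};

/* Return 1 if group idx is anc itself or lies below it in the hierarchy. */
static int toy_in_subtree(const struct toy_memcg *g, int idx, int anc)
{
	for (; idx != -1; idx = g[idx].parent)
		if (idx == anc)
			return 1;
	return 0;
}

/* Compute the "oom" and "total_oom" values reported for group idx. */
static void toy_oom_stats(const struct toy_memcg *g, size_t n, int idx,
			  unsigned long *oom, unsigned long *total_oom)
{
	unsigned long sum_oom = 0, sum_kill = 0;
	size_t i;

	assert(g && oom && total_oom && idx >= 0 && (size_t)idx < n);

	/* Sum both counters over idx and all of its descendants. */
	for (i = 0; i < n; i++) {
		if (!toy_in_subtree(g, (int)i, idx))
			continue;
		sum_oom += g[i].oom;
		sum_kill += g[i].oom_kill;
	}

	if (g[idx].parent == -1) {
		/* Root: kills not explained by any limit count as global. */
		*oom = sum_kill - sum_oom;
		*total_oom = sum_kill;
	} else {
		*oom = g[idx].oom;
		*total_oom = sum_oom;
	}
}
```

With a root group and one child that has oom = 1 and oom_kill = 2 (one
limit OOM plus one global OOM), the model reports oom 1 / total_oom 2 for
the root and oom 1 / total_oom 1 for the child, which matches the example
session in the original commit message.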

+++
mm, memcg: Fix "add oom counter to memory.stat memcgroup file"

Fix rebase of commit 3f10e0c1a0df12a2a503d0d9a3ec7b4f3ac3a467
	Author: Andrey Ryabinin <aryabinin at virtuozzo.com>
	Date: Mon Oct 5 13:18:40 2020 +0300

	mm, memcg: add oom counter to memory.stat memcgroup file

https://jira.sw.ru/browse/PSBM-123537
Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>

+++
mm/memcontrol: fix oom counting

Fix total_oom in /sys/fs/cgroup/memory/memory.stat always being zero
because accumulate_ooms() incremented the pointer instead of the value
it points to.

Drop cond_resched(): calling it after every two atomic_long_read() calls
looks like overkill.
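
The buggy code itself is not shown in this message. As an illustration of
the bug class only, the sketch below contrasts incrementing a pointer
parameter with incrementing the value it points to; count_buggy() and
count_fixed() are hypothetical names and are not taken from memcontrol.c.

```c
#include <assert.h>

/* Buggy variant: only the local copy of the pointer is advanced. */
static void count_buggy(unsigned long *counter)
{
	counter++;		/* the caller's value is never modified */
	(void)counter;		/* silence "set but not used" warnings */
}

/* Fixed variant: dereference first, then increment the value. */
static void count_fixed(unsigned long *counter)
{
	assert(counter);
	(*counter)++;
}
```

A caller that passes the address of a zero-initialized counter still sees
zero after count_buggy(), which is consistent with the "always zero"
symptom described above.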

Fixes: 64e8dc809dd9 ("mm, memcg: Fix "add oom counter to memory.stat memcgroup
file"")
Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>

(cherry-picked from vz8 commit 96d0009464f0 ("mm, memcg: add oom counter to
memory.stat memcgroup file"))

Signed-off-by: Nikita Yushchenko <nikita.yushchenko at virtuozzo.com>
---
 mm/memcontrol.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 65dec706430d..e92919d8ce06 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4122,12 +4122,28 @@ static const unsigned int memcg1_events[] = {
 	PSWPOUT,
 };
 
+static void accumulate_ooms(struct mem_cgroup *memcg, unsigned long *oom,
+			unsigned long *kill)
+{
+	struct mem_cgroup *mi;
+	unsigned long total_oom_kill = 0, total_oom = 0;
+
+	for_each_mem_cgroup_tree(mi, memcg) {
+		total_oom += atomic_long_read(&mi->memory_events[MEMCG_OOM]);
+		total_oom_kill += atomic_long_read(&mi->memory_events[MEMCG_OOM_KILL]);
+	}
+
+	*oom = total_oom;
+	*kill = total_oom_kill;
+}
+
 static int memcg_stat_show(struct seq_file *m, void *v)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 	unsigned long memory, memsw;
 	struct mem_cgroup *mi;
 	unsigned int i;
+	unsigned long total_oom = 0, total_oom_kill = 0;
 
 	BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats));
 
@@ -4148,6 +4164,20 @@ static int memcg_stat_show(struct seq_file *m, void *v)
 		seq_printf(m, "%s %lu\n", vm_event_name(memcg1_events[i]),
 			   memcg_events_local(memcg, memcg1_events[i]));
 
+
+	accumulate_ooms(memcg, &total_oom, &total_oom_kill);
+
+	/*
+	 * For root_mem_cgroup we want to account global ooms as well.
+	 * The diff between all MEMCG_OOM_KILL and MEMCG_OOM events
+	 * should give us the global ooms count.
+	 */
+	if (memcg == root_mem_cgroup)
+		seq_printf(m, "oom %lu\n", total_oom_kill - total_oom);
+	else
+		seq_printf(m, "oom %lu\n",
+			atomic_long_read(&memcg->memory_events[MEMCG_OOM]));
+
 	for (i = 0; i < NR_LRU_LISTS; i++)
 		seq_printf(m, "%s %lu\n", lru_list_name(i),
 			   memcg_page_state_local(memcg, NR_LRU_BASE + i) *
@@ -4181,6 +4211,11 @@ static int memcg_stat_show(struct seq_file *m, void *v)
 			   vm_event_name(memcg1_events[i]),
 			   (u64)memcg_events(memcg, memcg1_events[i]));
 
+	if (memcg == root_mem_cgroup)
+		seq_printf(m, "total_oom %lu\n", total_oom_kill);
+	else
+		seq_printf(m, "total_oom %lu\n", total_oom);
+
 	for (i = 0; i < NR_LRU_LISTS; i++)
 		seq_printf(m, "total_%s %llu\n", lru_list_name(i),
 			   (u64)memcg_page_state(memcg, NR_LRU_BASE + i) *
-- 
2.30.2


