[Devel] [PATCH VZ9 3/6] ms/mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks

Vasily Averin vvs at virtuozzo.com
Tue Nov 9 13:45:13 MSK 2021


Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3.

Memory cgroup charging allows killed or exiting tasks to exceed the hard
limit.  This can be misused to trigger a global OOM from inside a
memcg-limited container.  On the other hand, if memcg fails an
allocation called from inside the #PF handler, that failure triggers a
global OOM from inside pagefault_out_of_memory().

To prevent these problems, this patchset:
 (a) removes execution of out_of_memory() from
     pagefault_out_of_memory(), because nobody can explain why it is
     necessary.
 (b) allows memcg to fail allocation of dying/killed tasks.

This patch (of 3):

Any allocation failure during the #PF path will return with
VM_FAULT_OOM, which in turn results in pagefault_out_of_memory(), which
in turn executes out_of_memory() and can kill a random task.

An allocation might fail when the current task is the oom victim and
there are no memory reserves left.  The OOM killer is already handled at
the page allocator level for the global OOM and at the charging level
for the memcg one.  Both have much more information about the scope of
allocation/charge request.  This means that either the OOM killer was
invoked properly and still did not make the allocation succeed, or it
was skipped because it could not be invoked.  In both cases, triggering
it from here is pointless and even harmful.

It makes much more sense to let the killed task die rather than to wake
up an eternally hungry oom-killer and send him to choose a fatter victim
for breakfast.
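
The decision order that results from this change can be sketched as a
small userspace model.  This is only an illustration under stated
assumptions: struct fault_ctx, enum fault_oom_action and
model_pagefault_oom() are hypothetical names that do not exist in the
kernel, and the three booleans merely stand in for the results of
mem_cgroup_oom_synchronize(true), fatal_signal_pending(current) and
mutex_trylock(&oom_lock).

```c
#include <stdbool.h>

/* Hypothetical model of the state examined by pagefault_out_of_memory(). */
struct fault_ctx {
	bool memcg_oom_handled;    /* stands in for mem_cgroup_oom_synchronize(true) */
	bool fatal_signal_pending; /* stands in for fatal_signal_pending(current) */
	bool oom_lock_available;   /* stands in for mutex_trylock(&oom_lock) succeeding */
};

/* Hypothetical labels for the possible outcomes of the handler. */
enum fault_oom_action {
	ACTION_RETURN_MEMCG,     /* memcg OOM already dealt with, nothing to do */
	ACTION_RETURN_DYING,     /* new check: let the killed task die */
	ACTION_RETURN_LOCK_BUSY, /* another OOM invocation holds the lock */
	ACTION_GLOBAL_OOM,       /* fall through to out_of_memory() */
};

/* Mirrors the order of the checks in the patched function. */
static enum fault_oom_action model_pagefault_oom(const struct fault_ctx *ctx)
{
	if (ctx->memcg_oom_handled)
		return ACTION_RETURN_MEMCG;

	/* A task with a fatal signal pending no longer reaches the global OOM killer. */
	if (ctx->fatal_signal_pending)
		return ACTION_RETURN_DYING;

	if (!ctx->oom_lock_available)
		return ACTION_RETURN_LOCK_BUSY;

	return ACTION_GLOBAL_OOM;
}
```

In this model, a context with fatal_signal_pending set yields
ACTION_RETURN_DYING even when the lock is free, whereas without the new
check the same context would have reached ACTION_GLOBAL_OOM.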

Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com
Signed-off-by: Vasily Averin <vvs at virtuozzo.com>
Suggested-by: Michal Hocko <mhocko at suse.com>
Acked-by: Michal Hocko <mhocko at suse.com>
Cc: Johannes Weiner <hannes at cmpxchg.org>
Cc: Mel Gorman <mgorman at techsingularity.net>
Cc: Roman Gushchin <guro at fb.com>
Cc: Shakeel Butt <shakeelb at google.com>
Cc: Tetsuo Handa <penguin-kernel at i-love.sakura.ne.jp>
Cc: Uladzislau Rezki <urezki at gmail.com>
Cc: Vladimir Davydov <vdavydov.dev at gmail.com>
Cc: Vlastimil Babka <vbabka at suse.cz>
Cc: <stable at vger.kernel.org>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>

https://jira.sw.ru/browse/PSBM-134774
(cherry picked from commit 0b28179a6138a5edd9d82ad2687c05b3773c387b)
Signed-off-by: Vasily Averin <vvs at virtuozzo.com>
---
 mm/oom_kill.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f603a954a646..1eeea2900828 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1260,6 +1260,9 @@ void pagefault_out_of_memory(void)
 	if (mem_cgroup_oom_synchronize(true))
 		return;
 
+	if (fatal_signal_pending(current))
+		return;
+
 	if (!mutex_trylock(&oom_lock))
 		return;
 	out_of_memory(&oc);
-- 
2.25.1


