[Devel] [PATCH RH7] ms/memcg: prohibit unconditional exceeding the limit of dying tasks

Vasily Averin vvs at virtuozzo.com
Sat Sep 11 23:05:46 MSK 2021


The kernel currently allows dying tasks to exceed the memcg limits.
The allocation is expected to be the last one and the occupied memory
will be freed soon.
This is not always true, because the allocation can be part of a huge
vmalloc allocation. Once allowed, such charges repeat over and over again.
Moreover, the lifetime of the allocated object can differ from
the lifetime of the dying task.
Multiple such allocations running concurrently can not only overuse
the memcg limit, but can lead to a global out of memory and,
in the worst case, cause the host to panic.

[backport of upstream patch version]
Link: https://lkml.org/lkml/2021/9/10/374
https://jira.sw.ru/browse/PSBM-132705
Signed-off-by: Vasily Averin <vvs at virtuozzo.com>
---
 mm/memcontrol.c | 14 --------------
 1 file changed, 14 deletions(-)
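
Note for readers: the effect described in the commit message can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions, not kernel code. All names in it (model_memcg,
model_try_charge, model_big_alloc, model_overshoot) are hypothetical,
and it simply counts pages handed out to the group's tasks without
modelling reclaim, the OOM killer, or where the kernel accounts a
bypassed charge.

```c
#include <stdbool.h>

/* Hypothetical user-space model of a memcg; not a kernel structure. */
struct model_memcg {
	unsigned long usage;	/* pages handed out to tasks of the group */
	unsigned long limit;	/* configured limit, in pages */
};

/*
 * Try to charge nr_pages to the group.
 * With allow_dying_bypass set, a dying task is let through even when
 * the limit is exceeded (the behaviour this patch removes).
 * Returns 0 on success, -1 when the charge is refused.
 */
int model_try_charge(struct model_memcg *cg, unsigned long nr_pages,
		     bool task_dying, bool allow_dying_bypass)
{
	if (cg->usage + nr_pages <= cg->limit) {
		cg->usage += nr_pages;
		return 0;
	}
	if (allow_dying_bypass && task_dying) {
		/* Assumed to be "the last charge", but nothing enforces that. */
		cg->usage += nr_pages;
		return 0;
	}
	return -1;
}

/*
 * Model a large multi-page allocation issued by a dying task and
 * charged one page at a time. Stops at the first refused charge.
 * Returns the number of pages actually charged.
 */
unsigned long model_big_alloc(struct model_memcg *cg, unsigned long pages,
			      bool allow_dying_bypass)
{
	unsigned long charged = 0;

	for (unsigned long i = 0; i < pages; i++) {
		if (model_try_charge(cg, 1, true, allow_dying_bypass) != 0)
			break;
		charged++;
	}
	return charged;
}

/* Pages used above the limit after one big allocation by a dying task. */
unsigned long model_overshoot(unsigned long limit, unsigned long pages,
			      bool allow_dying_bypass)
{
	struct model_memcg cg = { .usage = 0, .limit = limit };

	model_big_alloc(&cg, pages, allow_dying_bypass);
	return cg.usage > cg.limit ? cg.usage - cg.limit : 0;
}
```

In this model, a limit of 100 pages and a 1000-page request by a dying
task give an overshoot of 900 pages with the bypass enabled and 0 pages
with the bypass removed, since the allocation stops at the first
refused charge.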

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8dbd140..2ede4d2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3343,17 +3343,6 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, bool kmem_charge
 	}
 
 	/*
-	 * Unlike in global OOM situations, memcg is not in a physical
-	 * memory shortage.  Allow dying and OOM-killed tasks to
-	 * bypass the last charges so that they can exit quickly and
-	 * free their memory.
-	 */
-	if (unlikely(test_thread_flag(TIF_MEMDIE) ||
-		     fatal_signal_pending(current) ||
-		     current->flags & PF_EXITING))
-		goto bypass;
-
-	/*
 	 * Prevent unbounded recursion when reclaim operations need to
 	 * allocate memory. This might exceed the limits temporarily,
 	 * but we prefer facilitating memory reclaim and getting back
@@ -3407,9 +3396,6 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, bool kmem_charge
 	if (gfp_mask & __GFP_NOFAIL)
 		goto bypass;
 
-	if (fatal_signal_pending(current))
-		goto bypass;
-
 	/*
 	 * We might have [a lot of] reclaimable kmem which we cannot reclaim in
 	 * the current context, e.g. lot of inodes/dentries while tring to get
-- 
1.8.3.1


