[Devel] [PATCH RHEL7 COMMIT] ms/mm/memcontrol.c: try harder to decrease [memory, memsw].limit_in_bytes

Konstantin Khorenko khorenko at virtuozzo.com
Wed Jan 31 18:48:04 MSK 2018


The commit is pushed to "branch-rh7-3.10.0-693.11.6.vz7.42.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-693.11.6.vz7.42.4
------>
commit 87ebda94dd4d64ee0c818579cd918db692245d37
Author: Andrey Ryabinin <aryabinin at virtuozzo.com>
Date:   Wed Jan 31 18:48:04 2018 +0300

    ms/mm/memcontrol.c: try harder to decrease [memory,memsw].limit_in_bytes
    
    mem_cgroup_resize_[memsw]_limit() tries to free only 32 (SWAP_CLUSTER_MAX)
    pages on each iteration.  This makes it practically impossible to decrease
    the limit of a memory cgroup.  Tasks can easily allocate 32 pages back, so
    we can't reduce memory usage, and once retry_count reaches zero we return
    -EBUSY.
    
    The problem is easy to reproduce by running the following commands:
    
      mkdir /sys/fs/cgroup/memory/test
      echo $$ >> /sys/fs/cgroup/memory/test/tasks
      cat big_file > /dev/null &
      sleep 1 && echo $((100*1024*1024)) > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
      -bash: echo: write error: Device or resource busy
    
    Instead of relying on retry_count, keep retrying the reclaim until the
    desired limit is reached or fail if the reclaim doesn't make any progress
    or a signal is pending.
    
    https://jira.sw.ru/browse/PSBM-80732
    
    Link: http://lkml.kernel.org/r/20180119132544.19569-1-aryabinin@virtuozzo.com
    Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
    
    Acked-by: Michal Hocko <mhocko at suse.com>
    Reviewed-by: Andrew Morton <akpm at linux-foundation.org>
    Cc: Shakeel Butt <shakeelb at google.com>
    Cc: Johannes Weiner <hannes at cmpxchg.org>
    Cc: Vladimir Davydov <vdavydov.dev at gmail.com>
    Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
---
 mm/memcontrol.c | 43 ++++++-------------------------------------
 1 file changed, 6 insertions(+), 37 deletions(-)
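
Illustration only, not part of the commit: the retry policy described in the
commit message (stop on a pending signal, succeed once the limit can be
lowered, fail with -EBUSY as soon as reclaim makes no progress) can be
sketched in userspace C. Everything below is hypothetical: struct fake_memcg,
fake_reclaim() and fake_resize_limit() are stand-ins invented for this sketch
and are not kernel APIs; the batch size of 32 only mirrors the
SWAP_CLUSTER_MAX value quoted in the commit message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a memory cgroup (not a kernel struct). */
struct fake_memcg {
	unsigned long usage;       /* pages currently charged */
	unsigned long limit;       /* current limit, in pages */
	unsigned long reclaimable; /* pages that reclaim is still able to free */
	bool signal_pending;       /* models signal_pending(current) */
};

/* Assumed batch size; mirrors the 32 pages mentioned in the commit message. */
#define FAKE_RECLAIM_BATCH 32UL

/* Free at most one batch of pages. Returns pages freed; 0 means no progress. */
static unsigned long fake_reclaim(struct fake_memcg *cg)
{
	unsigned long n = FAKE_RECLAIM_BATCH;

	if (n > cg->reclaimable)
		n = cg->reclaimable;
	if (n > cg->usage)
		n = cg->usage;
	cg->reclaimable -= n;
	cg->usage -= n;
	return n;
}

/*
 * Keep reclaiming until the new limit fits, a signal is pending,
 * or reclaim stops making progress. There is no retry counter.
 */
static int fake_resize_limit(struct fake_memcg *cg, unsigned long new_limit)
{
	for (;;) {
		if (cg->signal_pending)
			return -EINTR;
		if (cg->usage <= new_limit) {
			cg->limit = new_limit;
			return 0;
		}
		if (!fake_reclaim(cg))
			return -EBUSY;
	}
}
```

In this model, lowering the limit succeeds whenever enough pages are
reclaimable, no matter how many 32-page batches it takes, and -EBUSY is
returned only when a reclaim pass frees nothing.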

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 99cabf41bc9f..325dee2cd903 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2055,20 +2055,6 @@ void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
 	mutex_unlock(&oom_info_lock);
 }
 
-/*
- * This function returns the number of memcg under hierarchy tree. Returns
- * 1(self count) if no children.
- */
-static int mem_cgroup_count_children(struct mem_cgroup *memcg)
-{
-	int num = 0;
-	struct mem_cgroup *iter;
-
-	for_each_mem_cgroup_tree(iter, memcg)
-		num++;
-	return num;
-}
-
 /*
  * Return the memory (and swap, if configured) limit for a memcg.
  */
@@ -3802,24 +3788,11 @@ void mem_cgroup_print_bad_page(struct page *page)
 static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
 				   unsigned long limit, bool memsw)
 {
-	unsigned long curusage;
-	unsigned long oldusage;
 	bool enlarge = false;
-	int retry_count;
 	int ret;
 	bool limits_invariant;
 	struct page_counter *counter = memsw ? &memcg->memsw : &memcg->memory;
 
-	/*
-	 * For keeping hierarchical_reclaim simple, how long we should retry
-	 * is depends on callers. We set our retry-count to be function
-	 * of # of children which we should visit in this loop.
-	 */
-	retry_count = MEM_CGROUP_RECLAIM_RETRIES *
-		      mem_cgroup_count_children(memcg);
-
-	oldusage = page_counter_read(counter);
-
 	do {
 		if (signal_pending(current)) {
 			ret = -EINTR;
@@ -3845,16 +3818,12 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
 		if (!ret)
 			break;
 
-		try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL,
-					memsw ? MEM_CGROUP_RECLAIM_NOSWAP : 0);
-
-		curusage = page_counter_read(counter);
-		/* Usage is reduced ? */
-  		if (curusage >= oldusage)
-			retry_count--;
-		else
-			oldusage = curusage;
-	} while (retry_count);
+		if (!try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL,
+				memsw ? MEM_CGROUP_RECLAIM_NOSWAP : 0)) {
+			ret = -EBUSY;
+			break;
+		}
+	} while (true);
 
 	if (!ret && enlarge)
 		memcg_oom_recover(memcg);

