[Devel] [PATCH] mm: Change formula of calculation of default min_free_kbytes

Kirill Tkhai ktkhai at virtuozzo.com
Wed Nov 8 12:27:14 MSK 2017


The min_free_kbytes parameter controls the per-zone watermarks.
It is used to calculate the amount of free memory in a zone below
which direct reclaim starts and becomes throttled (the calling
task sleeps).

This patch sets the default min_free_kbytes to 2% of available
physical memory, but not more than 4GB. This is more than the
previous formula gave (it was based on a sqrt). Why do we need
that?

We ran into a situation where intense disk writes inside a CT
on a node with very little free memory may lead to a state
where almost all tasks are spinning in direct reclaim. The tasks
can't reclaim effectively, as the generated dirty pages are
written and released by ploop threads, so in practice the tasks
are just busy-looping. The ploop threads can't reclaim effectively
either, as the processors are occupied by the busy-looping tasks,
and they also need free pages to do that. So the system keeps
looping and becomes very slow and unresponsive.

https://jira.sw.ru/browse/PSBM-69296

Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
---
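For comparison, here is a minimal userspace sketch of the old and new
default calculations. It is not kernel code: isqrt_u64() is a stand-in
for the kernel's int_sqrt(), the helper names are hypothetical, and the
check against user_min_free_kbytes is left out. The input is assumed
to be lowmem_kbytes as computed in init_per_zone_wmark_min().

```c
#include <stdint.h>

/* Floor of the integer square root; userspace stand-in for int_sqrt(). */
static uint64_t isqrt_u64(uint64_t x)
{
	uint64_t r = 0, bit = (uint64_t)1 << 62;

	/* Start from the highest power of four that is <= x. */
	while (bit > x)
		bit >>= 2;
	while (bit != 0) {
		if (x >= r + bit) {
			x -= r + bit;
			r = (r >> 1) + bit;
		} else {
			r >>= 1;
		}
		bit >>= 2;
	}
	return r;
}

/* Hypothetical helper: limit v to the range [lo, hi]. */
static uint64_t clamp_u64(uint64_t v, uint64_t lo, uint64_t hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Old default: sqrt(lowmem_kbytes * 16), limited to 128k..64MB. */
static uint64_t old_min_free_kbytes(uint64_t lowmem_kbytes)
{
	return clamp_u64(isqrt_u64(lowmem_kbytes * 16), 128, 65536);
}

/* New default: 2% of lowmem_kbytes, limited to 128k..4GB. */
static uint64_t new_min_free_kbytes(uint64_t lowmem_kbytes)
{
	return clamp_u64(lowmem_kbytes * 2 / 100, 128, 4194304);
}
```

With lowmem_kbytes = 16777216 (16384MB) the old formula gives 16384k,
matching the table removed below, while the new one gives 335544k.
The 4GB cap is reached at roughly 200GB of lowmem.
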
 mm/page_alloc.c |   27 +++------------------------
 1 file changed, 3 insertions(+), 24 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 137d1d86ddf..2108034bd80 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6399,27 +6399,6 @@ void setup_per_zone_wmarks(void)
 
 /*
  * Initialise min_free_kbytes.
- *
- * For small machines we want it small (128k min).  For large machines
- * we want it large (64MB max).  But it is not linear, because network
- * bandwidth does not increase linearly with machine size.  We use
- *
- * 	min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy:
- *	min_free_kbytes = sqrt(lowmem_kbytes * 16)
- *
- * which yields
- *
- * 16MB:	512k
- * 32MB:	724k
- * 64MB:	1024k
- * 128MB:	1448k
- * 256MB:	2048k
- * 512MB:	2896k
- * 1024MB:	4096k
- * 2048MB:	5792k
- * 4096MB:	8192k
- * 8192MB:	11584k
- * 16384MB:	16384k
  */
 int __meminit init_per_zone_wmark_min(void)
 {
@@ -6427,14 +6406,14 @@ int __meminit init_per_zone_wmark_min(void)
 	int new_min_free_kbytes;
 
 	lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
-	new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
+	new_min_free_kbytes = lowmem_kbytes * 2 / 100; /* 2% */
 
 	if (new_min_free_kbytes > user_min_free_kbytes) {
 		min_free_kbytes = new_min_free_kbytes;
 		if (min_free_kbytes < 128)
 			min_free_kbytes = 128;
-		if (min_free_kbytes > 65536)
-			min_free_kbytes = 65536;
+		if (min_free_kbytes > 4194304)
+			min_free_kbytes = 4194304;
 	} else {
 		pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
 				new_min_free_kbytes, user_min_free_kbytes);


