[Devel] [PATCH RHEL7 COMMIT] mm: Change formula of calculation of default min_free_kbytes
Konstantin Khorenko
khorenko at virtuozzo.com
Wed Dec 6 18:13:09 MSK 2017
The commit is pushed to "branch-rh7-3.10.0-693.11.1.vz7.39.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-693.11.1.vz7.39.1
------>
commit 1de6694f6a766df4fddebc206b7754cd468eb991
Author: Kirill Tkhai <ktkhai at virtuozzo.com>
Date: Wed Dec 6 18:13:08 2017 +0300
mm: Change formula of calculation of default min_free_kbytes
The min_free_kbytes parameter acts on the per-zone watermarks. It is
used to calculate each zone's free-memory threshold, below which direct
reclaim starts and becomes throttled (the calling task sleeps).
This patch makes the default min_free_kbytes 2% of available
physical memory, but not more than 4GB. This is more than the
previous formula gave (it was 4 * sqrt(lowmem_kbytes), capped at 64MB)
on any machine with more than about 40MB of low memory. Here is why
we need that.
We ran into a situation where intensive disk writes inside a CT
on a node with very little free memory may lead to a state
where almost all tasks are spinning in direct reclaim. The tasks
can't reclaim effectively, because the dirty pages they generate are
written and released by ploop threads, so in practice the tasks
are just busy-looping. The ploop threads can't reclaim effectively
either, because the processors are occupied by the busy-looping tasks
and because they need free pages to do their work. So the system
keeps looping and becomes very slow and unresponsive.
https://jira.sw.ru/browse/PSBM-69296
Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
---
mm/page_alloc.c | 27 +++------------------------
1 file changed, 3 insertions(+), 24 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f2b7f49493f8..40700c3bd133 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6399,27 +6399,6 @@ void setup_per_zone_wmarks(void)
/*
* Initialise min_free_kbytes.
- *
- * For small machines we want it small (128k min). For large machines
- * we want it large (64MB max). But it is not linear, because network
- * bandwidth does not increase linearly with machine size. We use
- *
- * min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy:
- * min_free_kbytes = sqrt(lowmem_kbytes * 16)
- *
- * which yields
- *
- * 16MB: 512k
- * 32MB: 724k
- * 64MB: 1024k
- * 128MB: 1448k
- * 256MB: 2048k
- * 512MB: 2896k
- * 1024MB: 4096k
- * 2048MB: 5792k
- * 4096MB: 8192k
- * 8192MB: 11584k
- * 16384MB: 16384k
*/
int __meminit init_per_zone_wmark_min(void)
{
@@ -6427,14 +6406,14 @@ int __meminit init_per_zone_wmark_min(void)
 	int new_min_free_kbytes;
 	lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
-	new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
+	new_min_free_kbytes = lowmem_kbytes * 2 / 100; /* 2% */
 	if (new_min_free_kbytes > user_min_free_kbytes) {
 		min_free_kbytes = new_min_free_kbytes;
 		if (min_free_kbytes < 128)
 			min_free_kbytes = 128;
-		if (min_free_kbytes > 65536)
-			min_free_kbytes = 65536;
+		if (min_free_kbytes > 4194304)
+			min_free_kbytes = 4194304;
 	} else {
 		pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
 				new_min_free_kbytes, user_min_free_kbytes);