[Devel] [PATCH] mm: Change formula of calculation of default min_free_kbytes

Konstantin Khorenko khorenko at virtuozzo.com
Wed Dec 6 18:19:26 MSK 2017


Please consider this for ReadyKernel (RK) after ~2 weeks of testing.

https://readykernel.com/

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 11/08/2017 12:27 PM, Kirill Tkhai wrote:
> The min_free_kbytes parameter controls the per-zone watermarks. It is used
> to calculate each zone's free memory threshold, below which direct
> reclaim starts and becomes throttled (the calling task sleeps).
>
> This patch makes the default min_free_kbytes 2% of available
> physical memory, but not more than 4GB. This is more than the
> previous formula gave (it was a sqrt). Why do we need that?
>
> We ran into a situation where intense disk writes inside a CT
> on a node with very little free memory may lead to a state
> in which almost all tasks are spinning in direct reclaim. The tasks
> can't reclaim effectively, as the generated dirty pages are written
> and released by ploop threads, so in practice the tasks
> are just busy-looping. The ploop threads can't reclaim effectively
> either, as the processors are occupied by the busy-looping tasks
> and the threads also need free pages to do their work. So the system
> keeps looping and becomes very slow and unresponsive.
>
> https://jira.sw.ru/browse/PSBM-69296
>
> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
> ---
>  mm/page_alloc.c |   27 +++------------------------
>  1 file changed, 3 insertions(+), 24 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 137d1d86ddf..2108034bd80 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6399,27 +6399,6 @@ void setup_per_zone_wmarks(void)
>
>  /*
>   * Initialise min_free_kbytes.
> - *
> - * For small machines we want it small (128k min).  For large machines
> - * we want it large (64MB max).  But it is not linear, because network
> - * bandwidth does not increase linearly with machine size.  We use
> - *
> - * 	min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy:
> - *	min_free_kbytes = sqrt(lowmem_kbytes * 16)
> - *
> - * which yields
> - *
> - * 16MB:	512k
> - * 32MB:	724k
> - * 64MB:	1024k
> - * 128MB:	1448k
> - * 256MB:	2048k
> - * 512MB:	2896k
> - * 1024MB:	4096k
> - * 2048MB:	5792k
> - * 4096MB:	8192k
> - * 8192MB:	11584k
> - * 16384MB:	16384k
>   */
>  int __meminit init_per_zone_wmark_min(void)
>  {
> @@ -6427,14 +6406,14 @@ int __meminit init_per_zone_wmark_min(void)
>  	int new_min_free_kbytes;
>
>  	lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
> -	new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
> +	new_min_free_kbytes = lowmem_kbytes * 2 / 100; /* 2% */
>
>  	if (new_min_free_kbytes > user_min_free_kbytes) {
>  		min_free_kbytes = new_min_free_kbytes;
>  		if (min_free_kbytes < 128)
>  			min_free_kbytes = 128;
> -		if (min_free_kbytes > 65536)
> -			min_free_kbytes = 65536;
> +		if (min_free_kbytes > 4194304)
> +			min_free_kbytes = 4194304;
>  	} else {
>  		pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
>  				new_min_free_kbytes, user_min_free_kbytes);
>
> .
>
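
For reference, the effect of the formula change can be sketched in user space.
The snippet below is only an illustration, not kernel code: the function names
are made up for this example, isqrt_ull() is an assumed stand-in for the
kernel's int_sqrt(), and the user_min_free_kbytes check from
init_per_zone_wmark_min() is left out. It models just the two formulas and
their clamps as they appear in the diff above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for int_sqrt(): floor(sqrt(x)). */
unsigned long long isqrt_ull(unsigned long long x)
{
	unsigned long long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/* Old default: sqrt(lowmem_kbytes * 16), clamped to [128, 65536] kB. */
unsigned long long old_default_min_free_kbytes(unsigned long long lowmem_kbytes)
{
	unsigned long long v = isqrt_ull(lowmem_kbytes * 16);

	if (v < 128)
		v = 128;
	if (v > 65536)
		v = 65536;
	return v;
}

/* New default: 2% of lowmem_kbytes, clamped to [128, 4194304] kB. */
unsigned long long new_default_min_free_kbytes(unsigned long long lowmem_kbytes)
{
	unsigned long long v = lowmem_kbytes * 2 / 100;

	if (v < 128)
		v = 128;
	if (v > 4194304)
		v = 4194304;
	return v;
}

/* Print both defaults side by side for a few lowmem sizes (given in MB). */
void print_comparison(void)
{
	static const unsigned long long sizes_mb[] = { 16, 1024, 16384, 262144 };
	size_t i;

	/* Both formulas agree at 40000 kB: sqrt(640000) == 800 == 2% of 40000. */
	assert(old_default_min_free_kbytes(40000) ==
	       new_default_min_free_kbytes(40000));

	for (i = 0; i < sizeof(sizes_mb) / sizeof(sizes_mb[0]); i++) {
		unsigned long long kb = sizes_mb[i] * 1024;

		printf("%8lluMB: old %8lluk  new %8lluk\n", sizes_mb[i],
		       old_default_min_free_kbytes(kb),
		       new_default_min_free_kbytes(kb));
	}
}
```

Under these assumptions, a node with 16384MB of lowmem goes from 16384k to
335544k, and the 4GB cap is only reached above 200GB of lowmem. Note that in
this sketch the 2% rule yields a smaller value than the sqrt rule when lowmem
is below 40000 kB, so "more than the previous formula" holds only above that
point.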

