[Devel] [PATCH] mm: Change formula of calculation of default min_free_kbytes

Vasily Averin vvs at virtuozzo.com
Wed Dec 6 18:24:37 MSK 2017


It changes an __init function.

On 2017-12-06 18:19, Konstantin Khorenko wrote:
> Please consider RK-ing this after ~2 weeks of testing.
> 
> https://readykernel.com/
> 
> -- 
> Best regards,
> 
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
> 
> On 11/08/2017 12:27 PM, Kirill Tkhai wrote:
>> The min_free_kbytes parameter controls the per-zone watermarks. It is
>> used to calculate the per-zone free memory threshold below which direct
>> reclaim starts and is throttled (the calling task sleeps).
>>
>> This patch sets the default min_free_kbytes to 2% of available
>> physical memory, but not more than 4GB. This is more than the
>> previous formula gave (it was a sqrt). Why do we need that?
>>
>> We ran into a situation where intense disk writes inside a CT
>> on a node with very little free memory may lead to a state
>> where almost all tasks are spinning in direct reclaim. The tasks
>> can't reclaim effectively, as the generated dirty pages are written
>> and released by ploop threads, so in practice the tasks
>> are just busy-looping. The ploop threads can't reclaim effectively
>> either, as the processors are occupied by the busy-looping tasks
>> and the threads also need free pages to do that. So the system is
>> looping and becomes very slow and unresponsive.
>>
>> https://jira.sw.ru/browse/PSBM-69296
>>
>> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
>> ---
>>  mm/page_alloc.c |   27 +++------------------------
>>  1 file changed, 3 insertions(+), 24 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 137d1d86ddf..2108034bd80 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6399,27 +6399,6 @@ void setup_per_zone_wmarks(void)
>>
>>  /*
>>   * Initialise min_free_kbytes.
>> - *
>> - * For small machines we want it small (128k min).  For large machines
>> - * we want it large (64MB max).  But it is not linear, because network
>> - * bandwidth does not increase linearly with machine size.  We use
>> - *
>> - *     min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy:
>> - *    min_free_kbytes = sqrt(lowmem_kbytes * 16)
>> - *
>> - * which yields
>> - *
>> - * 16MB:    512k
>> - * 32MB:    724k
>> - * 64MB:    1024k
>> - * 128MB:    1448k
>> - * 256MB:    2048k
>> - * 512MB:    2896k
>> - * 1024MB:    4096k
>> - * 2048MB:    5792k
>> - * 4096MB:    8192k
>> - * 8192MB:    11584k
>> - * 16384MB:    16384k
>>   */
>>  int __meminit init_per_zone_wmark_min(void)
>>  {
>> @@ -6427,14 +6406,14 @@ int __meminit init_per_zone_wmark_min(void)
>>      int new_min_free_kbytes;
>>
>>      lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
>> -    new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
>> +    new_min_free_kbytes = lowmem_kbytes * 2 / 100; /* 2% */
>>
>>      if (new_min_free_kbytes > user_min_free_kbytes) {
>>          min_free_kbytes = new_min_free_kbytes;
>>          if (min_free_kbytes < 128)
>>              min_free_kbytes = 128;
>> -        if (min_free_kbytes > 65536)
>> -            min_free_kbytes = 65536;
>> +        if (min_free_kbytes > 4194304)
>> +            min_free_kbytes = 4194304;
>>      } else {
>>          pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
>>                  new_min_free_kbytes, user_min_free_kbytes);
>>
>> .
>>
> 
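
For comparing the two defaults outside the kernel, here is a rough
userspace sketch of the old and new calculations from the patch above.
The names isqrt_u64(), clamp_kbytes(), old_min_free_kbytes() and
new_min_free_kbytes() are made up for this illustration and are not
kernel symbols. It assumes lowmem_kbytes is passed in directly (in the
kernel it comes from nr_free_buffer_pages()), and it ignores the
user_min_free_kbytes check.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's int_sqrt(): returns floor(sqrt(x)). */
static uint64_t isqrt_u64(uint64_t x)
{
	uint64_t r = 0;
	uint64_t bit = (uint64_t)1 << 62;

	/* Start from the highest power of four that is <= x. */
	while (bit > x)
		bit >>= 2;

	while (bit != 0) {
		if (x >= r + bit) {
			x -= r + bit;
			r = (r >> 1) + bit;
		} else {
			r >>= 1;
		}
		bit >>= 2;
	}
	return r;
}

/* Limit v to the range [lo, hi]. */
static uint64_t clamp_kbytes(uint64_t v, uint64_t lo, uint64_t hi)
{
	assert(lo <= hi);
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Old default: sqrt(lowmem_kbytes * 16), limited to 128k..64MB. */
uint64_t old_min_free_kbytes(uint64_t lowmem_kbytes)
{
	return clamp_kbytes(isqrt_u64(lowmem_kbytes * 16), 128, 65536);
}

/* New default: 2% of lowmem_kbytes, limited to 128k..4GB. */
uint64_t new_min_free_kbytes(uint64_t lowmem_kbytes)
{
	return clamp_kbytes(lowmem_kbytes * 2 / 100, 128, 4194304);
}
```

With 16GB of lowmem (16777216k) the old formula gives 16384k and the
new one gives 335544k. The new 4GB cap is reached at 200GB of lowmem.
On very small machines the 2% value is actually smaller than the sqrt
one, e.g. 327k instead of 512k at 16MB.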

