[Devel] [PATCH] mm: Change formula of calculation of default min_free_kbytes
Vasily Averin
vvs at virtuozzo.com
Wed Dec 6 18:24:37 MSK 2017
It changes an __init function.
On 2017-12-06 18:19, Konstantin Khorenko wrote:
> Please consider releasing this as an RK (ReadyKernel) patch after ~2 weeks of testing.
>
> https://readykernel.com/
>
> --
> Best regards,
>
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
>
> On 11/08/2017 12:27 PM, Kirill Tkhai wrote:
>> The min_free_kbytes parameter controls the per-zone watermarks. It is
>> used to calculate the amount of free memory in a zone below which
>> direct reclaim starts and becomes throttled (the calling task sleeps).
>>
>> This patch sets the default min_free_kbytes to 2% of available
>> physical memory, but not more than 4GB. This is more than the
>> previous formula gave (it was a sqrt). Here is why we need that.
>>
>> We ran into a situation where intense disk writes inside a CT
>> on a node with very little free memory may lead to a state in
>> which almost all tasks are spinning in direct reclaim. The tasks
>> can't reclaim effectively, as the generated dirty pages are written
>> and released by ploop threads, and thus the tasks are in practice
>> just busy-looping. The ploop threads can't reclaim effectively
>> either, as the processors are occupied by the busy-looping tasks
>> and the threads also need free pages to do that. So the system is
>> looping and becomes very slow and unresponsive.
>>
>> https://jira.sw.ru/browse/PSBM-69296
>>
>> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
>> ---
>> mm/page_alloc.c | 27 +++------------------------
>> 1 file changed, 3 insertions(+), 24 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 137d1d86ddf..2108034bd80 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6399,27 +6399,6 @@ void setup_per_zone_wmarks(void)
>>
>> /*
>> * Initialise min_free_kbytes.
>> - *
>> - * For small machines we want it small (128k min). For large machines
>> - * we want it large (64MB max). But it is not linear, because network
>> - * bandwidth does not increase linearly with machine size. We use
>> - *
>> - * min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy:
>> - * min_free_kbytes = sqrt(lowmem_kbytes * 16)
>> - *
>> - * which yields
>> - *
>> - * 16MB: 512k
>> - * 32MB: 724k
>> - * 64MB: 1024k
>> - * 128MB: 1448k
>> - * 256MB: 2048k
>> - * 512MB: 2896k
>> - * 1024MB: 4096k
>> - * 2048MB: 5792k
>> - * 4096MB: 8192k
>> - * 8192MB: 11584k
>> - * 16384MB: 16384k
>> */
>> int __meminit init_per_zone_wmark_min(void)
>> {
>> @@ -6427,14 +6406,14 @@ int __meminit init_per_zone_wmark_min(void)
>> int new_min_free_kbytes;
>>
>> lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
>> - new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
>> + new_min_free_kbytes = lowmem_kbytes * 2 / 100; /* 2% */
>>
>> if (new_min_free_kbytes > user_min_free_kbytes) {
>> min_free_kbytes = new_min_free_kbytes;
>> if (min_free_kbytes < 128)
>> min_free_kbytes = 128;
>> - if (min_free_kbytes > 65536)
>> - min_free_kbytes = 65536;
>> + if (min_free_kbytes > 4194304)
>> + min_free_kbytes = 4194304;
>> } else {
>> pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
>> new_min_free_kbytes, user_min_free_kbytes);
>>
>> .
>>
>
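
For comparison, the two default formulas from the quoted patch can be
sketched in userspace C. This is only an illustration, not kernel code:
the helper names (isqrt_u64, clamp_kbytes, old_default_min_free_kbytes,
new_default_min_free_kbytes, print_comparison) are made up for this
example, isqrt_u64 stands in for the kernel's int_sqrt(), and the
comparison against user_min_free_kbytes is left out.

```c
#include <stdio.h>

/* Integer square root (floor); a userspace stand-in for int_sqrt(). */
static unsigned long long isqrt_u64(unsigned long long x)
{
	unsigned long long result = 0;
	unsigned long long bit = 1ULL << 62;

	while (bit > x)
		bit >>= 2;
	while (bit != 0) {
		if (x >= result + bit) {
			x -= result + bit;
			result = (result >> 1) + bit;
		} else {
			result >>= 1;
		}
		bit >>= 2;
	}
	return result;
}

/* Clamp a value in kB to the range [lo, hi]. */
static unsigned long long clamp_kbytes(unsigned long long v,
				       unsigned long long lo,
				       unsigned long long hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Old default: sqrt(lowmem_kbytes * 16), limited to 128k..64MB. */
static unsigned long long old_default_min_free_kbytes(unsigned long long lowmem_kbytes)
{
	return clamp_kbytes(isqrt_u64(lowmem_kbytes * 16), 128, 65536);
}

/* New default from the patch: 2% of lowmem, limited to 128k..4GB. */
static unsigned long long new_default_min_free_kbytes(unsigned long long lowmem_kbytes)
{
	return clamp_kbytes(lowmem_kbytes * 2 / 100, 128, 4194304);
}

/* Print both defaults for a few example lowmem sizes given in MB. */
static void print_comparison(void)
{
	static const unsigned long long sizes_mb[] = { 16, 1024, 16384, 262144 };
	size_t i;

	for (i = 0; i < sizeof(sizes_mb) / sizeof(sizes_mb[0]); i++) {
		unsigned long long kb = sizes_mb[i] * 1024;

		printf("%lluMB: old %lluk, new %lluk\n", sizes_mb[i],
		       old_default_min_free_kbytes(kb),
		       new_default_min_free_kbytes(kb));
	}
}
```

Calling print_comparison() from a small main() shows how much faster the
2% rule grows than the sqrt rule as lowmem increases, and where each cap
takes over.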