[Devel] [PATCH 10/17] oom: boost dying tasks on global oom
Kirill Tkhai
ktkhai at odin.com
Thu Sep 3 04:06:08 PDT 2015
On 03.09.2015 13:13, Vladimir Davydov wrote:
> On Thu, Sep 03, 2015 at 01:09:36PM +0300, Kirill Tkhai wrote:
>>
>>
>> On 14.08.2015 20:03, Vladimir Davydov wrote:
>>> If an oom victim process has a low priority (via nice or the cpu
>>> cgroup), it may take a very long time to exit, which is bad, because
>>> the system cannot make progress until it dies. To avoid that, this
>>> patch makes the oom killer raise the victim task's priority to the
>>> highest possible.
>>>
>>> It might be worth submitting this patch upstream. I will probably try.
>>>
>>> Signed-off-by: Vladimir Davydov <vdavydov at parallels.com>
>>> ---
>>> mm/oom_kill.c | 17 +++++++++++++++--
>>> 1 file changed, 15 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>>> index 0e6f7535a565..ca765a82fa1a 100644
>>> --- a/mm/oom_kill.c
>>> +++ b/mm/oom_kill.c
>>> @@ -294,6 +294,15 @@ enum oom_scan_t oom_scan_process_thread(struct task_struct *task,
>>>  	return OOM_SCAN_OK;
>>>  }
>>>
>>> +static void boost_dying_task(struct task_struct *p)
>>> +{
>>> +	/*
>>> +	 * Set the dying task scheduling priority to the highest possible so
>>> +	 * that it will die quickly irrespective of its scheduling policy.
>>> +	 */
>>> +	sched_boost_task(p, 0);
>>> +}
>>> +
>>>  /*
>>>   * Simple selection loop. We chose the process with the highest
>>>   * number of 'points'.
>>> @@ -321,6 +330,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
>>>  		case OOM_SCAN_CONTINUE:
>>>  			continue;
>>>  		case OOM_SCAN_ABORT:
>>> +			boost_dying_task(p);
>>
>> This is a potential livelock: you are holding at least the bits set by
>> try_set_zonelist_oom(), and a concurrent thread may be doing a GFP_NOFAIL
>> allocation in __alloc_pages_slowpath(). In that case it will loop forever.
>
> It won't. There are schedule_timeout()s all over the place. Besides, if
> try_set_zonelist_oom() fails, the caller will call schedule_timeout().
Really? What if a victim has the signal_pending() flag set?

Even if it doesn't, you can't rely on schedule_timeout(). There is no
guarantee the lock holder will ever be chosen for execution.
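
For reference, schedule_timeout_killable() is essentially just this
(simplified from kernel/timer.c, not a verbatim quote):

signed long __sched schedule_timeout_killable(signed long timeout)
{
        /*
         * TASK_KILLABLE means a pending fatal signal cancels the sleep:
         * signal_pending_state() in schedule() keeps the task runnable,
         * so an oom victim returns from here almost immediately and any
         * loop around this call degenerates into busy-waiting.
         */
        __set_current_state(TASK_KILLABLE);
        return schedule_timeout(timeout);
}

And even when the caller does sleep, nothing forces the scheduler to pick
the low-priority holder of the zonelist oom bits next.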
>>
>> Furthermore, you call schedule_timeout_killable() manually in
>> out_of_memory(), so this is a problem for !PREEMPTIBLE kernels too.
>
> I don't get this sentence. What's the problem?
It's a clarification of the main problem, showing that it affects us as well.
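
To make the pattern concrete, here is a userspace analogy (illustrative
only, NOT kernel code; every name below is a stand-in for the kernel path
named in the comment next to it):

/*
 * Thread A takes a try-bit and then blocks while holding it, like
 * out_of_memory() sleeping in schedule_timeout_killable() with the
 * zonelist oom bits set.  Thread B retries the try-bit forever, like
 * a __GFP_NOFAIL allocation in __alloc_pages_slowpath().  Pin both
 * threads to one CPU and give B the higher priority, and A never runs
 * to clear the bit.  Build with: cc -pthread livelock.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int oom_bit;              /* stands in for ZONE_OOM_LOCKED */

static int try_set_bit(void)            /* ~ try_set_zonelist_oom() */
{
        int expected = 0;

        return atomic_compare_exchange_strong(&oom_bit, &expected, 1);
}

static void *bit_holder(void *arg)      /* ~ the task running the oom killer */
{
        (void)arg;
        try_set_bit();
        sleep(1);                       /* ~ schedule_timeout_killable() */
        atomic_store(&oom_bit, 0);      /* ~ clear_zonelist_oom() */
        return NULL;
}

static void *nofail_allocator(void *arg) /* ~ __GFP_NOFAIL allocation */
{
        (void)arg;
        while (!try_set_bit())
                ;                       /* spins until the holder runs again */
        puts("allocation finally made progress");
        atomic_store(&oom_bit, 0);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, bit_holder, NULL);
        pthread_create(&b, NULL, nofail_allocator, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
}

On an SMP box this terminates after a second, of course; the point is
that nothing in the GFP_NOFAIL retry loop itself guarantees the bit
holder ever gets CPU time.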
>>
>> You mustn't leave the processor before you have cleared the bits.
>
> Wrong, see above.
>