[Devel] [PATCH 10/17] oom: boost dying tasks on global oom
Kirill Tkhai
ktkhai at odin.com
Thu Sep 3 05:11:22 PDT 2015
On 03.09.2015 14:06, Kirill Tkhai wrote:
>
>
> On 03.09.2015 13:13, Vladimir Davydov wrote:
>> On Thu, Sep 03, 2015 at 01:09:36PM +0300, Kirill Tkhai wrote:
>>>
>>>
>>> On 14.08.2015 20:03, Vladimir Davydov wrote:
>>>> If an oom victim process has a low priority (via nice or the cpu cgroup), it
>>>> may take a very long time to exit, which is bad, because the system cannot
>>>> make progress until it dies. To avoid that, this patch makes the oom killer
>>>> raise the victim task's priority to the highest possible.
>>>>
>>>> It might be worth submitting this patch upstream. I will probably try.
>>>>
>>>> Signed-off-by: Vladimir Davydov <vdavydov at parallels.com>
>>>> ---
>>>> mm/oom_kill.c | 17 +++++++++++++++--
>>>> 1 file changed, 15 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>>>> index 0e6f7535a565..ca765a82fa1a 100644
>>>> --- a/mm/oom_kill.c
>>>> +++ b/mm/oom_kill.c
>>>> @@ -294,6 +294,15 @@ enum oom_scan_t oom_scan_process_thread(struct task_struct *task,
>>>>  	return OOM_SCAN_OK;
>>>>  }
>>>>
>>>> +static void boost_dying_task(struct task_struct *p)
>>>> +{
>>>> +	/*
>>>> +	 * Set the dying task scheduling priority to the highest possible so
>>>> +	 * that it will die quickly irrespective of its scheduling policy.
>>>> +	 */
>>>> +	sched_boost_task(p, 0);
>>>> +}
>>>> +
>>>>  /*
>>>>   * Simple selection loop. We chose the process with the highest
>>>>   * number of 'points'.
>>>> @@ -321,6 +330,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
>>>>  		case OOM_SCAN_CONTINUE:
>>>>  			continue;
>>>>  		case OOM_SCAN_ABORT:
>>>> +			boost_dying_task(p);
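>>>
>>> For reference, sched_boost_task() is not defined in this patch (the diffstat only
>>> touches mm/oom_kill.c), so presumably it is introduced earlier in the series. A
>>> minimal sketch of what such a helper might look like, modeled on the
>>> boost_dying_task_prio() helper that briefly existed upstream before being
>>> reverted; the body below is an assumption, not the actual implementation:
>>>
>>>	static void sched_boost_task(struct task_struct *p, int prio)
>>>	{
>>>		/* Assume prio is an offset from the top RT priority, so 0 = highest. */
>>>		struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 - prio };
>>>
>>>		/*
>>>		 * Move the task to SCHED_FIFO so it preempts all SCHED_NORMAL
>>>		 * tasks and can release its memory as soon as possible.  The
>>>		 * _nocheck variant skips permission checks, which is appropriate
>>>		 * when called from the oom killer rather than on behalf of a user.
>>>		 */
>>>		if (!rt_task(p))
>>>			sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
>>>	}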
>>>
>>> Calling boost_dying_task() in the OOM_SCAN_ABORT path is a potential livelock:
>>> you are holding at least the try_set_zonelist_oom() bits, and a concurrent thread
>>> may be allocating with GFP_NOFAIL in __alloc_pages_slowpath(). In that case it
>>> will loop forever.
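>>>
>>> Roughly, the scenario I mean (a simplified sketch, not the exact code):
>>>
>>>	/* CPU 0: oom killer path, holding the zonelist OOM bits */
>>>	try_set_zonelist_oom(zonelist, high_zoneidx);
>>>	select_bad_process(...);	/* may sleep with the bits still held */
>>>	...
>>>	clear_zonelist_oom(zonelist, gfp_mask);
>>>
>>>	/* CPU 1: a GFP_NOFAIL allocation in __alloc_pages_slowpath() */
>>>	do {
>>>		page = get_page_from_freelist(...);
>>>		/*
>>>		 * The oom killer cannot be entered while CPU 0 holds the
>>>		 * bits, so with __GFP_NOFAIL this simply retries, and
>>>		 * nothing guarantees CPU 0 ever runs again to clear them.
>>>		 */
>>>	} while (!page);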
>>
>> It won't. There are schedule_timeout()s all over the place. Besides, if
>> try_set_zonelist_oom() fails, the caller will call schedule_timeout().
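>>
>> That is, from __alloc_pages_may_oom() in kernels of this era (lightly trimmed):
>>
>>	/* Acquire the per-zonelist OOM lock bits */
>>	if (!try_set_zonelist_oom(zonelist, high_zoneidx)) {
>>		/* Someone else is already handling OOM here; back off briefly. */
>>		schedule_timeout_uninterruptible(1);
>>		return NULL;
>>	}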
>
> Really? What if a victim has the signal_pending() flag set?
>
> Even if it doesn't, you can't rely on schedule_timeout(): there is no guarantee the
> lock holder will ever be chosen for execution at all.
Ah, schedule_timeout_uninterruptible() is there. So it's OK. But there are still no guarantees...
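
The distinction matters: schedule_timeout_killable() puts the task in TASK_KILLABLE,
so if a fatal signal is already pending it returns immediately without sleeping,
degenerating into a busy loop, whereas schedule_timeout_uninterruptible() always
sleeps for the full timeout:

	/* With a fatal signal already pending in the caller: */
	schedule_timeout_killable(1);		/* returns at once, no sleep */
	schedule_timeout_uninterruptible(1);	/* sleeps one tick regardless */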
>>>
>>> Furthermore, you call schedule_timeout_killable() manually in out_of_memory(), so this
>>> is a problem on !PREEMPTIBLE kernels too.
>>
>> I don't get this sentence. What's the problem?
>
> It's a clarification of the main problem, showing that it affects us too.
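>
> The call I mean is at the tail of out_of_memory() (from kernels of this era):
>
>	/*
>	 * Give the killed threads a good chance of exiting before trying
>	 * to allocate memory again.
>	 */
>	if (killed)
>		schedule_timeout_killable(1);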
>
>>>
>>> You mustn't leave the processor before you've cleared the bits.
>>
>> Wrong, see above.
>>