[Devel] [PATCH rh7 1/2] oom: Do not mark victim a task without mm
Konstantin Khorenko
khorenko at virtuozzo.com
Mon Nov 28 19:17:42 MSK 2022
--
Best regards,
Konstantin Khorenko,
Virtuozzo Linux Kernel Team
On 26.11.2022 23:15, nb wrote:
>
>
> On 23.11.22 at 20:53, Konstantin Khorenko wrote:
>> Currently it's possible to mark a task as a victim even in case it has
>> already cleared its ->mm.
>>
>> This might lead (and leads) to a situation when oom_unlock() believes
>> the OOM context will be released by the "victim" do_exit() ->
>> exit_oom_victim(), but our "victim" already passed the point of calling
>> exit_oom_victim() and thus OOM context is not released.
>>
>> Add additional checks for task->mm in appropriate places, similar checks
>> are applied in mainstream code in the scope of:
>> 1af8bb432695 ("mm, oom: fortify task_will_free_mem()")
>> 091f362c53c2 ("mm, oom: tighten task_will_free_mem() locking")
>>
>> https://jira.sw.ru/browse/PSBM-143283
>>
>> Signed-off-by: Denis Lunev <den at virtuozzo.com>
>> Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
>> ---
>> include/linux/oom.h | 14 ++++++++++++++
>> mm/memcontrol.c | 7 ++++++-
>> 2 files changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/oom.h b/include/linux/oom.h
>> index 3a6e073a5dd4..ef0096799ee3 100644
>> --- a/include/linux/oom.h
>> +++ b/include/linux/oom.h
>> @@ -125,8 +125,22 @@ static inline void oom_killer_enable(void)
>>
>> extern struct task_struct *find_lock_task_mm(struct task_struct *p);
>>
>> +/*
>> + * Caller has to make sure that task->mm is stable (hold task_lock or
>> + * it operates on the current).
>
> nit:
>
> Instead of spelling this in comment I find it more robust if such
> conditions are enforced via lockdep. For example :
>
> lockdep_assert(lockdep_is_held(l) != LOCK_STATE_NOT_HELD || task == current)
Thank you for the nit; it is addressed in
[PATCH rh7 4/4] oom: Cast the lockdep spell in the code
> Of course that's only helpful if we run tests with lockdep enabled
> (which I don't know if we do ? )
Sure, we do build debug kernels (with lockdep enabled) and test them.
>> + */
>> static inline bool task_will_free_mem(struct task_struct *task)
>> {
>> + struct mm_struct *mm = task->mm;
>> +
>> + /*
>> + * Skip tasks without mm because it might have passed its exit_mm and
>> + * exit_oom_victim. oom_reaper could have rescued that but do not rely
>> + * on that for now. We can consider find_lock_task_mm in future.
>> + */
>> + if (!mm)
>> + return false;
>> +
>> /*
>> * A coredumping process may sleep for an extended period in exit_mm(),
>> * so the oom killer cannot assume that the process will promptly exit
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index fdc5245e48a9..7135306c6ac0 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -2492,8 +2492,13 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>> * If current has a pending SIGKILL or is exiting, then automatically
>> * select it. The goal is to allow it to allocate so that it may
>> * quickly exit and free its memory.
>> + *
>> + * But don't select if current has already released its mm at
>> + * exit_mm(), otherwise we might skip exit_oom_victim() and
>> + * thus OOM context won't be released.
>> */
>> - if (fatal_signal_pending(current) || task_will_free_mem(current)) {
>> + if (current->mm &&
>> + (fatal_signal_pending(current) || task_will_free_mem(current))) {
>> mark_oom_victim(current);
>> return;
>> }
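
For illustration only, here is a minimal userspace sketch of the two checks
the patch adds. It is not kernel code: struct mm_model, struct task_model and
every *_model function are hypothetical stand-ins, and the "exiting" and
"fatal_signal" flags only approximate the real task state. The assert() models
the invariant the patch is after (never mark a task that has no mm); it is not
something the patch itself adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct mm_model {
	int unused;
};

struct task_model {
	struct mm_model *mm;	/* NULL once the task has passed exit_mm() */
	bool exiting;		/* rough model of "task is exiting" */
	bool fatal_signal;	/* rough model of a pending SIGKILL */
	bool oom_victim;	/* set when the task is marked as a victim */
};

/*
 * Models the patched task_will_free_mem(): a task without mm is never
 * reported as "will free memory", because it may already be past the
 * point where it would release the OOM context.
 */
static bool task_will_free_mem_model(const struct task_model *t)
{
	if (!t->mm)
		return false;
	return t->exiting;
}

/* Models marking a victim; the assert encodes "victims must have an mm". */
static void mark_oom_victim_model(struct task_model *t)
{
	assert(t->mm != NULL);
	t->oom_victim = true;
}

/*
 * Models the patched check in mem_cgroup_out_of_memory(): current is
 * selected only if it still has an mm. Returns true if it was marked.
 */
static bool try_select_current_model(struct task_model *cur)
{
	if (cur->mm &&
	    (cur->fatal_signal || task_will_free_mem_model(cur))) {
		mark_oom_victim_model(cur);
		return true;
	}
	return false;
}
```

In this model, a task with a pending fatal signal but mm == NULL is not
selected, so nothing is left waiting for an exit_oom_victim() call that
will never come.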