[Devel] [PATCH vz9] sched: Do not set LBF_NEED_BREAK flag if scanned all the tasks
Alexander Atanasov
alexander.atanasov at virtuozzo.com
Thu Dec 14 16:39:23 MSK 2023
On 14.12.23 0:01, Konstantin Khorenko wrote:
> After ms commit b0defa7ae03e ("sched/fair: Make sure to try to detach at
> least one movable task") detach_tasks() no longer stops on the condition
> (env->loop > env->loop_max) if no movable task has been found.
>
> Instead, if there are no movable tasks in the rq, it always exits on the
> loop_break check, and thus with the LBF_NEED_BREAK flag set.
>
> This is not a problem for mainstream, because when LBF_NEED_BREAK is set
> load_balance() continues balancing only if (env.loop <
> busiest->nr_running). In the Virtuozzo kernel with the CFS_CPULIMIT
> feature, however, right before that check we try to move a whole task
> group (the object of the CFS_CPULIMIT feature) and reset env.loop to
> zero, so we get a livelock here.
>
> Resetting env.loop makes sense in case we have successfully moved some
> tasks (in the scope of the task group), but if we failed to move any
> task, no progress is expected during further balancing attempts.
>
> Ways to fix it:
> 1. In load_balance() restore old env.loop in case no tasks were moved
> by move_task_groups()
> 2. Add a check in detach_tasks() to exit without LBF_NEED_BREAK flag in
> case we have scanned all tasks and have not found movable tasks.
>
> The current patch implements the second way.
>
> Caused by ms commit: b0defa7ae03e ("sched/fair: Make sure to try to
> detach at least one movable task")
>
> Fixes ms commit: bca010328248 ("sched: Port CONFIG_CFS_CPULIMIT feature")
>
> Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
>
> Feature: sched: ability to limit number of CPUs available to a CT
> ---
> kernel/sched/fair.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9367d16a8d85..e068bb90f197 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8771,6 +8771,12 @@ static int detach_tasks(struct lb_env *env)
>  		if (env->loop > env->loop_max &&
>  		    !(env->flags & LBF_ALL_PINNED))
>  			break;
> +		/*
> +		 * Quit if we have scanned all tasks even in case we haven't
> +		 * found any movable task.
> +		 */
> +		if (env->loop > env->src_rq->nr_running)
> +			break;
>
>  		/* take a breather every nr_migrate tasks */
>  		if (env->loop > env->loop_break) {
LGTM.
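
For illustration only, the livelock described in the commit message can be
modeled in user space. The sketch below is not kernel code: the struct, the
helper names (lb_env_model, scan_no_movable, balance_passes) and the pass cap
are hypothetical, the reset of env.loop stands in for the Virtuozzo
move_task_groups() path as the commit message describes it, and
NR_MIGRATE_BREAK is assumed to be 32. It assumes that no task is movable, so
LBF_ALL_PINNED stays set for the whole run.

```c
#include <assert.h>
#include <stdbool.h>

#define LBF_ALL_PINNED   0x01
#define LBF_NEED_BREAK   0x02
#define NR_MIGRATE_BREAK 32   /* assumed stand-in for the kernel's break step */

/* Hypothetical, reduced model of the fields of struct lb_env used here. */
struct lb_env_model {
	unsigned int flags;
	unsigned int loop;
	unsigned int loop_break;
	unsigned int loop_max;
	unsigned int nr_running;  /* stands in for env->src_rq->nr_running */
};

/* One detach_tasks()-like scan in which no task is ever movable. */
static void scan_no_movable(struct lb_env_model *env, bool with_fix)
{
	for (;;) {
		env->loop++;
		/* Never taken in this model: LBF_ALL_PINNED stays set. */
		if (env->loop > env->loop_max &&
		    !(env->flags & LBF_ALL_PINNED))
			break;
		/* The check added by the patch: exit without NEED_BREAK. */
		if (with_fix && env->loop > env->nr_running)
			break;
		/* take a breather every NR_MIGRATE_BREAK tasks */
		if (env->loop > env->loop_break) {
			env->loop_break += NR_MIGRATE_BREAK;
			env->flags |= LBF_NEED_BREAK;
			break;
		}
		/* The task was examined and skipped; look at the next one. */
	}
}

/*
 * load_balance()-like retry loop. Returns the number of passes needed to
 * stop, or -1 if max_passes is reached (treated as a livelock here).
 * reset_loop models the assumed Virtuozzo behaviour of zeroing env.loop
 * after a failed task group move.
 */
static int balance_passes(unsigned int nr_running, bool reset_loop,
			  bool with_fix, int max_passes)
{
	struct lb_env_model env = {
		.flags = LBF_ALL_PINNED,
		.loop = 0,
		.loop_break = NR_MIGRATE_BREAK,
		.loop_max = nr_running,
		.nr_running = nr_running,
	};

	assert(max_passes > 0);
	for (int pass = 1; pass <= max_passes; pass++) {
		scan_no_movable(&env, with_fix);
		if (reset_loop)
			env.loop = 0;
		if (!(env.flags & LBF_NEED_BREAK))
			return pass;
		env.flags &= ~LBF_NEED_BREAK;
		/* Mainstream stops here once all tasks have been tried. */
		if (env.loop >= env.nr_running)
			return pass;
	}
	return -1;
}
```

In this model, balance_passes(3, false, false, 100) stops after the first
pass (the mainstream case), balance_passes(3, true, false, 100) hits the pass
cap (the livelock), and balance_passes(3, true, true, 100) stops after the
first pass again once the new check is in place.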
--
Regards,
Alexander Atanasov