[Devel] [PATCH vz9] sched: Do not set LBF_NEED_BREAK flag if scanned all the tasks

Alexander Atanasov alexander.atanasov at virtuozzo.com
Thu Dec 14 16:39:23 MSK 2023


On 14.12.23 0:01, Konstantin Khorenko wrote:
> After ms commit b0defa7ae03e ("sched/fair: Make sure to try to detach at
> least one movable task") detach_tasks() does not stop on the condition
> (env->loop > env->loop_max) in case no movable task found.
> 
> Instead of that (if there are no movable tasks in the rq) exits always
> happen on the loop_break check - thus with LBF_NEED_BREAK flag set.
> 
> It's not a problem for mainstream because load_balance() proceeds with
> balancing in case LBF_NEED_BREAK is set only in case (env.loop <
> busiest->nr_running), but in Virtuozzo kernel with CFS_CPULIMIT feature
> right before that we try to move a whole task group (object of the
> CFS_CPULIMIT feature) and resets env.loop to zero,
> so we get a livelock here.
> 
> Resetting env.loop makes sense in case we have successfully moved some
> tasks (in the scope of the task group), but if we failed to move any
> task, no progress is expected during further balancing attempts.
> 
> Ways to fix it:
>   1. In load_balance() restore old env.loop in case no tasks were moved
>      by move_task_groups()
>   2. Add a check in detach_tasks() to exit without LBF_NEED_BREAK flag in
>      case we have scanned all tasks and have not found movable tasks.
> 
> Current patch implements the second way.
> 
> Caused by ms commit: b0defa7ae03e ("sched/fair: Make sure to try to
> detach at least one movable task")
> 
> Fixes ms commit: bca010328248 ("sched: Port CONFIG_CFS_CPULIMIT feature")
> 
> Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
> 
> Feature: sched: ability to limit number of CPUs available to a CT
> ---
>   kernel/sched/fair.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9367d16a8d85..e068bb90f197 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8771,6 +8771,12 @@ static int detach_tasks(struct lb_env *env)
>   		if (env->loop > env->loop_max &&
>   		    !(env->flags & LBF_ALL_PINNED))
>   			break;
> +		/*
> +		 * Quit if we have scanned all tasks even in case we haven't
> +		 * found any movable task.
> +		 */
> +		if (env->loop > env->src_rq->nr_running)
> +			break;
>   
>   		/* take a breather every nr_migrate tasks */
>   		if (env->loop > env->loop_break) {


LGTM.


-- 
Regards,
Alexander Atanasov



More information about the Devel mailing list