[Devel] [PATCH RH9] loadavg: Fix handling of cfs_rq->nr_unint values

Konstantin Khorenko khorenko at virtuozzo.com
Wed Dec 7 21:02:39 MSK 2022


BTW, do we need a similar patch for vz7 as well?
On 30.11.2022 13:29, Nikolay Borisov wrote:
> Our custom nr_unint logic follows the kernel's one about the global
> nr_uninterruptible one. One invariant that both share is that values
> of nr_unint can go negative when a task is migrated from a CPU.
> What's important is that when summed across all task groups they either result
> in a positive number (we have sleeping tasks in uninterruptible state) or 0
> (no uninterruptible processes).
> 
> Current code is broken in that the math in calc_load_ve always deals
> with unsigned integers values which leads to enormous values of
> 'nr_active' when ->nr_unint is negative on a tg. The proper way to work
> with nr_unint is to always cast it to an signed integer and also make
> nr_active a signed values as well to properly deal with negative values.
> 
> Signed-off-by: Nikolay Borisov <nikolay.borisov at virtuozzo.com>
> https://jira.sw.ru/browse/PSBM-143355
> ---
>   kernel/sched/loadavg.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
> index 8a5fa39b8b7b..c0c6b89a4958 100644
> --- a/kernel/sched/loadavg.c
> +++ b/kernel/sched/loadavg.c
> @@ -114,7 +114,7 @@ extern raw_spinlock_t load_ve_lock;
> 
>   void calc_load_ve(void)
>   {
> -	unsigned long nr_unint, nr_active;
> +	long nr_unint, nr_active;
>   	struct task_group *tg;
>   	int i;
> 
> @@ -134,13 +134,21 @@ void calc_load_ve(void)
>   			 * overhead for activate/deactivate operations. So, we
>   			 * don't account child cgroup unint tasks here.
>   			 */
> -			nr_active += tg->cfs_rq[i]->nr_unint;
> +			nr_active += (int)tg->cfs_rq[i]->nr_unint;
>   #endif
>   #ifdef CONFIG_RT_GROUP_SCHED
>   			nr_active += tg->rt_rq[i]->rt_nr_running;
>   #endif
>   		}
> -		nr_active *= FIXED_1;
> +		/*
> +		 * Unless we screwed somewhere, nr_active must always be >= 0, because
> +		 * we can have no runnable tasks and no taks in uninterruptible sleep
> +		 */
> +		if (WARN_ON_ONCE(nr_active < 0)) {
> +			nr_active = 0;
> +		} else {
> +			nr_active *= FIXED_1;
> +		}
> 
>   		tg->avenrun[0] = calc_load(tg->avenrun[0], EXP_1, nr_active);
>   		tg->avenrun[1] = calc_load(tg->avenrun[1], EXP_5, nr_active);
> --
> 2.31.1
> 
> _______________________________________________
> Devel mailing list
> Devel at openvz.org
> https://lists.openvz.org/mailman/listinfo/devel


More information about the Devel mailing list