[Devel] [PATCH vz9 2/2] sched/fair: use list_del_init() for cfs_rq_node on dequeue
Pavel Tikhomirov
ptikhomirov at virtuozzo.com
Wed Mar 18 13:10:41 MSK 2026
On 3/13/26 20:35, Konstantin Khorenko wrote:
> Use list_del_init() instead of list_del() when removing
> se->cfs_rq_node in account_entity_dequeue(). This mirrors
> the existing pattern used for se->group_node on the line above.
This comparison with se->group_node is incorrect. We use list_del_init() on se->group_node for a good reason: se->group_node is accessed directly from se, outside of a list walk, whereas se->cfs_rq_node is only accessed through a walk of the cfs_rq->tasks list, so we know for sure that se->cfs_rq_node is always initialized when we access it.
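To make the distinction concrete, here is a rough userspace sketch, not kernel code. The list helpers are simplified stand-ins for the kernel's, the poison values are placeholders, and struct entity, entity_is_queued(), count_entries() and demo_dequeue() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Placeholder poison values; the kernel's LIST_POISON1/2 are arch-dependent. */
#define POISON1 ((struct list_head *)0x100)
#define POISON2 ((struct list_head *)0x122)

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void unlink_entry(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Like list_del(): unlink and poison the node. */
static void list_del(struct list_head *e)
{
	unlink_entry(e);
	e->next = POISON1;
	e->prev = POISON2;
}

/* Like list_del_init(): unlink and make the node point to itself. */
static void list_del_init(struct list_head *e)
{
	unlink_entry(e);
	init_list_head(e);
}

/* Hypothetical entity with two nodes, loosely modeled on sched_entity. */
struct entity {
	struct list_head group_node;  /* inspected directly through the entity */
	struct list_head cfs_rq_node; /* only reached by walking the list head */
};

/* Direct check on the node: only meaningful if an off-list node is self-linked. */
static bool entity_is_queued(const struct entity *e)
{
	return !list_empty(&e->group_node);
}

/* Walk from the head: never touches a node that was already unlinked. */
static int count_entries(const struct list_head *head)
{
	int n = 0;

	for (const struct list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Returns true if both access patterns give the right answer after dequeue. */
bool demo_dequeue(void)
{
	struct list_head group_head, tasks_head;
	struct entity e;

	init_list_head(&group_head);
	init_list_head(&tasks_head);
	list_add(&e.group_node, &group_head);
	list_add(&e.cfs_rq_node, &tasks_head);
	assert(count_entries(&tasks_head) == 1);

	list_del_init(&e.group_node); /* direct check stays meaningful */
	list_del(&e.cfs_rq_node);     /* poisoned, but unreachable by a walk */

	return !entity_is_queued(&e) && count_entries(&tasks_head) == 0;
}
```

In this sketch the walk over tasks_head gives the right answer even though cfs_rq_node is poisoned, while a direct list_empty() check on a node removed with plain list_del() would wrongly report it as still queued.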
>
> list_del() poisons the prev/next pointers with LIST_POISON values.
> If the sched_entity is later accessed after the cfs_rq is freed
> (e.g. due to a stale timer or other use-after-free scenario), the
> poisoned pointers cause an immediate hard fault. While this is
> useful for debugging, it makes recovery impossible.
But we don't access se->cfs_rq_node from the timer handler.
>
> list_del_init() reinitializes the node to point to itself, so
> list_empty() checks on the freed node return true rather than
> dereferencing poisoned memory. This provides a safer default and
> makes the active_timer callback's list_empty(&cfs_rq->tasks)
> check return a benign result even in error scenarios.
>
> This is a defense-in-depth hardening complementary to the
> active_timer cancellation fix.
I think this patch is unnecessary. If we have pre-existing memory corruption, we don't really want to recover from it; we want to detect it. If we somehow end up seeing poisoned pointers in the list, we at least get a kernel warning about it, which can help us debug the issue, instead of silently hiding it behind a reinitialized list.
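A small userspace sketch of the detection argument, under the assumption of a kernel built with CONFIG_DEBUG_LIST, where list deletion validates the entry and reports poisoned pointers. The helpers are simplified stand-ins, the poison values are placeholders, and checked_unlink() and double_delete_detected() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Placeholder poison values; the kernel's LIST_POISON1/2 are arch-dependent. */
#define POISON1 ((struct list_head *)0x100)
#define POISON2 ((struct list_head *)0x122)

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/*
 * Checked unlink, loosely modeled on the validation a debug-list kernel
 * performs: refuse to unlink a node whose pointers carry poison values.
 */
static bool checked_unlink(struct list_head *e)
{
	if (e->next == POISON1 || e->prev == POISON2)
		return false; /* stale node detected; the kernel would warn here */
	e->prev->next = e->next;
	e->next->prev = e->prev;
	return true;
}

/* Deletes the same node twice; returns true if the second delete is caught. */
bool double_delete_detected(bool reinit)
{
	struct list_head head, node;
	bool first;

	init_list_head(&head);
	list_add(&node, &head);

	first = checked_unlink(&node);
	assert(first);
	(void)first;

	if (reinit) {
		init_list_head(&node); /* list_del_init() behavior */
	} else {
		node.next = POISON1;   /* list_del() behavior */
		node.prev = POISON2;
	}

	/* The second delete is the bug we would like to hear about. */
	return !checked_unlink(&node);
}
```

Here double_delete_detected(false) reports the stale delete, while double_delete_detected(true) lets it pass silently, because a reinitialized node looks like a valid empty list.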
>
> https://virtuozzo.atlassian.net/browse/VSTOR-126785
>
> Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
>
> Feature: sched: ability to limit number of CPUs available to a CT
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9b0fe4c8a272f..8ed4cfa0dc83e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3298,7 +3298,7 @@ account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  		account_numa_dequeue(rq_of(cfs_rq), task_of(se));
>  		list_del_init(&se->group_node);
>  #ifdef CONFIG_CFS_CPULIMIT
> -		list_del(&se->cfs_rq_node);
> +		list_del_init(&se->cfs_rq_node);
>  #endif
>  	}
>  #endif
--
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.