[Devel] [PATCH 07/17] Port diff-sched-introduce-cond_resched_may_throttle

Vladimir Davydov vdavydov at parallels.com
Thu Sep 3 03:14:43 PDT 2015


On Thu, Sep 03, 2015 at 11:56:57AM +0300, Kirill Tkhai wrote:
...
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 167d0f66387b..49a93569b5e6 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -965,7 +965,8 @@ static inline void update_entity_boost(struct sched_entity *se)
> >  					p->woken_while_running;
> >  			p->woken_while_running = 0;
> >  		} else
> > -			se->boosted = sched_feat(BOOST_PREEMPT);
> > +			se->boosted = sched_feat(BOOST_PREEMPT) &&
> > +					!p->may_throttle;
> 
> This function is unsafe, because there is case where put_prev_task() is made
> before dequeue_task() in scheduler. This lead to we clear boosted flag and leave
> se linked in boosted list, while it mustn't be there.

AFAICS the only such a place is cpu offline handling, so this is an
extremely rare case. Anyway, it's a separate issue, which has to be
fixed separately.

> 
> I suggest to unconditionally dequeue bootsted entities, where you do that,
> without checking for entity_boosted(). Of course, you need to use list_del_init()
> instead of plain list_del(). In this case you will be able to change se->boosted
> flag like you want in this function.

I'd prefer to just skip updating ->boosted flag when p != current. In
fact, when this code was initially written I simply overlooked the fact
that update_entity_boost may be called from another task's context (like
from setprio). Doing this is completely wrong, because checking
preempt_count or altering a bit field on task_struct must only be done
from the current context.

Here it goes:
---
From: Vladimir Davydov <vdavydov at parallels.com>
Date: Thu, 3 Sep 2015 12:46:53 +0300
Subject: [PATCH] sched/fair: only update current task boost

put_prev_task, which calls update_entity_boost, may be invoked for prev
!= current (e.g. on setprio). In rare cases (e.g. cpu offline), this is
done before dequeue_task. As a result, a task might not get removed from
the list of boosted entity on dequeue, leading to a crash.

Anyway, it only makes sense to call update_entity_boost on current task,
because it relies on preempt_count and flips task_struct bit fields. So
insert the appropriate check to this function in order to fix this
issue.

Reported-by: Kirill Tkhai <ktkhai at odin.com>
Signed-off-by: Vladimir Davydov <vdavydov at parallels.com>

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 49a93569b5e6..f7e8ba7df643 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -960,6 +960,9 @@ static inline void update_entity_boost(struct sched_entity *se)
 	else {
 		struct task_struct *p = task_of(se);
 
+		if (unlikely(p != current))
+			return;
+
 		if (!(preempt_count() & PREEMPT_ACTIVE)) {
 			se->boosted = sched_feat(BOOST_WAKEUPS) &&
 					p->woken_while_running;



More information about the Devel mailing list