[Devel] Re: [ckrm-tech] [PATCH 1/7] containers (V7): Generic container system abstracted from cpusets code
Srivatsa Vaddagiri
vatsa at in.ibm.com
Sat Mar 24 19:28:16 PDT 2007
On Sat, Mar 24, 2007 at 06:41:28PM -0700, Paul Jackson wrote:
> > the following code becomes racy with cpuset_exit() ...
> >
> > atomic_inc(&cs->count);
> > rcu_assign_pointer(tsk->cpuset, cs);
> > task_unlock(tsk);
>
> eh ... so ... ?
>
> I don't know of any sequence where that causes any problem.
>
> Do you see one?
Let's say we had two cpusets CS1 amd CS2 (both different from top_cpuset).
CS1 has just one task T1 in it (CS1->count = 0) while CS2 has no tasks
in it (CS2->count = 0).
Now consider:
--------------------------------------------------------------------
CPU0 (attach_task T1 to CS2) CPU1 (T1 is exiting)
--------------------------------------------------------------------
task_lock(T1);
oldcs = tsk->cpuset;
[oldcs = CS1]
T1->flags & PF_EXITING? (No)
T1->flags = PF_EXITING;
atomic_inc(&CS2->count);
cpuset_exit()
cs = tsk->cpuset; (cs = CS1)
T1->cpuset = CS2;
T1->cpuset = &top_cpuset;
task_unlock(T1);
CS2 has one bogus count now (with no tasks in it), which may prevent it from
being removed/freed forever.
Not just this, continuing further we have more trouble:
--------------------------------------------------------------------
CPU0 (attach_task T1 to CS2) CPU1 (T1 is exiting)
--------------------------------------------------------------------
synchronize_rcu()
atomic_dec(&CS1->count);
[CS1->count = 0]
if atomic_dec_and_test(&oldcs->count))
[CS1->count = -1]
We now have CS1->count negative. Is that good? I am uncomfortable ..
We need a task_lock() in cpuset_exit to avoid this race.
--
Regards,
vatsa
More information about the Devel
mailing list