[Devel] Re: [RFC] [PATCH 1/2] cgroups: read-write lock CLONE_THREAD forking per threadgroup
Oleg Nesterov
oleg at redhat.com
Mon Mar 22 03:22:47 PDT 2010
On 01/17, Ben Blum wrote:
>
> On Tue, Jan 05, 2010 at 07:53:30PM +0100, Oleg Nesterov wrote:
> >
> > I don't understand how this can close the race with de_thread().
> > ...
>
> the race with the sighand is handled in the next patch, in attach_proc,
> not in this function.

OK. I didn't verify this since the patches don't apply to 2.6.32-rc, but
it doesn't matter. Please see below.

> > > +	/* now try to find a sighand */
> > > +	if (likely(tsk->sighand)) {
> > > +		sighand = tsk->sighand;
> > > +	} else {
> > > +		sighand = ERR_PTR(-ESRCH);
> > > +		/*
> > > +		 * tsk is exiting; try to find another thread in the group
> > > +		 * whose sighand pointer is still alive.
> > > +		 */
> > > +		list_for_each_entry_rcu(p, &tsk->thread_group, thread_group) {
> > > +			if (p->sighand) {
> > > +				sighand = tsk->sighand;
> >
> > can't understand this "else {}" code... We hold tasklist, if the group
> > leader is dead (->sighand == NULL), then the whole thread group is
> > dead.
> >
> > Even if we had another thread with ->sighand != NULL, what is the point
> > of "if (unlikely(!thread_group_leader(tsk)))" check then?
>
> doesn't the group leader stay on the threadgroup list even when it dies?
> sighand can be null if the group leader has exited, but other threads
> are still running.

No, leader->sighand != NULL until all threads have exited.

Ben, I'd suggest that you redo these patches even if they are correct.
->sighand is not the right place for the mutex/locking:

    - it is per CLONE_SIGHAND, not per process

    - we have to avoid the nasty and hard-to-test races with exec

    - we have to play with sighand->count and I really dislike this.
      This ->count is not just a reference counter, look at
      unshare_sighand(). Yes, this is fake, but still.
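
To illustrate the first point, here is a minimal userspace sketch of how the two structures are shared. It is a toy model, not kernel code: the toy_* names and the flag values are invented for this example. The only kernel rule it relies on is that CLONE_THREAD requires CLONE_SIGHAND, while CLONE_SIGHAND may be used on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy flag values for this model only; they are not the kernel's constants. */
enum {
    TOY_CLONE_SIGHAND = 1 << 0,
    TOY_CLONE_THREAD  = 1 << 1,
};

struct toy_sighand { int count; };       /* shared by every CLONE_SIGHAND user */
struct toy_signal  { int nr_threads; };  /* shared by one thread group only */

struct toy_task {
    struct toy_sighand *sighand;
    struct toy_signal  *signal;
};

/* Create a fresh single-threaded process with its own sighand and signal. */
static struct toy_task *toy_new_process(void)
{
    struct toy_task *t = calloc(1, sizeof(*t));
    if (!t)
        return NULL;
    t->sighand = calloc(1, sizeof(*t->sighand));
    t->signal = calloc(1, sizeof(*t->signal));
    if (!t->sighand || !t->signal) {
        free(t->sighand);
        free(t->signal);
        free(t);
        return NULL;
    }
    t->sighand->count = 1;
    t->signal->nr_threads = 1;
    return t;
}

/* Model of clone(): decide which structures the child shares with parent. */
static struct toy_task *toy_clone(const struct toy_task *parent, int flags)
{
    assert(parent != NULL);
    /* As in Linux, CLONE_THREAD is only valid together with CLONE_SIGHAND. */
    if ((flags & TOY_CLONE_THREAD) && !(flags & TOY_CLONE_SIGHAND))
        return NULL;

    struct toy_task *child = toy_new_process();
    if (!child)
        return NULL;

    if (flags & TOY_CLONE_SIGHAND) {
        free(child->sighand);
        child->sighand = parent->sighand;
        child->sighand->count++;
    }
    if (flags & TOY_CLONE_THREAD) {
        free(child->signal);
        child->signal = parent->signal;
        child->signal->nr_threads++;
    }
    return child;
}
```

In this model, a child created with only TOY_CLONE_SIGHAND is a separate process (its own toy_signal) that still points at the parent's toy_sighand, so a lock placed in the sighand would also be taken by a thread group it was never meant to cover.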

Please use ->signal instead. By a lucky coincidence, the lifetime rules
for the (greatly misnamed) signal_struct were changed recently in -mm.
With the recent changes, it is always safe to use task->signal: it can't
be changed, can't go away, no need to bump the counter, no races, etc.

What do you think?

Oleg.