[Devel] Re: [RFC] [PATCH 1/2] cgroups: read-write lock CLONE_THREAD forking per threadgroup

Oleg Nesterov oleg at redhat.com
Mon Mar 22 03:22:47 PDT 2010


On 01/17, Ben Blum wrote:
>
> On Tue, Jan 05, 2010 at 07:53:30PM +0100, Oleg Nesterov wrote:
> >
> > I don't understand how this can close the race with de_thread().
> > ...
>
> the race with the sighand is handled in the next patch, in attach_proc,
> not in this function.

OK. I didn't verify this, the patches don't apply to 2.6.32-rc, but this
doesn't matter. Please see below.

> > > +	/* now try to find a sighand */
> > > +	if (likely(tsk->sighand)) {
> > > +		sighand = tsk->sighand;
> > > +	} else {
> > > +		sighand = ERR_PTR(-ESRCH);
> > > +		/*
> > > +		 * tsk is exiting; try to find another thread in the group
> > > +		 * whose sighand pointer is still alive.
> > > +		 */
> > > +		list_for_each_entry_rcu(p, &tsk->thread_group, thread_group) {
> > > +			if (p->sighand) {
> > > +				sighand = tsk->sighand;
> >
> > can't understand this "else {}" code... We hold tasklist, if the group
> > leader is dead (->sighand == NULL), then the whole thread group is
> > dead.
> >
> > Even if we had another thread with ->sighand != NULL, what is the point
> > of "if (unlikely(!thread_group_leader(tsk)))" check then?
>
> doesn't the group leader stay on the threadgroup list even when it dies?
> sighand can be null if the group leader has exited, but other threads
> are still running.

No, leader->sighand != NULL until all threads have exited.


Ben, I'd suggest you to redo these patches even if they are correct.
->sighand is not the right place for the mutex/locking

	- it is per CLONE_SIGHAND, not per process

	- we have to avoid the nasty and hard-to-test races with exec

	- we have to play with sighand->count and I really dislike this.
	  this ->count is not just a reference counter, look at
	  unshare_sighand(). Yes, this is fake, but still.

Please use ->signal instead. By the lucky coincidence the lifetime rules
for (greatly misnamed) signal_struct were changed recently in -mm.

With the recent changes, it is always safe to use task->signal. It can't
be changed, can't go away, no need to bump the counter, no races, etc.

What do you think?

Oleg.

_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers




More information about the Devel mailing list