[CRIU] [PATCH 6/7] restore: fix race with helpers' kids dying too early

Tycho Andersen tycho.andersen at canonical.com
Tue Jun 28 07:57:52 PDT 2016


On Tue, Jun 28, 2016 at 03:52:49PM +0300, Pavel Emelyanov wrote:
> On 06/23/2016 06:13 PM, Tycho Andersen wrote:
> > We masked off SIGCHLD in wait_on_helpers_zombies(), but in fact this is too
> > late: zombies can die any time after CR_STATE_RESTORE before this function
> > is called, which lead to us getting "unexpected" deaths. Instead, we should
> > mask off SIGCHLD before the helpers finish CR_STATE_RESTORE, since they're
> > explicitly going to wait on all their kids to make sure they don't die
> > anyway.
> > 
> > Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
> > ---
> >  criu/cr-restore.c | 24 ++++++++++--------------
> >  1 file changed, 10 insertions(+), 14 deletions(-)
> > 
> > diff --git a/criu/cr-restore.c b/criu/cr-restore.c
> > index 3651f54..771ea50 100644
> > --- a/criu/cr-restore.c
> > +++ b/criu/cr-restore.c
> > @@ -624,15 +624,6 @@ static unsigned long task_entries_pos;
> >  static int wait_on_helpers_zombies(void)
> >  {
> >  	struct pstree_item *pi;
> > -	sigset_t blockmask, oldmask;
> > -
> > -	sigemptyset(&blockmask);
> > -	sigaddset(&blockmask, SIGCHLD);
> > -
> > -	if (sigprocmask(SIG_BLOCK, &blockmask, &oldmask) == -1) {
> > -		pr_perror("Can not set mask of blocked signals");
> > -		return -1;
> > -	}
> >  
> >  	list_for_each_entry(pi, &current->children, sibling) {
> >  		pid_t pid = pi->pid.virt;
> > @@ -654,11 +645,6 @@ static int wait_on_helpers_zombies(void)
> >  		}
> >  	}
> >  
> > -	if (sigprocmask(SIG_SETMASK, &oldmask, NULL) == -1) {
> > -		pr_perror("Can not unset mask of blocked signals");
> > -		BUG();
> > -	}
> > -
> >  	return 0;
> >  }
> >  
> > @@ -746,6 +732,16 @@ static int restore_one_task(int pid, CoreEntry *core)
> >  	else if (current->pid.state == TASK_DEAD)
> >  		ret = restore_one_zombie(core);
> >  	else if (current->pid.state == TASK_HELPER) {
> > +		sigset_t blockmask, oldmask;
> > +
> > +		sigemptyset(&blockmask);
> > +		sigaddset(&blockmask, SIGCHLD);
> > +
> > +		if (sigprocmask(SIG_BLOCK, &blockmask, &oldmask) == -1) {
> > +			pr_perror("Can not set mask of blocked signals");
> > +			return -1;
> > +		}
> > +
> 
> Would this kill the --unshare pid functionality? The helper_cb below() is supposed
> to pretend to be pidns reaper and has to wait() for all the dying stuff in ns.

I don't think so, the fake init doesn't rely on SIGCHLD, it just keeps
wait()ing in a loop, so it should be ok.

> >  		restore_finish_stage(CR_STATE_RESTORE);
> >  		if (rsti(current)->helper_cb)
> >  			rsti(current)->helper_cb();
> > 
> 


More information about the CRIU mailing list