[CRIU] [PATCH 2/4] restore: TASK_HELPERs live until RESTORE stage

Tycho Andersen tycho.andersen at canonical.com
Fri Sep 12 14:09:24 PDT 2014


On Sat, Sep 13, 2014 at 12:21:06AM +0400, Andrew Vagin wrote:
> On Fri, Sep 12, 2014 at 01:47:11PM -0500, Tycho Andersen wrote:
> > On Fri, Sep 12, 2014 at 10:35:00PM +0400, Andrew Vagin wrote:
> > > On Fri, Sep 12, 2014 at 01:13:00PM -0500, Tycho Andersen wrote:
> > > > In order to use TASK_HELPERS to open files from dead processes, they should
> > > > persist until the end of the restore stage, so that the /proc files exist when
> > > > setting up the fds.
> > > > 
> > > > This commit is in preparation for the remap_dead_pid commits.
> > > > 
> > > > v2: wait() on helpers after restore stage is over
> > > 
> > > [root@avagin-fc19-cr criu]# bash test/zdtm.sh ns/static/session00
> > > ================================= CRIU CHECK =================================
> > > Error (timerfd.c:56): timerfd: No timerfd support for c/r: Inappropriate ioctl for device
> > > ============================= WARNING =============================
> > > Not all features needed for CRIU are merged to upstream kernel yet,
> > > so for now we maintain our own branch which can be cloned from:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/gorcunov/linux-cr.git
> > > ===================================================================
> > > Execute zdtm/live/static/session00
> > > ./session00 --pidfile=session00.pid --outfile=session00.out
> > > /root/git/criu/test
> > > Dump 11182
> > > Restore
> > > test/zdtm.sh: line 564: 11220 Segmentation fault      setsid $CRIU restore -D $ddump -o restore.log -v4 -d $gen_args
> > > Test: zdtm/live/static/session00, Result: FAIL
> > > ==================================== ERROR ====================================
> > > Test: zdtm/live/static/session00, Namespace: 1
> > > Dump log   : /root/git/criu/test/dump/static/session00/11182/1/dump.log
> > > --------------------------------- grep Error ---------------------------------
> > > ------------------------------------- END -------------------------------------
> > > Restore log: /root/git/criu/test/dump/static/session00/11182/1/restore.log
> > > --------------------------------- grep Error ---------------------------------
> > > (00.162581) Error (cr-restore.c:1738): BUG at cr-restore.c:1738
> > > ------------------------------------- END -------------------------------------
> > > ================================= ERROR OVER =================================
> > 
> > :( I guess this is with all the patches applied? There must still be a
> > synchronization issue; I will take a look.
> 
> Look at stage_participants(). I think you need something like this:
>         case CR_STATE_RESTORE:
> +               return task_entries->nr_threads + task_entries->nr_helpers;

Ah, it looks like that line got merged into patch 3 of the series
instead of this one when I rebased. My mistake.
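
For anyone following along, here is a minimal standalone sketch of why the
participant count matters once helpers also call
restore_finish_stage(CR_STATE_RESTORE). It is not CRIU code: every name with a
_sketch suffix is hypothetical, and the "one decrement per finishing task"
model is an assumption about how the stage counter behaves, simplified to a
plain int with no futexes or processes. If helpers check in but were never
counted, the counter ends up below zero, which is one plausible way to arrive
at a BUG like the one in the log above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for CRIU's task_entries bookkeeping. */
struct task_entries_sketch {
	int nr_threads;     /* regular tasks/threads taking part in the stage */
	int nr_helpers;     /* TASK_HELPER processes kept alive during the stage */
	int nr_in_progress; /* participants that have not finished the stage yet */
};

enum {
	COUNT_THREADS_ONLY,       /* behaviour without the missing line */
	COUNT_THREADS_AND_HELPERS /* behaviour with the suggested line */
};

/* How many tasks are expected to finish the RESTORE stage (sketch). */
int stage_participants_sketch(const struct task_entries_sketch *te, int mode)
{
	assert(te != NULL);
	if (mode == COUNT_THREADS_AND_HELPERS)
		return te->nr_threads + te->nr_helpers;
	return te->nr_threads;
}

/*
 * Arm the counter, then let every thread AND every helper finish the stage
 * (assumed: one decrement each). Returns the final counter value:
 * 0 means the count matched, a negative value means over-decrement.
 */
int run_stage_sketch(struct task_entries_sketch *te, int mode)
{
	int i;

	te->nr_in_progress = stage_participants_sketch(te, mode);
	for (i = 0; i < te->nr_threads + te->nr_helpers; i++)
		te->nr_in_progress--;
	return te->nr_in_progress;
}
```

With 3 threads and 2 helpers, run_stage_sketch() returns -2 when only threads
are counted and 0 when helpers are included in the count.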

Tycho

> > 
> > Tycho
> > 
> > > > 
> > > > Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
> > > > ---
> > > >  cr-restore.c | 13 +++++++------
> > > >  1 file changed, 7 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/cr-restore.c b/cr-restore.c
> > > > index 4d5ccd5..75d3afa 100644
> > > > --- a/cr-restore.c
> > > > +++ b/cr-restore.c
> > > > @@ -702,7 +702,7 @@ static int pstree_wait_helpers()
> > > >  {
> > > >  	struct pstree_item *pi;
> > > >  
> > > > -	list_for_each_entry(pi, &current->children, sibling) {
> > > > +	for_each_pstree_item(pi) {
> > > >  		int status, ret;
> > > >  
> > > >  		if (pi->state != TASK_HELPER)
> > > > @@ -770,9 +770,6 @@ static int restore_one_alive_task(int pid, CoreEntry *core)
> > > >  
> > > >  	rst_mem_switch_to_private();
> > > >  
> > > > -	if (pstree_wait_helpers())
> > > > -		return -1;
> > > > -
> > > >  	if (prepare_fds(current))
> > > >  		return -1;
> > > >  
> > > > @@ -931,9 +928,10 @@ static int restore_one_task(int pid, CoreEntry *core)
> > > >  		ret = restore_one_alive_task(pid, core);
> > > >  	else if (current->state == TASK_DEAD)
> > > >  		ret = restore_one_zombie(pid, core);
> > > > -	else if (current->state == TASK_HELPER)
> > > > +	else if (current->state == TASK_HELPER) {
> > > > +		restore_finish_stage(CR_STATE_RESTORE);
> > > >  		ret = 0;
> > > > -	else {
> > > > +	} else {
> > > >  		pr_err("Unknown state in code %d\n", (int)core->tc->task_state);
> > > >  		ret = -1;
> > > >  	}
> > > > @@ -1711,6 +1709,9 @@ static int restore_root_task(struct pstree_item *init)
> > > >  	if (ret < 0)
> > > >  		goto out_kill;
> > > >  
> > > > +	if (pstree_wait_helpers() < 0)
> > > > +		goto out_kill;
> > > > +
> > > >  	ret = run_scripts(ACT_POST_RESTORE);
> > > >  	if (ret != 0) {
> > > >  		pr_err("Aborting restore due to script ret code %d\n", ret);
> > > > -- 
> > > > 1.9.1
> > > > 

