[CRIU] [PATCH 2/4] restore: TASK_HELPERs live until RESTORE stage
Andrew Vagin
avagin at parallels.com
Fri Sep 12 13:21:06 PDT 2014
On Fri, Sep 12, 2014 at 01:47:11PM -0500, Tycho Andersen wrote:
> On Fri, Sep 12, 2014 at 10:35:00PM +0400, Andrew Vagin wrote:
> > On Fri, Sep 12, 2014 at 01:13:00PM -0500, Tycho Andersen wrote:
> > > In order to use TASK_HELPERS to open files from dead processes, they should
> > > persist until the end of the restore stage, so that the /proc files exist when
> > > setting up the fds.
> > >
> > > This commit is in preparation for the remap_dead_pid commits.
> > >
> > > v2: wait() on helpers after restore stage is over
> >
> > [root at avagin-fc19-cr criu]# bash test/zdtm.sh ns/static/session00
> > ================================= CRIU CHECK =================================
> > Error (timerfd.c:56): timerfd: No timerfd support for c/r: Inappropriate ioctl for device
> > ============================= WARNING =============================
> > Not all features needed for CRIU are merged to upstream kernel yet,
> > so for now we maintain our own branch which can be cloned from:
> > git://git.kernel.org/pub/scm/linux/kernel/git/gorcunov/linux-cr.git
> > ===================================================================
> > Execute zdtm/live/static/session00
> > ./session00 --pidfile=session00.pid --outfile=session00.out
> > /root/git/criu/test
> > Dump 11182
> > Restore
> > test/zdtm.sh: line 564: 11220 Segmentation fault setsid $CRIU restore -D $ddump -o restore.log -v4 -d $gen_args
> > Test: zdtm/live/static/session00, Result: FAIL
> > ==================================== ERROR ====================================
> > Test: zdtm/live/static/session00, Namespace: 1
> > Dump log : /root/git/criu/test/dump/static/session00/11182/1/dump.log
> > --------------------------------- grep Error ---------------------------------
> > ------------------------------------- END -------------------------------------
> > Restore log: /root/git/criu/test/dump/static/session00/11182/1/restore.log
> > --------------------------------- grep Error ---------------------------------
> > (00.162581) Error (cr-restore.c:1738): BUG at cr-restore.c:1738
> > ------------------------------------- END -------------------------------------
> > ================================= ERROR OVER =================================
>
> :( I guess this is with all the patches applied? Must still be some
> synchronization issue, I will take a look.
Look at stage_participants(). I think you need something like this:
case CR_STATE_RESTORE:
+ return task_entries->nr_threads + task_entries->nr_helpers;
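
A rough standalone illustration of why the count matters (this is a toy model, not CRIU code): once helpers also call restore_finish_stage(CR_STATE_RESTORE), they have to be included in the number of participants the stage waits for, otherwise more processes check in than were expected. The struct task_counts type, the count_helpers flag and remaining_after_stage() below are hypothetical names made up for the sketch; only nr_threads and nr_helpers come from the snippet above.

```c
/* Hypothetical mirror of the two counters used in the snippet above. */
struct task_counts {
	int nr_threads;
	int nr_helpers;
};

/*
 * Number of processes expected to finish the RESTORE stage.
 * count_helpers == 0 models the old behaviour (helpers exit earlier),
 * count_helpers != 0 models the suggested fix.
 */
static int restore_stage_participants(const struct task_counts *tc,
				      int count_helpers)
{
	if (count_helpers)
		return tc->nr_threads + tc->nr_helpers;
	return tc->nr_threads;
}

/*
 * Toy barrier accounting: every process that finishes the stage
 * decrements the expected count once. A negative result means more
 * processes checked in than the stage was told to wait for.
 */
static int remaining_after_stage(int expected, int finishers)
{
	return expected - finishers;
}
```

With 5 threads and 2 helpers all finishing the stage, the old count leaves the toy counter at -2 and the adjusted count leaves it at 0. Whether this is exactly what trips the BUG at cr-restore.c:1738 is an assumption, not something the log above confirms.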
>
> Tycho
>
> > >
> > > Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
> > > ---
> > > cr-restore.c | 13 +++++++------
> > > 1 file changed, 7 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/cr-restore.c b/cr-restore.c
> > > index 4d5ccd5..75d3afa 100644
> > > --- a/cr-restore.c
> > > +++ b/cr-restore.c
> > > @@ -702,7 +702,7 @@ static int pstree_wait_helpers()
> > > {
> > > struct pstree_item *pi;
> > >
> > > > - list_for_each_entry(pi, &current->children, sibling) {
> > > + for_each_pstree_item(pi) {
> > > int status, ret;
> > >
> > > if (pi->state != TASK_HELPER)
> > > @@ -770,9 +770,6 @@ static int restore_one_alive_task(int pid, CoreEntry *core)
> > >
> > > rst_mem_switch_to_private();
> > >
> > > - if (pstree_wait_helpers())
> > > - return -1;
> > > -
> > > if (prepare_fds(current))
> > > return -1;
> > >
> > > @@ -931,9 +928,10 @@ static int restore_one_task(int pid, CoreEntry *core)
> > > ret = restore_one_alive_task(pid, core);
> > > else if (current->state == TASK_DEAD)
> > > ret = restore_one_zombie(pid, core);
> > > - else if (current->state == TASK_HELPER)
> > > + else if (current->state == TASK_HELPER) {
> > > + restore_finish_stage(CR_STATE_RESTORE);
> > > ret = 0;
> > > - else {
> > > + } else {
> > > pr_err("Unknown state in code %d\n", (int)core->tc->task_state);
> > > ret = -1;
> > > }
> > > @@ -1711,6 +1709,9 @@ static int restore_root_task(struct pstree_item *init)
> > > if (ret < 0)
> > > goto out_kill;
> > >
> > > + if (pstree_wait_helpers() < 0)
> > > + goto out_kill;
> > > +
> > > ret = run_scripts(ACT_POST_RESTORE);
> > > if (ret != 0) {
> > > pr_err("Aborting restore due to script ret code %d\n", ret);
> > > --
> > > 1.9.1
> > >
> > > _______________________________________________
> > > CRIU mailing list
> > > CRIU at openvz.org
> > > https://lists.openvz.org/mailman/listinfo/criu