[CRIU] [PATCH] restore: define root_as_sibling before using it
Tycho Andersen
tycho.andersen at canonical.com
Tue Sep 9 10:20:21 PDT 2014
On Tue, Sep 09, 2014 at 08:56:26PM +0400, Pavel Emelyanov wrote:
> On 09/09/2014 06:46 PM, Tycho Andersen wrote:
> > root_as_sibling is used in criu_signals_setup(), but was only defined later
> > (when forking the root task for the first time). This meant that the
> > SA_NOCLDSTOP was never masked off, which meant SIGCHLD was never delivered
> > after ptracing the root task. Thus, when a child of the root task died
> > (e.g. from cr_system), the root task sat in PTRACE_STOP, and the restore task
> > never PTRACE_CONT'd, resulting in a deadlock.
> >
> > We also drop the pdeath_sig constraint from setting root_as_sibling when in
> > --restore-detached mode; in --restore-detached we /always/ need to have
> > root_as_sibling, but we only need to clone the parent if pdeath_sig is set and
> > we want to restore the task as alive.
> >
> > Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
> > ---
> > cr-restore.c | 27 +++++++++++++++------------
> > 1 file changed, 15 insertions(+), 12 deletions(-)
> >
> > diff --git a/cr-restore.c b/cr-restore.c
> > index 2735d0d..1e1d9e4 100644
> > --- a/cr-restore.c
> > +++ b/cr-restore.c
> > @@ -956,25 +956,14 @@ struct cr_clone_arg {
> > static void maybe_clone_parent(struct pstree_item *item,
> > struct cr_clone_arg *ca)
> > {
> > - if (opts.swrk_restore ||
> > - (opts.restore_detach && ca->core->thread_core->pdeath_sig)) {
> > + if (root_as_sibling && ca->core->thread_core->pdeath_sig) {
>
> This if looks wrong. If we restore from criu_restore_child() and the
> child doesn't have pdeath_sig we will end up forking the root task
> as criu's child and, after criu exits, it will get reparented to init,
> instead of sitting as the library caller's kid.
By library here I guess you mean the service? Can we just drop the &&
altogether?
Tycho
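
To make the three conditions under discussion easier to compare, here is a
minimal sketch. It is not CRIU code: struct restore_case and the function
names are hypothetical stand-ins for opts.swrk_restore, opts.restore_detach
and ca->core->thread_core->pdeath_sig, and each predicate only models whether
maybe_clone_parent() would set CLONE_PARENT.

```c
#include <stdbool.h>

/* Hypothetical, simplified inputs; the real code reads opts and ca->core. */
struct restore_case {
	bool swrk_restore;   /* restore requested through the service/library */
	bool restore_detach; /* --restore-detached */
	bool pdeath_sig;     /* root task has a parent-death signal set */
};

/* root_as_sibling as the posted patch sets it in cr_restore_tasks(). */
static bool root_as_sibling_for(const struct restore_case *c)
{
	return c->swrk_restore || c->restore_detach;
}

/* Condition on the removed ("-") lines of the hunk. */
static bool clone_parent_before_patch(const struct restore_case *c)
{
	return c->swrk_restore || (c->restore_detach && c->pdeath_sig);
}

/* Condition on the added ("+") line of the hunk. */
static bool clone_parent_as_posted(const struct restore_case *c)
{
	return root_as_sibling_for(c) && c->pdeath_sig;
}

/* Variant asked about in the reply: drop the pdeath_sig term entirely. */
static bool clone_parent_without_pdeath_check(const struct restore_case *c)
{
	return root_as_sibling_for(c);
}
```

Under these assumptions, a service restore without pdeath_sig gets
CLONE_PARENT before the patch and in the variant without the check, but not
with the condition as posted, which is the case raised in the review. The
variant without the check also sets CLONE_PARENT for --restore-detached
without pdeath_sig, which neither of the other two conditions does.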
> > /*
> > - * This means we're called from lib's criu_restore_child().
> > - * In that case create the root task as the child one to+
> > - * the caller. This is the only way to correctly restore the
> > - * pdeath_sig of the root task. But also looks nice.
> > - *
> > - * Alternatively, if we are --restore-detached, a similar trick is
> > - * needed to correctly restore pdeath_sig and prevent processes from
> > - * dying once restored.
> > - *
> > * There were a problem in kernel 3.11 -- CLONE_PARENT can't be
> > * set together with CLONE_NEWPID, which has been solved in further
> > * versions of the kernels, but we treat 3.11 as a base, so at
> > * least warn a user about potential problems.
> > */
> > item->rst->clone_flags |= CLONE_PARENT;
> > - root_as_sibling = 1;
> > if (item->rst->clone_flags & CLONE_NEWPID)
> > pr_warn("Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,"
> > "because not all kernels support such clone flags combinations!\n");
> > @@ -1792,6 +1781,20 @@ int cr_restore_tasks(void)
> > {
> > int ret = -1;
> >
> > + if (opts.swrk_restore || opts.restore_detach) {
> > + /*
> > + * This means we're called from lib's criu_restore_child().
> > + * In that case create the root task as the child one to+
> > + * the caller. This is the only way to correctly restore the
> > + * pdeath_sig of the root task. But also looks nice.
> > + *
> > + * Alternatively, if we are --restore-detached, a similar trick is
> > + * needed to correctly restore pdeath_sig and prevent processes from
> > + * dying once restored.
> > + */
> > + root_as_sibling = 1;
> > + }
> > +
> > if (cr_plugin_init())
> > return -1;
> >
> >
>
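
The ordering problem described in the commit message can be illustrated with
a small, self-contained sketch. It is a simplified model, not the actual
criu_signals_setup(): struct restore_opts, sigchld_sa_flags() and the two
setup_*() helpers are hypothetical names, and the old path is reduced to "the
flag is set only after the handler flags were computed".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two CRIU options the patch looks at. */
struct restore_opts {
	bool swrk_restore;
	bool restore_detach;
};

static int root_as_sibling;

/* Models choosing sa_flags for the SIGCHLD handler. */
static int sigchld_sa_flags(void)
{
	int flags = SA_NOCLDSTOP | SA_SIGINFO | SA_RESTART;

	/* A ptraced sibling root task stops; we need SIGCHLD for that. */
	if (root_as_sibling)
		flags &= ~SA_NOCLDSTOP;
	return flags;
}

/* Old order: flags are computed while root_as_sibling is still 0. */
static int setup_old_order(const struct restore_opts *o)
{
	assert(o);
	root_as_sibling = 0;
	int flags = sigchld_sa_flags();
	if (o->swrk_restore || o->restore_detach)
		root_as_sibling = 1; /* too late, flags already chosen */
	return flags;
}

/* Patched order: decide root_as_sibling first, then compute flags. */
static int setup_new_order(const struct restore_opts *o)
{
	assert(o);
	root_as_sibling = (o->swrk_restore || o->restore_detach) ? 1 : 0;
	return sigchld_sa_flags();
}
```

With restore_detach set, setup_old_order() still returns flags containing
SA_NOCLDSTOP, so stop notifications for the root task would be suppressed,
while setup_new_order() returns flags with SA_NOCLDSTOP cleared.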