[CRIU] [PATCH] restore: define root_as_sibling before using it
Tycho Andersen
tycho.andersen at canonical.com
Tue Sep 9 11:15:12 PDT 2014
root_as_sibling is used in criu_signals_setup(), but was only defined later
(when forking the root task for the first time). This meant that
SA_NOCLDSTOP was never masked off, so SIGCHLD was never delivered
after ptracing the root task. Thus, when a child of the root task died
(e.g. from cr_system), the root task sat in PTRACE_STOP, and the restore task
never PTRACE_CONT'd it, resulting in a deadlock.
We also drop the pdeath_sig constraint from setting root_as_sibling when in
--restore-detached mode; in --restore-detached we /always/ need to have
root_as_sibling, but we only need to set CLONE_PARENT if pdeath_sig is set and
we want to restore the task as alive.
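
The two conditions that this patch separates can be written out as a small
sketch. The function names below are hypothetical and the options are passed
as plain booleans rather than read from CRIU's opts structure; only the
boolean logic is taken from the diff.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: root_as_sibling is needed whenever CRIU runs as a
 * service worker or with --restore-detached, regardless of pdeath_sig.
 */
static bool root_as_sibling_needed(bool swrk_restore, bool restore_detach)
{
	return swrk_restore || restore_detach;
}

/*
 * Hypothetical helper: CLONE_PARENT is still requested only for the
 * service worker, or for --restore-detached when pdeath_sig is set.
 */
static bool clone_parent_needed(bool swrk_restore, bool restore_detach,
				int pdeath_sig)
{
	return swrk_restore || (restore_detach && pdeath_sig != 0);
}
```

The case that changes behaviour is --restore-detached with no pdeath_sig:
root_as_sibling is now set there, while CLONE_PARENT is still left alone.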
v2: re-work the condition for CLONE_PARENT
Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
---
cr-restore.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
diff --git a/cr-restore.c b/cr-restore.c
index 2735d0d..fe6c798 100644
--- a/cr-restore.c
+++ b/cr-restore.c
@@ -956,25 +956,21 @@ struct cr_clone_arg {
static void maybe_clone_parent(struct pstree_item *item,
struct cr_clone_arg *ca)
{
+ /*
+ * zdtm runs in kernel 3.11, which has the problem described below. We
+ * avoid this by including the pdeath_sig test. Once users/zdtm migrate
+ * off of 3.11, this condition can be simplified to just test
+ * root_as_sibling.
+ */
if (opts.swrk_restore ||
(opts.restore_detach && ca->core->thread_core->pdeath_sig)) {
/*
- * This means we're called from lib's criu_restore_child().
- * In that case create the root task as the child one to+
- * the caller. This is the only way to correctly restore the
- * pdeath_sig of the root task. But also looks nice.
- *
- * Alternatively, if we are --restore-detached, a similar trick is
- * needed to correctly restore pdeath_sig and prevent processes from
- * dying once restored.
- *
* There were a problem in kernel 3.11 -- CLONE_PARENT can't be
* set together with CLONE_NEWPID, which has been solved in further
* versions of the kernels, but we treat 3.11 as a base, so at
* least warn a user about potential problems.
*/
item->rst->clone_flags |= CLONE_PARENT;
- root_as_sibling = 1;
if (item->rst->clone_flags & CLONE_NEWPID)
pr_warn("Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,"
"because not all kernels support such clone flags combinations!\n");
@@ -1792,6 +1788,20 @@ int cr_restore_tasks(void)
{
int ret = -1;
+ if (opts.swrk_restore || opts.restore_detach) {
+ /*
+ * This means we're called from lib's criu_restore_child().
+ * In that case create the root task as the child one to+
+ * the caller. This is the only way to correctly restore the
+ * pdeath_sig of the root task. But also looks nice.
+ *
+ * Alternatively, if we are --restore-detached, a similar trick is
+ * needed to correctly restore pdeath_sig and prevent processes from
+ * dying once restored.
+ */
+ root_as_sibling = 1;
+ }
+
if (cr_plugin_init())
return -1;
--
1.9.1