[Devel] [RFC][PATCH][cr]: Mark ghost tasks as detached earlier
Sukadev Bhattiprolu
sukadev at linux.vnet.ibm.com
Sat Oct 30 00:01:51 PDT 2010
From ce9dd2fc7332597d46872f3f8c52ac0806f381d1 Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu <sukadev at linux.vnet.ibm.com>
Date: Fri, 29 Oct 2010 23:16:10 -0700
Subject: [PATCH 1/1] Mark ghost task as detached earlier
During restart() of an application, ghost tasks are marked as "detached"
so they don't send a SIGCHLD to their parent when they exit. But this is
currently being done a little too late in the "life" of the ghost and
ends up confusing the container-init.
Suppose a ghost child of the container-init is waiting in do_ghost_task().
It is not yet detached. If the container-init is terminated for some
reason, the container-init sends SIGKILL to its children (including this
ghost). The container-init then waits for the un-detached children to
exit, expecting to be notified via SIGCHLD.
When the ghost-child receives the SIGKILL, it wakes up, marks itself
detached and proceeds to exit. Since it is now detached, it will not
notify the parent, thus leaving the container-init blocked indefinitely.
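As an illustration only, the ordering problem described above can be replayed in a small user-space C model. This is a sketch and not kernel code: every toy_* name and TOY_* constant is hypothetical, and the model assumes that the container-init decides which children to wait for at the moment it sends SIGKILL, as the description above suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not kernel definitions. */
#define TOY_SIGCHLD   17
#define TOY_DETACHED  (-1)	/* mirrors exit_signal = -1 in the patch */

struct toy_ghost {
	int exit_signal;	/* signal the parent gets when the ghost exits */
	bool exited;
};

struct toy_init {
	int expected;		/* children init decided to wait for */
	int notified;		/* exit notifications actually received */
};

/* A detached task exits silently; any other task notifies its parent. */
static void toy_ghost_exit(struct toy_ghost *g, struct toy_init *init)
{
	if (g->exit_signal != TOY_DETACHED)
		init->notified++;
	g->exited = true;
}

/*
 * Replay the scenario once. Returns true if init is left waiting for
 * a notification that never arrives.
 */
bool toy_init_hangs(bool detach_early)
{
	struct toy_ghost ghost = { .exit_signal = TOY_SIGCHLD, .exited = false };
	struct toy_init init = { 0, 0 };

	if (detach_early)		/* ordering proposed by the patch */
		ghost.exit_signal = TOY_DETACHED;

	/*
	 * The ghost is asleep. Init is terminated, sends SIGKILL, and
	 * counts the children that are not detached at this moment.
	 */
	if (ghost.exit_signal != TOY_DETACHED)
		init.expected++;

	if (!detach_early)		/* original ordering: detach on wake-up */
		ghost.exit_signal = TOY_DETACHED;

	toy_ghost_exit(&ghost, &init);
	assert(ghost.exited);

	return init.notified < init.expected;
}
```

In this model, toy_init_hangs(false) reports a hang because the ghost was counted as a child to wait for and then exited silently, while toy_init_hangs(true) does not, because the ghost was never counted in the first place.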
Some background:
When running some tests on the C/R code we ran into the problem of the
container-init not waiting for detached processes. This problem was
extensively discussed here:
http://lkml.org/lkml/2010/6/16/295
Eric Biederman had a fix for the problem:
http://lkml.org/lkml/2010/7/12/213
When I applied this fix to the C/R tree and repeated the tests, I ran
into the above issue of the container-init hanging. Marking the ghost
as detached earlier seems to fix the confusion in the container-init.
Oren, is there a reason not to mark the ghost task detached earlier
than is currently being done?
Signed-off-by: Sukadev Bhattiprolu (sukadev at us.ibm.com)
---
kernel/checkpoint/restart.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
index 17270b8..95789c0 100644
--- a/kernel/checkpoint/restart.c
+++ b/kernel/checkpoint/restart.c
@@ -953,6 +953,7 @@ static int do_ghost_task(void)
 	struct ckpt_ctx *ctx;
 	int ret;
+	current->exit_signal = -1;
 	ctx = wait_checkpoint_ctx();
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
@@ -972,7 +973,6 @@ static int do_ghost_task(void)
 	if (ret < 0)
 		ckpt_err(ctx, ret, "ghost restart failed\n");
-	current->exit_signal = -1;
 	restore_debug_exit(ctx);
 	ckpt_ctx_put(ctx);
 	do_exit(0);
--
1.6.6.1