[CRIU] [PATCH 6/7] restore: fix race with helpers' kids dying too early
Pavel Emelyanov
xemul at virtuozzo.com
Tue Jun 28 05:52:49 PDT 2016
On 06/23/2016 06:13 PM, Tycho Andersen wrote:
> We masked off SIGCHLD in wait_on_helpers_zombies(), but in fact this is too
> late: zombies can die any time after CR_STATE_RESTORE before this function
> is called, which lead to us getting "unexpected" deaths. Instead, we should
> mask off SIGCHLD before the helpers finish CR_STATE_RESTORE, since they're
> explicitly going to wait on all their kids to make sure they don't die
> anyway.
>
> Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
> ---
> criu/cr-restore.c | 24 ++++++++++--------------
> 1 file changed, 10 insertions(+), 14 deletions(-)
>
> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
> index 3651f54..771ea50 100644
> --- a/criu/cr-restore.c
> +++ b/criu/cr-restore.c
> @@ -624,15 +624,6 @@ static unsigned long task_entries_pos;
> static int wait_on_helpers_zombies(void)
> {
> struct pstree_item *pi;
> - sigset_t blockmask, oldmask;
> -
> - sigemptyset(&blockmask);
> - sigaddset(&blockmask, SIGCHLD);
> -
> - if (sigprocmask(SIG_BLOCK, &blockmask, &oldmask) == -1) {
> - pr_perror("Can not set mask of blocked signals");
> - return -1;
> - }
>
> list_for_each_entry(pi, ¤t->children, sibling) {
> pid_t pid = pi->pid.virt;
> @@ -654,11 +645,6 @@ static int wait_on_helpers_zombies(void)
> }
> }
>
> - if (sigprocmask(SIG_SETMASK, &oldmask, NULL) == -1) {
> - pr_perror("Can not unset mask of blocked signals");
> - BUG();
> - }
> -
> return 0;
> }
>
> @@ -746,6 +732,16 @@ static int restore_one_task(int pid, CoreEntry *core)
> else if (current->pid.state == TASK_DEAD)
> ret = restore_one_zombie(core);
> else if (current->pid.state == TASK_HELPER) {
> + sigset_t blockmask, oldmask;
> +
> + sigemptyset(&blockmask);
> + sigaddset(&blockmask, SIGCHLD);
> +
> + if (sigprocmask(SIG_BLOCK, &blockmask, &oldmask) == -1) {
> + pr_perror("Can not set mask of blocked signals");
> + return -1;
> + }
> +
Would this kill the --unshare pid functionality? The helper_cb below() is supposed
to pretend to be pidns reaper and has to wait() for all the dying stuff in ns.
> restore_finish_stage(CR_STATE_RESTORE);
> if (rsti(current)->helper_cb)
> rsti(current)->helper_cb();
>
More information about the CRIU
mailing list