[CRIU] [PATCHv3 2/3] restore: Fix deadlock when helper's child dies
Andrei Vagin
avagin at virtuozzo.com
Thu Jul 20 10:39:49 MSK 2017
On Wed, Jul 19, 2017 at 05:02:53PM +0300, Dmitry Safonov wrote:
> Since commit ced9c529f687 ("restore: fix race with helpers' kids dying
> too early"), we block SIGCHLD in helper tasks before CR_STATE_RESTORE.
> This way we avoided default criu sighandler as it doesn't expect that
> childs may die.
>
> This is very racy as we wait on futex for another stage to be started,
> but the next stage may start only when all the tasks complete previous
> stage. If some children of helper dies, the helper may already have
> blocked SIGCHLD and have started sleeping on the futex. Then the next
> stage never comes and no one reads a pending SIGCHLD for helper.
>
> A customer met this situation on the node, where the following
> (non-related) problem has occured:
> Unable to send a fin packet: libnet_write_raw_ipv6(): -1 bytes written (Network is unreachable)
> Then child criu of the helper has exited with error-code and the
> lockup has happened.
>
> While we could fix it by aborting futex in the end of
> restore_task_with_children() for each (non-root also) tasks,
> that would be not completely correct:
> 1. All futex-waiting tasks will wake up after that and they
> may not expect that some tasks are on the previous stage,
> so they will spam into logs with unrelated errors and may
> also die painfully.
> 2. Child may die and miss aborting of the futex due to:
> o segfault
> o OOM killer
> o User-sended SIGKILL
> o Other error-path we forgot to cover with abort futex
>
> To fix this deadlock in TASK_HELPER, as suggested-by Kirill,
> let's check if there are children deaths expected - if there
> isn't any, don't block SIGCHLD, otherwise wait() and check if
> death was on expected stage of restore (not CR_STATE_RESTORE).
>
> Signed-off-by: Dmitry Safonov <dsafonov at virtuozzo.com>
> ---
> criu/cr-restore.c | 75 +++++++++++++++++++++++++++++++++++++++----------------
> 1 file changed, 53 insertions(+), 22 deletions(-)
>
> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
> index 9143cd310fed..21c5fbf3eacc 100644
> --- a/criu/cr-restore.c
> +++ b/criu/cr-restore.c
> @@ -1207,6 +1207,58 @@ out:
> return ret;
> }
>
> +static bool child_death_expected(void)
> +{
> + struct pstree_item *pi;
> +
> + list_for_each_entry(pi, ¤t->children, sibling) {
> + switch (pi->pid->state) {
> + case TASK_DEAD:
> + case TASK_HELPER:
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
Could you write a comment for this function?
> +static int restore_one_helper(void)
> +{
> + sigset_t oldmask;
> + siginfo_t info;
> +
> + if (prepare_fds(current))
> + return -1;
> +
> + if (!child_death_expected()) {
> + restore_finish_stage(task_entries, CR_STATE_RESTORE);
> + return 0;
> + }
> +
> + if (block_sigmask(&oldmask, SIGCHLD))
block_sigmask(NULL, SIGCHLD) has to work too, oldmask isn't used
> + return -1;
> +
> + /* Finish CR_STATE_RESTORE, but do not wait for the next stage. */
> + futex_dec_and_wake(&task_entries->nr_in_progress);
> +
> + if (waitid(P_ALL, 0, &info, WEXITED | WNOWAIT)) {
> + pr_perror("Failed to wait\n");
> + return -1;
> + }
> +
> + if (futex_get(&task_entries->start) == CR_STATE_RESTORE) {
> + pr_err("Child %d died too early\n", info.si_pid);
> + return -1;
> + }
> +
> + if (wait_on_helpers_zombies()) {
> + pr_err("Failed to wait on helpers and zombies\n");
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> static int restore_one_task(int pid, CoreEntry *core)
> {
> int i, ret;
> @@ -1218,32 +1270,11 @@ static int restore_one_task(int pid, CoreEntry *core)
> else if (current->pid->state == TASK_DEAD)
> ret = restore_one_zombie(core);
> else if (current->pid->state == TASK_HELPER) {
> - sigset_t blockmask, oldmask;
> -
> - if (prepare_fds(current))
> - return -1;
> -
> - sigemptyset(&blockmask);
> - sigaddset(&blockmask, SIGCHLD);
> -
> - if (sigprocmask(SIG_BLOCK, &blockmask, &oldmask) == -1) {
> - pr_perror("Can not set mask of blocked signals");
> - return -1;
> - }
> -
> - restore_finish_stage(task_entries, CR_STATE_RESTORE);
> -
> + ret = restore_one_helper();
> close_image_dir();
> close_proc();
> for (i = SERVICE_FD_MIN + 1; i < SERVICE_FD_MAX; i++)
> close_service_fd(i);
> -
> - if (wait_on_helpers_zombies()) {
> - pr_err("failed to wait on helpers and zombies\n");
> - ret = -1;
> - } else {
> - ret = 0;
> - }
> } else {
> pr_err("Unknown state in code %d\n", (int)core->tc->task_state);
> ret = -1;
> --
> 2.13.1
>
> _______________________________________________
> CRIU mailing list
> CRIU at openvz.org
> https://lists.openvz.org/mailman/listinfo/criu
More information about the CRIU
mailing list