[CRIU] [PATCHv3 3/3] fault-inj: Silently dying helper's child
Dmitry Safonov
dsafonov at virtuozzo.com
Wed Jul 19 17:02:54 MSK 2017
The restorer blob may die silently for any number of reasons:
- Segmentation fault
- OOM killer
- SIGKILL sent by a user
- Child CRIU restorer didn't abort the futex on an error path (and exited)
We should terminate the restore process and avoid locking
ourselves up while waiting for a dead restoree.
Signed-off-by: Dmitry Safonov <dsafonov at virtuozzo.com>
---
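For readers who want to see the failure mode in isolation, here is a
minimal standalone sketch of the problem described above: a parent
waiting on a shared flag that a child is supposed to set, where the
child may exit without ever setting it. It is not CRIU code. CRIU
synchronizes through futexes in struct task_entries; the sketch swaps
in a simpler technique (a flag in shared memory polled together with
waitpid(WNOHANG)). The names wait_done_or_death() and
demo_silent_death() are hypothetical and exist only for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS on strict compilers */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of CRIU.
 * Wait until the child sets *done, but give up if the child exits first.
 * Returns 0 if the child reported completion, -1 if it died silently.
 */
static int wait_done_or_death(pid_t child, atomic_int *done)
{
	const struct timespec tick = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */
	int status;

	assert(done != NULL);

	for (;;) {
		pid_t r;

		if (atomic_load(done))
			return 0;

		r = waitpid(child, &status, WNOHANG);
		if (r == child)
			/* Child is gone: it may have set the flag just before exiting. */
			return atomic_load(done) ? 0 : -1;
		if (r < 0)
			return -1;

		nanosleep(&tick, NULL);
	}
}

/*
 * Hypothetical demo: fork a child that either reports completion
 * (child_reports != 0) or exits without reporting, like the fault-injected
 * restorer. Returns what the parent observed, or -2 on a setup error.
 */
static int demo_silent_death(int child_reports)
{
	atomic_int *done;
	pid_t child;
	int ret;

	/* Anonymous shared mapping is zero-filled, so the flag starts at 0. */
	done = mmap(NULL, sizeof(*done), PROT_READ | PROT_WRITE,
		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (done == MAP_FAILED)
		return -2;

	child = fork();
	if (child < 0) {
		munmap(done, sizeof(*done));
		return -2;
	}
	if (child == 0) {
		if (child_reports)
			atomic_store(done, 1);
		_exit(child_reports ? 0 : 1);
	}

	ret = wait_done_or_death(child, done);
	waitpid(child, NULL, 0); /* reap; fails harmlessly if already reaped */
	munmap(done, sizeof(*done));
	return ret;
}
```

With these assumptions, demo_silent_death(0) returns -1 instead of
hanging, and demo_silent_death(1) returns 0. A waiter that only checked
the flag would block forever in the first case, which is the lock-up
this patch's fault injection is meant to exercise.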
criu/cr-restore.c | 21 ++++++++++++++++++++-
criu/include/fault-injection.h | 1 +
test/jenkins/criu-fault.sh | 1 +
3 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/criu/cr-restore.c b/criu/cr-restore.c
index 21c5fbf3eacc..78dcdec7590e 100644
--- a/criu/cr-restore.c
+++ b/criu/cr-restore.c
@@ -3696,11 +3696,30 @@ static int sigreturn_restore(pid_t pid, struct task_restore_args *task_args, uns
task_args->clone_restore_fn,
task_args->thread_args);
+ if (fault_injected(FI_HELPER_CHILD_DIE)) {
+ struct task_entries *t = task_args->task_entries;
+ bool must_die = current->parent->pid->state == TASK_HELPER;
+
+ if (must_die)
+ pr_info("fault-injected: restorer %d will die\n", pid);
+
+ /*
+ * Restorer dies only after all helpers have finished the current stage:
+ * Begin: nr_in_progress = nr_tasks + nr_helpers
+ * Exit on: nr_in_progress = nr_tasks
+ */
+ futex_wait_while_gt(&t->nr_in_progress, t->nr_tasks);
+
+ if (must_die) {
+ pr_info("fault-injected: %d exiting\n", pid);
+ exit(1);
+ }
+ }
+
/*
* An indirect call to task_restore, note it never returns
* and restoring core is extremely destructive.
*/
-
JUMP_TO_RESTORER_BLOB(new_sp, restore_task_exec_start, task_args);
err:
diff --git a/criu/include/fault-injection.h b/criu/include/fault-injection.h
index 46a5f71b031c..0da6bf8731c3 100644
--- a/criu/include/fault-injection.h
+++ b/criu/include/fault-injection.h
@@ -10,6 +10,7 @@ enum faults {
FI_RESTORE_OPEN_LINK_REMAP,
FI_PARASITE_CONNECT,
FI_POST_RESTORE,
+ FI_HELPER_CHILD_DIE,
/* not fatal */
FI_VDSO_TRAMPOLINES = 127,
FI_CHECK_OPEN_HANDLE = 128,
diff --git a/test/jenkins/criu-fault.sh b/test/jenkins/criu-fault.sh
index b7879116dc29..fbdf9b34ff03 100755
--- a/test/jenkins/criu-fault.sh
+++ b/test/jenkins/criu-fault.sh
@@ -21,3 +21,4 @@ prep
./test/zdtm.py run -t zdtm/static/env00 --fault 5 --keep-going --report report || fail
./test/zdtm.py run -t zdtm/static/maps04 --fault 131 --keep-going --report report --pre 2:1 || fail
./test/zdtm.py run -t zdtm/transition/maps008 --fault 131 --keep-going --report report --pre 2:1 || fail
+./test/zdtm.py run -t zdtm/static/session01 --fault 7 -f ns || fail
--
2.13.1