[CRIU] [PATCH 09/10] restore: Relax the FORKING stage

Pavel Emelyanov xemul at virtuozzo.com
Fri Apr 28 05:29:26 PDT 2017


Here's why:

This stage is needed to make sure all tasks have appeared
and did some actions (that are called before restore_finish_stage()).
With this description there's no need in involving criu process
in it, this stage is purely inter-tasks sync point.

Taking into account we do already make root task wait for others
to complete forking (it calls restore_wait_ther_tasks()) we may
rework this stage not to involve criu process in it.

Here's how:

So the criu task starts the forking stage, then goes waiting for
"inprogress tasks". The latter wait is purely about nr_in_progress
counter, thus there's no strict requirement that the stage remains
the same by the time criu is woken up.

Siad that, the root task waits for other tasks to finish forking,
does fini_restore_mntns() (already in the code), then switches
the stage to the next (the RESTORE one). Other tasks do normal
staging barrier. Criu task is not woken up as nr_in_progress
always remains >= 1.

The result is -2 context switches -- from root task to criu and
back -- which gives us good boost when restoring single task app.

Signed-off-by: Pavel Emelyanov <xemul at virtuozzo.com>
---
 criu/cr-restore.c       | 35 +++++++++++++++++++----------------
 criu/include/restorer.h | 15 +++++++++++++++
 2 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/criu/cr-restore.c b/criu/cr-restore.c
index fe1b2f8..8494617 100644
--- a/criu/cr-restore.c
+++ b/criu/cr-restore.c
@@ -1576,31 +1576,38 @@ static int restore_task_with_children(void *_arg)
 		BUG();
 	}
 
+	timing_start(TIME_FORK);
+
 	if (create_children_and_session())
 		goto err;
 
+	timing_stop(TIME_FORK);
 
 	if (unmap_guard_pages(current))
 		goto err;
 
 	restore_pgid();
 
+	if (open_transport_socket())
+		return -1;
+
 	if (current->parent == NULL) {
 		/*
 		 * Wait when all tasks passed the CR_STATE_FORKING stage.
+		 * The stage was started by criu, but now it waits for
+		 * the CR_STATE_RESTORE to finish. See comment near the
+		 * CR_STATE_FORKING macro for details.
+		 *
 		 * It means that all tasks entered into their namespaces.
 		 */
 		restore_wait_other_tasks();
-
 		fini_restore_mntns();
+		__restore_switch_stage(CR_STATE_RESTORE);
+	} else {
+		if (restore_finish_stage(task_entries, CR_STATE_FORKING) < 0)
+			goto err;
 	}
 
-	if (open_transport_socket())
-		return -1;
-
-	if (restore_finish_stage(task_entries, CR_STATE_FORKING) < 0)
-		goto err;
-
 	if (restore_one_task(vpid(current), ca->core))
 		goto err;
 
@@ -1962,19 +1969,15 @@ static int restore_root_task(struct pstree_item *init)
 	if (ret)
 		goto out_kill;
 
-	timing_start(TIME_FORK);
-
 	ret = restore_switch_stage(CR_STATE_FORKING);
 	if (ret < 0)
 		goto out_kill;
 
-	timing_stop(TIME_FORK);
-
-	ret = restore_switch_stage(CR_STATE_RESTORE);
-	if (ret < 0)
-		goto out_kill;
-
-	/* Zombies die after CR_STATE_RESTORE */
+	/*
+	 * Zombies die after CR_STATE_RESTORE which is switched
+	 * by root task, not by us. See comment before CR_STATE_FORKING
+	 * in the header for details.
+	 */
 	for_each_pstree_item(item) {
 		if (item->pid->state == TASK_DEAD)
 			task_entries->nr_threads--;
diff --git a/criu/include/restorer.h b/criu/include/restorer.h
index ebfd1bd..65e0eb5 100644
--- a/criu/include/restorer.h
+++ b/criu/include/restorer.h
@@ -227,6 +227,21 @@ enum {
 	/*
 	 * All tasks fork and call open_transport_socket().
 	 * Stage is needed to make sure they all have the socket.
+	 * Also this stage is a sync point after which the
+	 * fini_restore_mntns() can be called.
+	 *
+	 * This stage is a little bit special. Normally all stages
+	 * are controlled by criu process, but when this stage
+	 * starts criu process starts waiting for the tasks to
+	 * finish it, but by the time it gets woken up the stage
+	 * finished is CR_STATE_RESTORE. The forking stage is
+	 * barrier-ed by the root task, this task is also the one
+	 * that switches the stage (into restoring).
+	 *
+	 * The above is done to lower the amount of context
+	 * switches from root task to criu and back, since the
+	 * separate forking stage is not needed by criu, it's
+	 * purely to make sure all tasks be in sync.
 	 */
 	CR_STATE_FORKING,
 	/*
-- 
2.1.4



More information about the CRIU mailing list